Manager, Site Reliability Engineering
for the Oracle IT Content & Experience Services Team
Oracle is seeking an experienced and driven SRE manager who wants to join a team of highly talented Technology and Engineering professionals who are revolutionizing the delivery of Cloud Services at Oracle.
Oracle IT provides modern enterprise services to Oracle’s internal businesses and is in the midst of a cloud transformation that is driving improved agility, performance, availability, and security across Oracle’s Enterprise and Development environments.
We are looking for passionate, highly motivated, uniquely skilled, engineering-first individuals to join our team and help build our next-generation business platform.
If you are looking for a team that values your contributions and is dedicated to supporting your personal development, then Oracle IT is the team for you!
As an SRE Manager, you'll lead a team and be responsible for Oracle IT global services, provide technical leadership to key projects, and empower and develop teams to do the same.
This is a hands-on leadership role that combines technical expertise with strategic vision and mentoring in the fields of continuous integration, continuous delivery, engineering efficiency, containerization, and engineering operations.
Lead a team of Site Reliability Engineers focused on improving the reliability, performance, and operability of services used by Oracle employees and Oracle customers.
Solve complex problems related to Oracle IT services and build automation to prevent problem recurrence.
Lead by example, care for the team, and establish credibility with the quality of the team's technical execution.
Manage on-call rotations across continents, using a follow-the-sun model.
Design, write, and deliver software to improve the availability, scalability, latency, and efficiency of Oracle's services.
Maintain an excellent, hands-on understanding of the tools and techniques required to plan, build, and run a development (DevOps) pipeline for container-based, cloud-based services.
Clearly articulate, define, and execute on a strategic vision for the pipeline based on business and engineering objectives.
Identify opportunities and drive the implementation of automation to improve service health, manageability, reliability, and telemetry.
Read, write, configure, design, and script end-to-end service telemetry, alerting, and self-healing capabilities for platforms.
Author functional and technical documentation.
Lead initiatives to design and deliver the mission-critical stack, with a focus on security, resiliency, scale, and performance, and with authority over end-to-end performance and operability.
Develop a deep understanding of service topologies and their dependencies, as required to troubleshoot issues and define mitigations. Professional curiosity and a desire to develop a deep understanding of services and technologies are required.
Manage and support corporate services such as Oracle.com, Java.com, Blogs.oracle.com, and Community.oracle.com.
Bachelor’s degree or master’s degree in Computer Science or equivalent
Excellence in verbal and written communication. Ability to communicate with all levels of the organization during critical events and to act as a bridge between technical discussions and non-technical people
5+ years of experience as an SRE manager or equivalent relevant experience
A clear and concrete understanding of SRE principles
Prior experience managing or leading an infrastructure, software development, or SRE team at a large corporation
Prior experience in defining and setting measurable success criteria for engineering roles
Prior hands-on experience working as an SRE, sufficient to contribute to technical discussions
Prior experience in hiring, training, and leading a team of engineers
Prior experience working with other internal teams (Engineering, Development, and Product) to successfully resolve and close support cases
Experience with cloud providers such as Oracle Cloud, AWS, or Azure
At least intermediate-level Linux and networking knowledge and experience
Experience building observability solutions and exposing metrics that feed SLOs and KPIs
Experience with Kubernetes and container technology is preferred
Working knowledge of Jenkins, Ansible, Packer, etc. is required.
Strong technical background with an ability to troubleshoot issues impacting large-scale service architectures and application stacks.
Experience leveraging cloud architecture, applying site reliability principles, and/or demonstrating sensitivity to operational concerns
Flexible work times to enable working with team members in other time zones
Experience in 24x7 operational support
Bonus points: SRE certification
AWS or OCI Certified Architect certification, or working towards one
Detailed Description and Job Requirements
Work with a world-class team to develop, implement, and support cutting-edge Oracle technology.
Manages a team maintaining and/or implementing software project(s) and/or internal systems. Defines, documents, and manages scope, expectations, implementation approach, deliverables, and acceptance testing criteria.
Leads a specialized area which may have diverse functional elements. Frequently interacts with supervisors and/or functional peer group managers.
May interact with senior management. Demonstrated leadership skills. Detailed knowledge of several applications within a business area needed.
BA / BS degree and relevant experience.
Regular Employee Hire