Senior Site Reliability Engineer - 20000O01
Applicants are required to read, write, and speak the following languages : English
The Oracle IT (OIT) group in the Oracle Cloud Infrastructure (OCI) organization is seeking a motivated Senior Site Reliability Engineer, with an emphasis on storage technology, who thrives in a fast-paced, rapidly evolving technology environment.
This individual will be a member of the SRE Storage Infrastructure services team and will focus on driving quality standards across all of OIT.
As part of the Operational Engagement programs you will be instrumental in fostering a culture of SRE for horizontal activities and DevOps for products and tools across our global operations teams.
The team you work in will have diverse expertise in systems, networking, and software development to provide the stability, performance and reliability our customers need.
We work with multiple service development teams, identifying cross-team issues which create risk for operations across the organization and resolving those issues with a mixture of engineering, troubleshooting expertise, and general operational guidance.
Your role also requires communication and organizational skills : you are an interface between DevOps Tools and the application teams that implement OCI services.
You will deliver the solutions that directly contribute to our internal customers' success.
Alongside software and tools development, you will be required to automate systems and networking running on virtualized and non-virtualized platforms in the cloud.
Other duties include researching and proofing OCI cloud services and their features for improving operations, and authoring technical documentation that is beneficial to the company and the team.
What will you do
8+ years of experience in four or more of the following
Certifications Preferred if any
Education (Preferred Degree)
1 : Bias for Action
Evaluates, acts, and communicates within SLA time. Is decisive. Makes timely, practical, effective decisions. Takes initiative without being asked.
Plans efficiently while avoiding analysis paralysis. Knows how to take smart risks. Demonstrates strong follow-through and consistently keeps commitments to customers and employees.
Takes ownership and responsibility for priority customer issues and, where and when required, reviews urgent and critical incidents for quality.
2 : Prioritization
Able to prioritize the assignments at hand even in loosely structured situations. Effectively handles multiple projects or tasks at the same time and completes them within a set time frame.
3 : Self development and teaching
Understands personal strengths and development needs. Initiates self-development actions. Seeks and shares job-relevant learning, developmental experiences, and feedback to enhance performance.
Encourages others to take personal responsibility for continual learning and skill growth. Shares knowledge with others.
4 : Dealing with ambiguity
Able to function well in loosely structured situations. Works effectively in situations involving uncertainty or lack of information.
Effectively handles multiple projects or tasks at the same time. Is open to and responds flexibly to change.
5 : Teamwork and willingness to roll up sleeves
Fosters cross-functional and cross-business teamwork. Builds and promotes team morale. Works efficiently and effectively on teams to meet customers' needs.
Contributes outside the scope of the job. Meets all team commitments. Shows consistent effort, intense commitment, and willingness to go above and beyond when needed.
Willing to do low-profile, non-challenging work to get the project done.
Special Requirements : Successful candidates might be required to perform on-call duty on a rotational basis.
Detailed Description and Job Requirements
Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services.
Design and develop architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.
Work with Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and / or technology areas.
Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services.
Responsible for the design and delivery of the mission critical stack, with focus on security, resiliency, scale, and performance.
Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture.
Articulate technical characteristics of services and technology areas and guide Development Teams to engineer and add premier capabilities to the Oracle Cloud service portfolio.
Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack.
Demonstrate clear understanding of automation and orchestration principles. Act as ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).
Utilize a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations.
Understand and explain the effect of product architecture decisions on distributed systems. Professional curiosity and a desire to develop a deep understanding of services and technologies.
A BS or MS in Computer Science, or equivalent. Identifies solutions using knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, technology security and compliance.
Experience running large-scale, customer-facing web services. Identifies solutions using an understanding of load-balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies.
Work involves defining and documenting the technical architecture of complex and highly scalable products. A minimum of 5 years of experience running large-scale, customer-facing web services.