Role Title : Sr. Reliability Engineer / Unix - Linux
About Team :
The PSG SAP technical delivery team is tasked with providing best-in-class IT support by applying standard SAP basis methodologies.
The SAP Basis Analyst will be responsible for ensuring our SAP environments operate within norms, monitoring system performance and batch jobs, executing system copies, and transporting developments in an ECC environment.
Role Purpose : A Sr. Reliability Engineer is responsible for optimizing the Enterprise Global Technical environment at Thermo Fisher Scientific.
Responsibilities include root cause analysis and identifying and resolving gaps within the environment while implementing sound, reliable solutions, including automation.
Reliability Engineering will build toward AI / ML capabilities within the cloud environment, promoting self-healing among other features.
This role encourages thinking outside the box and empowers you to make decisions and implement them with guardrails, not gates.
Key Responsibilities :
Play a key role in Thermo Fisher's digital transformation. Drive the company's Reliability Engineering initiative and be part of a DevOps culture, working with global platform and development teams.
Reliability Engineering, Platform Engineering, Tools and automation, DevOps mindset, Cloud computing, Networking
Ensure the stability and integrity of high-performance, high-availability, cloud-based systems for the organization.
Design, develop and test automation workflows.
Experience building and managing in a cloud environment, preferably AWS or Azure
Act as a key driver of continuous improvement in systems operations through tools and automation
Report on overall health and optimization of cloud services to management
Participate in the definition of the roadmap for cloud services in collaboration with the Platform Engineering and architecture teams.
Drive root cause analyses, and coach others on doing them, in collaboration with software development teams
Analyze and adjust designs to assist in predicting and improving system stability.
Evaluate on-premises and off-premises environments for environmental factors, such as the number and causes of unit failures
Monitor failure data generated by customers using the product to ascertain potential requirements for product improvement
Own or assist with incident and problem management of cloud platform services
Work closely with our external partners
Previous experience with ServiceNow and proficiency in creating workflows
Work closely with IT Security to ensure the solutions we're designing and delivering meet data security and compliance requirements
Provide regular communication to peers on areas for improvement, progress, milestones, and areas of success
Be available for scheduling in a 24 / 7 on-call rotation to respond to and resolve issues
Ensure documentation and processes are well defined
Experience with Enterprise Systems management tools
Required Skill, Knowledge & Experience :
In-depth understanding of and experience with AWS and a cloud-first / cloud-only initiative
Must have a minimum of 3-5+ years of Linux experience
Must have a minimum of 1-3 years of experience in Cloud Solutions Delivery and Cloud architecture, especially public cloud platforms such as AWS, Azure, GCP.
Strong preference for AWS experience
AWS proficiency (EC2, S3, RDS, Route 53, Lambda, IAM, VPC, Security Groups) and other services
Scripting languages : Python and overall Linux shell scripting skills
Experience in Unix administration / engineering
Working knowledge of Docker
Bachelor's degree in Information Systems, Computer Science or other technical field is desired. 3 years of additional direct and applicable professional experience in the IT field may be substituted.
Strong interpersonal and excellent documentation skills are a must
Effective problem-solving techniques, such as root cause analysis, to resolve issues
Takes ownership of work assignments and manages them to completion
Ability to explain and champion technical concepts to a broad technical audience
Excellent customer service and communication skills required
Ability to solve problems and work well in complex, ambiguous situations
Ability to present technical problems and solutions to management in a manner that can be consumed and translated into appropriate program improvement requests
Preferred Skills :
Highly technical and analytical with background in systems implementations and migrations.
Hands-on experience in installing, configuring and troubleshooting UNIX / Linux based environments.
Demonstrated proficiency in physical and virtual server hardware maintenance
Experience in AWS Artificial Intelligence / Machine Learning (AI / ML) implementation / configuration / analysis is a bonus.