The Opportunity:
Recently combined with Anthology, Blackboard offers the largest EdTech ecosystem on a global scale, supporting over 150 million users in 80 countries.
The company's mission is to provide dynamic, data-informed experiences to the global education community so that learners and educators can achieve their goals.
We believe in the power of a truly diverse and inclusive workforce. As we expand globally, we are committed to making diversity, inclusion, and belonging a foundational part of not only our hiring practices but who we are as a company.
For more information about our company and career opportunities, please visit www.blackboard.com.
In this role, you will be a part of the Blackboard Site Reliability team, with a focus on Blackboard Data and other product lines that are transitioning to cloud-native microservice architecture.
This is a driven, creative, and energetic team that works in a flexible and agile fashion to deliver world-class products to the education market.
By joining this team, you will become a core contributor to Blackboard's EdTech Platform initiative. This future-state platform will seamlessly and uniquely deliver a revolutionized learning experience through innovation, continuous delivery, and architectural integration.
Specific responsibilities will include:
Engaging with development teams on the design, deployment, capacity needs, and operations of microservices and supporting them as they transition to production
Monitoring the availability, performance, and health of production systems in support of meeting service level objectives
Using automation and tooling to continuously improve the reliability, scalability, and velocity of services deployed on AWS
Participating in emergency incident response on-call rosters
Practicing blameless postmortems that lead to improvements in resiliency and reductions in pager fatigue
The Candidate:
Required skills / qualifications:
Experience in Computer Science, Software Engineering, or a related field
Expertise in analyzing and troubleshooting large-scale, multi-region deployments in a public cloud (e.g. AWS)
Experience with cloud deployment and management tools (e.g. Terraform, GitOps)
Experience with monitoring and alerting tools (e.g. CloudWatch, New Relic, PagerDuty)
Ability to solve complex problems, optimize code, and automate routine tasks
Self-driven, with the ability to lead objectives to completion
Ability to coach junior team members
Preferred skills / qualifications:
A bachelor's degree in Computer Science or a related field
Experience with container technology (e.g. Kubernetes, Docker)
Demonstrable scripting experience, preferably in Python or Java
Experience with cost reporting tools (e.g. AWS Cost Explorer, Kubecost)
Experience with network and/or application security
Prior experience within the education industry and/or with e-learning technologies