At Databricks, we work on some of the most complex distributed processing and machine learning problems in the world, and our customers challenge us with interesting new big data and AI use cases.
As a Senior Data Engineer at Databricks, you will shape the future of big data and the machine learning landscape for leading Fortune 500 organisations.
You will be in a customer-facing role that requires deep hands-on production expertise in Apache Spark™ and data engineering, along with broad knowledge of the big data ecosystem.
Each week, you will guide our largest customers, for example by implementing pipelines from data engineering through model building and deployment.
A successful Senior Data Engineer is curious and excels at collaboration. On joining Databricks, you will have a direct channel to the developers of Apache Spark, Delta Lake, and MLflow, and the opportunity to present at top big data conferences.
The impact you will have:
Guide strategic customers as they implement transformational big data projects, including end-to-end development and deployment of industry-leading big data and AI applications
Apply your expertise in data engineering best practices and guide customers to do the same by building proofs of concept and prototypes, architecting solutions, and even pair-programming with customer teams
Build and validate migrations of workloads from third-party databases and data platforms to Apache Spark™
Promote Apache Spark™, Databricks, Delta Lake, and MLflow across the developer community through meetups and conferences
Coordinate weekly with Account Executives, Customer Success Engineers, and Solution Architects to expand the use of the Databricks platform within strategic enterprise customers
Liaise closely with the AMER Professional Services and Architects team during PST / CDT office hours (i.e., 7 pm IST onwards)
What we look for:
Deep hands-on expertise in Apache Spark™ (Scala or Python)
5+ years of experience in the design and implementation of big data technologies (Apache Spark™, Hadoop ecosystem, Apache Kafka, NoSQL databases) and familiarity with data architecture patterns (data warehouse, data lake, streaming, Lambda / Kappa architecture)
5+ years of experience working as either:
Software Engineer / Data Engineer / Big Data Engineer: query tuning, performance tuning, troubleshooting, and debugging Spark and/or other big data solutions.
Data Scientist / ML Engineer: model selection, model lifecycle, hyperparameter tuning, model serving, deep learning, etc.
Familiarity with a full range of data engineering and data science approaches, covering theoretical best practices and the technical applications of these methods
Experience building and deploying a range of data engineering pipelines into production, including using automation best practices for CI/CD
Familiarity with databases and analytics technologies in the industry including Data Warehousing / ETL, Relational Databases, or MPP
Experience with performance tuning, troubleshooting, and debugging Spark™ and other big data solutions
Comfortable communicating up and down the IT chain of command, including directors, managers, architects, and developers
Experience with cloud providers such as AWS, Azure or GCP
Familiarity with AWS / EC2 cloud deployment models (Public vs. VPC)
Regional travel of 30-40% is expected
Employee's Provident Fund
Annual personal development fund
Work headphones reimbursement
Business travel insurance