Job Title : Data Engineer / DevOps, Enterprise Big Data Platform
Job Location : Electronic City, Bangalore.
In this role, you will be part of a growing, global team of data engineers who collaborate in a DevOps mode to enable the organization with state-of-the-art technology, leveraging data as an asset to make better-informed decisions.
The Life Science Data Engineering Team is responsible for designing, developing, testing, and supporting automated end-to-end data pipelines and applications on Life Science’s data management and analytics platform (Palantir Foundry, Hadoop and other components).
The Foundry platform comprises multiple technology stacks, hosted on Amazon Web Services (AWS) infrastructure or in our own on-premises data centers.
Developing pipelines and applications on Foundry requires the skills outlined below.
This position is project-based and may work across multiple smaller projects or a single large project, using an agile project methodology.
Roles & Responsibilities :
Debug problems across the full Foundry stack and in code based on Python, PySpark, and Java
Deep knowledge of distributed file system concepts, MapReduce principles, and distributed computing; knowledge of Spark and of the differences between Spark and MapReduce.
Familiarity with encryption and security in a Hadoop cluster.
Data management / data structures
Must be proficient in technical data management tasks, i.e., writing code to read, transform, and store data
XML / JSON knowledge
Experience working with REST APIs
Experience in launching Spark jobs in client mode and cluster mode. Familiarity with Spark job property settings and their implications for performance.
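As an illustration of the client-mode vs. cluster-mode distinction mentioned above, a minimal sketch of how the same PySpark job might be submitted in each mode, with a few performance-relevant properties. The script name and property values are hypothetical, chosen only for illustration:

```python
# Sketch: spark-submit invocations for client vs. cluster deploy mode.
# In client mode the driver runs on the submitting machine (handy for
# interactive debugging); in cluster mode it runs on a cluster node
# (typical for production). Script name and values are illustrative.
client_mode_cmd = [
    "spark-submit",
    "--master", "yarn",
    "--deploy-mode", "client",     # driver runs locally
    "--conf", "spark.sql.shuffle.partitions=200",
    "pipeline.py",                 # hypothetical job script
]
cluster_mode_cmd = [
    "spark-submit",
    "--master", "yarn",
    "--deploy-mode", "cluster",    # driver runs inside the cluster
    "--executor-memory", "4g",
    "--num-executors", "10",
    "pipeline.py",
]
# Passing either list to subprocess.run() would launch the job
# (not executed here, since it requires a Spark installation).
```

Properties such as executor memory, executor count, and shuffle partitions are the kind of settings whose performance implications this role expects you to understand.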
SCC / Git
Must be experienced in the use of source code control systems such as Git
Experience in developing ELT / ETL processes, including loading data from enterprise-scale RDBMSs such as Oracle, DB2, and MySQL.
Basic understanding of user authorization (Apache Ranger preferred)
Must be able to code in Python, or be expert in at least one high-level language such as Java, C, or Scala.
Must be an expert in manipulating data using SQL. Familiarity with views, functions, stored procedures, and exception handling.
General knowledge of the AWS stack (EC2, S3, EBS)
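As a sketch of the SQL proficiency described above, a self-contained example using Python's built-in sqlite3 module (table, data, and view names are hypothetical; SQLite has no stored procedures, so only views and exception handling are shown):

```python
import sqlite3

# In-memory database with an illustrative orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("EU", 120.0), ("EU", 80.0), ("US", 200.0)],
)

# A view encapsulating an aggregation, queried like a table.
conn.execute(
    """CREATE VIEW region_totals AS
       SELECT region, SUM(amount) AS total FROM orders GROUP BY region"""
)
rows = conn.execute("SELECT region, total FROM region_totals ORDER BY region").fetchall()
print(rows)  # [('EU', 200.0), ('US', 200.0)]

# Exception handling around a deliberately failing statement.
try:
    conn.execute("SELECT missing_column FROM orders")
except sqlite3.OperationalError as exc:
    print("caught:", exc)
conn.close()
```

Against an enterprise RDBMS such as Oracle or DB2, the same pattern would use that database's driver and could additionally call stored procedures.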
IT Process Compliance
SDLC experience and formalized change controls
Working in DevOps teams, based on Agile principles (e.g. Scrum)
ITIL knowledge (especially incident, problem and change management)
Fluent English skills
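Several of the requirements above mention REST APIs and JSON. A minimal sketch of consuming a JSON response with the Python standard library; the endpoint, token, and payload are entirely hypothetical, and the response body is simulated rather than fetched so the snippet stays self-contained:

```python
import json
from urllib.request import Request

# Hypothetical authenticated request to a REST endpoint (not sent here).
req = Request(
    "https://api.example.com/v1/datasets",
    headers={"Accept": "application/json", "Authorization": "Bearer <token>"},
)

# Simulated JSON response body, standing in for urlopen(req).read().
body = '{"datasets": [{"name": "trials", "rows": 1200}, {"name": "sites", "rows": 45}]}'
payload = json.loads(body)

# Typical transformation step: extract a field from each record.
names = [d["name"] for d in payload["datasets"]]
print(names)  # ['trials', 'sites']
```

In practice the same parse-and-transform step would feed a Foundry or Spark pipeline rather than a local list.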
Specific information related to the position :