Sr. Data Engineers are the data professionals who prepare the big data infrastructure to be analysed by Data Scientists.
They are software engineers who design and build data systems, integrate data from various sources, and manage big data.
She / He is the guarantor of quality access to data sources.
She / He is responsible for managing the data and guarantees the quality of its use (referencing, standardisation and qualification) in order to make it easier for the teams (Data Analysts and Data Scientists) to work with.
She / He also contributes to drafting the data policy and structuring its life cycle within the applicable regulatory framework, in collaboration with the Chief Data Officer.
Her / His scope of work centres on application systems in the data management and processing domain, and on platforms such as Big Data, IoT, etc.
She / He is responsible for overseeing and integrating data of various types from these different sources, and for validating the quality of the data entering the Data Lake (receiving data, deleting duplicates, etc.).
Captures the structured and unstructured data produced by applications inside or outside the entity
Integrates and maps the components
Structures the data
Cleans up the data, e.g. deleting duplicates (a minimal sketch follows this list)
Validates the data
Creates the data repository where appropriate
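By way of illustration, a minimal PySpark sketch of the clean-up step above, assuming a Data Lake laid out with raw and curated Parquet zones; the paths and the columns event_id and ingested_at are hypothetical:

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.window import Window

    spark = SparkSession.builder.appName("dedupe-raw-events").getOrCreate()

    # Hypothetical raw zone of the Data Lake.
    raw = spark.read.parquet("s3://datalake/raw/events/")

    # Keep the most recent record per business key: event_id is the assumed
    # deduplication key, ingested_at the assumed arrival timestamp.
    w = Window.partitionBy("event_id").orderBy(F.col("ingested_at").desc())
    deduped = (
        raw.withColumn("rn", F.row_number().over(w))
           .filter(F.col("rn") == 1)
           .drop("rn")
    )

    # Validated, duplicate-free data lands in the curated zone.
    deduped.write.mode("overwrite").parquet("s3://datalake/curated/events/")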
Create and maintain a robust and stable data pipeline architecture
Work with large and complex data sets to meet business requirements
Identify, design, and implement internal process improvements: automating manual processes, optimising data delivery, re-designing infrastructure for better scalability, etc.
Build, test, and deploy data / AI products
Make suggestions to the architecture group on infrastructure requirements for data acquisition from disparate data sources
Partner with the Data Science group to integrate AI / ML algorithms (written by another team) into an enterprise product
Partner with Product Management and Data Architecture to ensure alignment between AI work and business objectives
Develop APIs for the consumption of these models (see the serving sketch after this list)
Apply a fair understanding of DevOps to set up CI / CD
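As a flavour of the API work above, a minimal FastAPI sketch that exposes a model over HTTP; the framework choice, the model artefact models/churn_model.joblib, and the feature names are all assumptions for illustration:

    import joblib
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()
    model = joblib.load("models/churn_model.joblib")  # hypothetical artefact

    class Features(BaseModel):
        # Hypothetical input schema; real feature names depend on the model.
        tenure_months: float
        monthly_spend: float

    @app.post("/predict")
    def predict(features: Features) -> dict:
        # Score one record and return the prediction as JSON.
        x = [[features.tenure_months, features.monthly_spend]]
        return {"prediction": float(model.predict(x)[0])}

Run with, for example, uvicorn main:app; in production the same service would sit behind the CI / CD pipeline mentioned above.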
To fit the bill, you should have:
7 to 12 years of Big Data experience
End-to-end Hadoop implementation experience
Understanding of CI / CD concepts and how to set them up
Exposure to any cloud platform
Experience working with architects, Data Scientists, or AI teams in a current or previous role
Minimum 4 years of coding experience in Java / Python / PySpark / Scala
Experience working on AI technologies