Data Engineer - ETL with Spark
Merit Group Limited
2d ago
source : Shine

Job Description

We need someone with 3-5 years of extensive experience in Data Warehousing, ETL, and Big Data technologies (Hadoop, Hive, Sqoop, etc.) and 2+ years of mandatory experience in Spark with Python / Scala, including more than one end-to-end implementation.

Roles and Responsibilities

Develop Scala or Python scripts and UDFs using DataFrames / SQL / Datasets and RDDs in Spark 2.3+ for data aggregation, queries, and writing data back into the OLTP system through Sqoop.

Should have a very good understanding of partitioning and bucketing concepts, and should have designed both managed and external tables and ORC files in Hive to optimize performance.

Write and implement Spark and Scala scripts to load data from and store data into Cassandra / HBase / any NoSQL store.

Implement SCD Type 1 and Type 2 models using Spark.

Develop Oozie workflows for scheduling and orchestrating the ETL process.

Experience in performance tuning of Spark applications: setting the right batch interval, choosing the correct level of parallelism, and tuning memory.

Stream data into Elasticsearch for visualization using Kibana.

Should have implemented mapping parameters / variables at the mapping and session levels to increase code reusability and parameterize hardcoded values.

Additional Skills

Knowledge of the AWS stack: AWS Glue, S3, SQS

Exposure to Elasticsearch or Solr is a plus

Exposure to NoSQL databases: Cassandra, MongoDB

Exposure to serverless computing

Full time
