You will design, build, and execute tools for online experiments (A/B tests) and offline experiments (human relevance judgments) that help us improve and fine-tune our data-focused features (Search, Recommendations, etc.).
Your primary focus will be automating the delivery of various datasets, working with Data Scientists to understand key metrics and how they are derived.
You will write and maintain the code that ingests, computes, and organizes these datasets.
1-3 years of practical experience with Big Data systems, ELT, data processing, and analytics tools.
1. Distributed computing experience using tools such as Hadoop and Spark.
2. Proficiency in query languages such as SQL, HiveQL, and Spark SQL.
3. Experience with ELT platforms such as Informatica or Talend.
4. Experience with entity-relationship modeling and understanding of normalization.
5. Familiarity with the concepts of dimensional modeling.
6. Experience maintaining a Data Lake.
7. Experience writing a test suite.
8. Experience with Scala and Python.
9. Experience with data visualization tools.
10. Ability to understand various data structures and common methods of data transformation.
11. Keeps up to date with the newest technology trends.
12. Experience developing Big Data / Hadoop applications using Java, Spark, Hive, Oozie, and Kafka.
13. Experience with Continuous Integration.
14. Experience with version control systems such as Git.