1. Building highly scalable distributed systems
2. Building ML-based product pipelines and product platforms
3. Data mining using state-of-the-art methods
4. Building benchmark infrastructure and running A/B tests
5. Handling the deployment of ML models, with awareness of optimised server configurations
6. Enhancing data collection procedures to include information that is relevant for building analytic systems
7. Processing, cleansing, and verifying the integrity of data used for analysis
8. Performing ad-hoc analysis and presenting results in a clear manner
9. Creating automated anomaly detection systems and constantly tracking their performance
Candidate Profile :
Experience : 2 years of experience using statistical programming languages (R, Python, etc.) to manipulate data and draw insights from large data sets.
Required Skills :
1. Experience with building product platforms and designing their architecture
2. Experience in Elasticsearch, SQL, Amazon Web Services, and REST APIs
3. Proficiency in using query languages such as SQL, Hive, Pig
4. Experience with distributed data / computing tools : MapReduce, Hadoop, Hive, Spark, etc.
5. Experience using web services : Redshift, S3, etc.
6. Experience creating and using machine learning algorithms and statistics : regression, simulation, modelling, clustering, decision trees, neural networks, etc.
7. Experience working with and creating data architectures.
8. Knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages / drawbacks.