ML Engineer - Hadoop/Python/PySpark (3-8 yrs) Mumbai (Analytics & Data Science)
Silverpeople
Mumbai
2d ago
source : hirist.com

About the Company :

A global conglomerate operating in 200+ countries, with a turnover of 50+ billion dollars. It is its industry's global leader, providing rapid, reliable, time-definite delivery to more than 220 countries and territories.

Accountabilities :

  • Work with Data Scientists and Business Analysts to frame problems in a business context. Assist with every stage of the process, from data collection, cleaning, and preprocessing to training models and deploying them to production.
  • Understand business objectives and develop models that help achieve them, along with metrics to track their progress.
  • Explore and visualize data to gain an understanding of it, then identify differences in data distribution that could affect performance when deploying the model in the real world.
  • Define validation strategies, the preprocessing or feature engineering to be done on a given dataset, and data augmentation pipelines.
  • Analyze the errors of the model and design strategies to overcome them.
  • Collaborate with data engineers to build data and model pipelines, manage the infrastructure needed to bring code to production, and demonstrate end-to-end understanding of the applications being created (including, but not limited to, the machine learning algorithms).

Qualifications & Specifications :

  • Bachelor's / Master's degree in Engineering / Computer Science / Math / Statistics or equivalent.
  • Experience with machine learning algorithms and libraries
  • Understanding of data structures, data modeling and software architecture.
  • Deep knowledge of math, probability, statistics and algorithms
  • Experience with machine learning platforms such as Microsoft Azure, Google Cloud, IBM Watson, and Amazon
  • Big data environment : Hadoop, Spark
  • Programming languages : Python, R, PySpark
  • Supervised & unsupervised machine learning : linear regression, logistic regression, k-means clustering, ensemble models, random forest, SVM, gradient boosting
  • Sampling data : bagging & boosting, bootstrapping
  • Neural networks : ANN, CNN, RNN related topics
  • Deep learning : Keras, TensorFlow
  • Experience with AWS SageMaker deployment and agile methodology