Senior Big Data DevOps Engineer
PubMatic
Bangalore, India

Responsibilities:

  • Manage large-scale Hadoop cluster environments, including capacity planning, cluster setup, performance tuning, monitoring, and alerting.
  • Perform proofs of concept on scaling, reliability, performance, and manageability.
  • Work with core production support personnel in IT and Engineering to automate deployment and operation of the infrastructure.
  • Manage, deploy, and configure infrastructure with Ansible or other automation toolsets.

  • Monitor Hadoop jobs and recommend optimizations (job monitoring, rerunning jobs, job tuning, Spark optimizations).
  • Monitor and prune data.
  • Create metrics and measures of utilization and performance.
  • Perform capacity planning and implement new and upgraded hardware and software releases, including storage infrastructure.
  • Work well with a global team of highly motivated and skilled personnel.
  • Research and recommend innovative and, where possible, automated approaches to system administration tasks.
  • Integrate ML libraries.
  • Manage hardware acceleration.
  • Monitor and maintain SQream / Kinetica / Wallaroo.
  • Develop and apply patches.
  • Debug infrastructure issues (e.g., underlying network issues or problems with cluster nodes).
  • Add or replace Kafka clusters and consumers.
  • Test and support infrastructure component changes (e.g., migrating the load balancer to F5).
  • Handle deployments during releases.
  • Help the QA team with production-parallel testing and performance testing.
  • Help the Dev team with POC and ad-hoc execution of jobs for debugging and cost analysis.

Qualifications:

  • 5 to 10 years of professional experience in Java, Scala, and Python.
  • 3+ years of experience with Spark / MapReduce in production environments.
  • A deep understanding of Hadoop design principles, cluster connectivity, security, and the factors that affect distributed system performance.
  • Experience with Kafka, HBase, and Hortonworks is mandatory.
  • Prior experience with remote monitoring and event handling using Nagios and the ELK stack.
  • Good collaboration and communication skills and the ability to participate in an interdisciplinary team.
  • Strong written communication and documentation experience.
  • Knowledge of best practices related to security, performance, and disaster recovery.
  • BE / BTech / BS / BCS / MCS / MCA in Computer Science or equivalent
  • Excellent interpersonal, written, and verbal communication skills