Location: Visakhapatnam
Experience: 5-6 years
Notice Period: 30 days

Roles and Responsibilities
Design and implement new components and emerging technologies in the Hadoop ecosystem, and ensure the successful execution of projects.
Integrate external data sources and build data lakes / data marts. Integrate machine learning models into real-time input data streams.
Collaborate with cross-functional teams (infrastructure, network, database). Work with these teams to set up new Hadoop users, security, and platform governance, which must be PCI-DSS compliant.
Create and execute a capacity planning strategy for the Hadoop platform. Monitor job performance, file system / disk-space usage, cluster and database connectivity, and log files; manage backups and security; and troubleshoot user issues.
Design, implement, test, and document a performance benchmarking strategy for the platform as well as for each use case. Drive customer communication during critical events, and participate in or lead operational improvement initiatives.
Responsible for the setup, administration, monitoring, tuning, optimization, and governance of large-scale Hadoop clusters and Hadoop components, on-premise or in the cloud, to meet high-availability / uptime requirements.
Desired Candidate Profile
Sound knowledge of Python or Scala. Sound knowledge of Spark and HDFS / Hive / HBase, with a thorough understanding of Hadoop, Spark, and their ecosystem components.
Must be proficient with data ingestion tools such as Sqoop, Flume, Talend, and Kafka. Candidates with knowledge of machine learning using Spark will be given preference.
Knowledge of Spark and Hadoop is a must. Knowledge of AWS and Google Cloud Platform and their various components is preferred.