Role Brief :
Our Big Data capability team needs hands-on developers who can produce beautiful & functional code to solve complex analytics problems.
If you are an exceptional developer with an aptitude for learning and applying new technologies, and you love pushing boundaries to solve complex business problems innovatively, we would like to talk with you.
You would be responsible for evaluating, developing, maintaining and testing big data solutions for advanced analytics projects.
The role would involve big data pre-processing & reporting workflows, including collecting, parsing, managing, analyzing and visualizing large data sets to turn information into business insights.
The role would also involve testing various machine learning models on big data, and deploying trained models for ongoing scoring and prediction.
An appreciation of the mechanics of complex machine learning algorithms would be a strong advantage.
Qualification & Experience :
4 to 9+ years of demonstrable experience designing technological solutions to complex data problems, developing & testing modular, reusable, efficient and scalable code to implement those solutions.
Ideally, this would include work on the following technologies :
Expert-level proficiency in at least one of Java or Python. Scala knowledge is a strong advantage
Strong understanding of, and experience with, distributed computing frameworks, particularly Apache Hadoop (YARN, MapReduce, HDFS) and one or more associated technologies (Hive, Sqoop, Avro, Flume, Oozie, ZooKeeper, etc.)
Hands-on experience with Apache Spark and its components (Core, RDDs, DataFrames, SQL, Streaming, MLlib) is a strong advantage
Operating knowledge of cloud computing platforms (AWS / Azure / GCP)
Experience working in a Linux environment with command-line tools, including Shell / Python scripting to automate common tasks
Ability to work in a team in an agile setting, familiarity with Jira, and a clear understanding of how Git works
In addition, the ideal candidate would have great problem-solving skills, and the ability & confidence to hack their way out of tight corners.
Must Have (hands-on) experience :
Scala or Python expertise
Distributed computing frameworks (Hadoop or Spark)
Cloud computing platforms (AWS / Azure / GCP)
Linux environment and shell scripting
Desirable (would be a plus) :
A statistical or machine learning DSL such as R
Distributed and low latency (streaming) application architecture
Distributed row-store DBMSs such as Cassandra
Familiarity with API design
Education : B.E. / B.Tech. in Computer Science or a related technical discipline