Developer 2 (Bigdata)
Floors 5 8, HUB 2, Building of SEZ Towers, Karle Town Centre, Nagavara, Bangalore, Karnataka, India
Full-time

Company Description

Epsilon is the leader in outcome-based marketing.
We enable marketing that's built on proof, not promises. Through Epsilon PeopleCloud, the marketing platform for personalizing consumer journeys with performance transparency, Epsilon helps marketers anticipate, activate and prove measurable business outcomes.
Powered by CORE ID, the most accurate and stable identity management platform representing 200 million people, Epsilon's award-winning data and technology is rooted in privacy by design and underpinned by powerful AI.
With more than 50 years of experience in personalization and performance working with the world's top brands, agencies and publishers, Epsilon is a trusted partner leading CRM, digital media, loyalty and email programs.
Positioned at the core of Publicis Groupe, Epsilon is a global company with over 8,000 employees in over 40 offices around the world.
For more information, visit epsilon.com. Follow us on Twitter at @EpsilonMktg.

Job Description

This position requires hands-on design and implementation expertise in Spark and Python (PySpark), along with other Hadoop ecosystem components such as HDFS, Hive, Hue, Impala and Zeppelin.
The purpose of the position includes:
- Analysis, design and implementation of business requirements using Spark and Python
- Cloudera Hadoop development around Big Data
- Solid SQL experience
- Development experience with PySpark and Spark SQL, with good analytical and debugging skills
- Development work for building new solutions around Hadoop and automation of operational tasks
- Assisting the team and troubleshooting issues

Responsibilities
- Design and development around Apache Spark, Python and the Hadoop framework
- Extensive usage of and experience with RDDs and DataFrames within Spark
- Extensive experience with data analytics, and working knowledge of big data infrastructure such as the Hadoop ecosystem components HDFS, Hive and Spark
- Work with gigabytes / terabytes of data, and understand the challenges of transforming and enriching such large datasets
- Provide effective solutions, both strategic and tactical, to address business problems
- Collaborate with team members, project managers, business analysts and business users in conceptualizing, estimating and developing new solutions and enhancements
- Work closely with stakeholders to define and refine the big data platform to achieve company product and business objectives
- Collaborate with other technology teams and architects to define and develop cross-functional technology stack interactions
- Read, extract, transform, stage and load data to multiple targets, including Hadoop and Oracle
- Develop automation scripts around the Hadoop framework to automate processes and existing flows
- Modify existing programs / code for new requirements
- Unit testing and debugging
- Perform root cause analysis (RCA) for any failed processes
- Document existing processes and analyze them for potential automation and performance improvements
- Convert business requirements into technical design specifications and execute on them
- Execute new development as per design specifications and business rules / requirements
- Participate in code reviews and keep applications / code base in sync with version control
- Effective communicator, self-motivated and able to work independently but fully aligned within a team environment

Qualifications
Bachelor's in Computer Science (or equivalent) or Master's, with 3 - 6 years of experience in big data ingestion, transformation and staging using the following technologies / principles / methodologies:
- Design and solution capabilities
- Rich experience with Hadoop distributed frameworks, handling large amounts of data using Apache Spark and the Hadoop ecosystem
- Python & Spark (Spark SQL, PySpark), HDFS, Hive, Impala, Hue, Cloudera Hadoop, Zeppelin
- Proficient knowledge of SQL with any RDBMS
- Knowledge of Oracle databases and PL/SQL
- Working knowledge of and good experience in a Unix environment, and able to write Unix shell scripts (ksh, bash)
- Basic Hadoop administration knowledge
- DevOps knowledge is an added advantage
- Ability to work within deadlines and effectively prioritize and execute on tasks
- Strong communication skills (verbal and written), with the ability to communicate across teams, internal and external, at all levels

Certifications
Any one of these:
- CCA Spark and Hadoop Developer
- MapR Certified Spark Developer (MCSD)
- MapR Certified Hadoop Developer (MCHD)
- HDP Certified Apache Spark Developer
- HDP Certified Developer

Preferred Skills

Technical
- Working knowledge of Oracle databases and PL/SQL
- Hadoop Admin & DevOps

Non-Technical
- Good analytical thinking and problem-solving skills
- Ability to diagnose and troubleshoot problems quickly
- Motivated to learn new technologies, applications and domains
- Appetite for learning through exploration and reverse engineering
- Strong time management skills
- Ability to take full ownership of tasks and projects

Behavioral Attributes
- Team player with excellent interpersonal skills
- Good verbal and written communication
- Can-do attitude to overcome any kind of challenge