Why We Work at Dun & Bradstreet
We are at a transformational moment in our company journey - and we’re so excited about it.
Each day, we are finding new ways to strengthen our award-winning culture, and to accelerate creativity, innovation and growth.
Our purpose is to help customers improve business performance with Dun & Bradstreet’s Data Cloud and Live Business Identity, and we’re wildly passionate about and committed to this purpose.
So, if you’re looking to make an immediate impact at a company that welcomes bold and diverse thinking, come join us!

The Role
As a Senior Big Data Engineer, you will build and maintain enterprise-level data pipelines using the tools available within our Big Data ecosystem.
This will require you to work closely with data analysts and scientists, and database and systems administrators to create data solutions.
You will collaborate with business and technical teams to translate business requirements and functional specifications into innovative solutions.
This will require research, awareness, interactivity, and the ability to ask the right questions. You will also serve as a technical expert for project teams throughout the implementation and maintenance of business and enterprise software solutions. In addition, you will provide consultation to help ensure new and existing software solutions are developed with insight into industry best practices, strategies, and architectures, while pursuing your own professional growth.
Essential Key Responsibilities
- Design, build, and deploy new data pipelines within our Big Data ecosystems using StreamSets, Talend, Informatica BDM, etc.
- Document new and existing ETL/ELT data pipelines built with StreamSets, Informatica, or any other ETL processing engine.
- Familiarity with data pipelines, data lakes, and modern data warehousing practices (virtual data warehouses, push-down analytics, etc.)
- Expert-level programming skills in Python
- Expert-level programming skills in Spark
- Cloud-based infrastructure: AWS and the many services it offers, e.g. EC2, RDS, Redshift, EMR, Snowflake, Athena, PrestoDB
- Experience with one of the ETL tools (Informatica, StreamSets) in the creation of complex parallel loads, cluster batch execution, and dependency creation using jobs, topologies, workflows, etc.
- Experience in SQL and the conversion of SQL stored procedures into Informatica/StreamSets
- Strong exposure working with web service origins/targets/processors/executors, XML/JSON sources, and RESTful APIs
- Exposure working with relational databases (DB2, Oracle, and SQL Server), including complex SQL constructs and DDL
- Exposure to Apache Airflow for scheduling jobs
- Strong knowledge of Big Data architecture (HDFS): cluster installation, configuration, monitoring, security, resource management, maintenance, and performance tuning
- Create detailed designs and POCs to enable new workloads and technical capabilities on the platform
- Work with platform and infrastructure engineers to implement these capabilities in workloads and enable workload optimization, including managing resource allocation and scheduling across multiple tenants
- Participate in planning activities and Data Science initiatives, and perform activities to increase platform skills

Education / Experience and Competencies
- Minimum of 6 years of experience in ETL/ELT technologies, preferably StreamSets, Informatica, Talend, etc.
- Minimum of 6 years of hands-on experience with Big Data technologies (e.g. Hadoop, Spark), including 3+ years of experience with Spark
- Hands-on experience with Databricks is a HUGE plus
- 6 years of experience in cloud environments, preferably AWS
- Minimum of 4 years working in Big Data service delivery (or equivalent) roles focusing on the following disciplines:
- AWS S3: creating buckets, tokens, etc.
- Big Data (Hadoop ecosystems/distributions, e.g. Cloudera or Databricks)
- Any experience with NoSQL and graph databases
- Informatica or StreamSets data integration (ETL/ELT)
- Exposure to role- and attribute-based access controls
- Exposure to BI tools like Tableau, Power BI, Looker, etc.
- Hands-on experience managing solutions deployed in the cloud, preferably on AWS
- Experience working in a global company; experience working in a DevOps model is a plus