About the company
The company is a fast-growing B2B SaaS startup with teams in California and Bangalore. It raised a pre-seed round last year, is now closing its seed round, and expects to hit $1 million ARR within a few months.
They are building a sales development automation platform that ties sales discovery and sales outreach together and closes the feedback loop with ML-generated recommendations.
They are going after a billion-dollar opportunity. Sales development is ripe for disruption, and customers are looking for solutions that make their lives easier.
This is a good opportunity to get into the company early and receive substantial equity in a fast-growing, VC-funded B2B startup. They are also very competitive with respect to salaries, and you will get to build and lead teams from scratch.
Responsibilities :
Work in collaboration with the application team and integration team to design, create, and maintain optimal data pipeline architecture and data structures for the Data Lake / Data Warehouse
Work with stakeholders including the Sales, Product, and Customer Support teams to assist with data-related technical issues and support their data analytics needs
Assemble large, complex data sets from third-party vendors to meet business requirements
Identify, design, and implement internal process improvements : automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Elasticsearch, MongoDB, and AWS technologies
Streamline existing reporting and introduce enhanced reporting and analysis solutions that leverage complex data derived from multiple internal systems
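To give candidates a flavor of the day-to-day work, here is a minimal extract-transform-load sketch in Python. All names, the vendor payload, and the in-memory SQLite store are illustrative stand-ins for the production stack described above:

```python
import sqlite3

def extract():
    # Hypothetical payload from a third-party vendor (stand-in for an API or S3 pull)
    return [
        {"account": "Acme", "stage": "discovery", "touches": 3},
        {"account": "Globex", "stage": "outreach", "touches": 7},
    ]

def transform(rows):
    # Keep only accounts in active outreach and normalize to (account, touches) tuples
    return [(r["account"], r["touches"]) for r in rows if r["stage"] == "outreach"]

def load(rows, conn):
    # Persist the transformed rows (SQLite stands in for the warehouse)
    conn.execute("CREATE TABLE IF NOT EXISTS outreach (account TEXT, touches INTEGER)")
    conn.executemany("INSERT INTO outreach VALUES (?, ?)", rows)
    conn.commit()

def run_pipeline(conn):
    load(transform(extract()), conn)

conn = sqlite3.connect(":memory:")
run_pipeline(conn)
print(conn.execute("SELECT account, touches FROM outreach").fetchall())
# → [('Globex', 7)]
```

In production this kind of step would be scheduled and monitored with Apache Airflow and run against the real sources and warehouse rather than in-memory stand-ins.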
Desired Experience :
5+ years of experience in a Data Engineer role
Proficiency in Linux
Must have strong SQL knowledge and experience working with relational databases and query authoring, as well as familiarity with MySQL, MongoDB, Cassandra, and Athena
Must have experience with Python / Scala
Must have experience with Big Data technologies like Apache Spark
Must have experience with Apache Airflow
Experience with data pipeline and ETL tools like AWS Glue
Experience working with AWS cloud services : EC2, S3, RDS, Redshift