About the Company:
If you’re thinking ‘scale’, think bigger and don’t stop there. At Walmart IDC, we don’t just innovate; we enable transformations across stores and different channels for the Walmart experience.
Take a regular day at Walmart and match that with 260 million customers a week, 11,695 stores under 59 banners in 28 countries, and e-commerce websites in 11 countries.
That’s Walmart IDC for you.
With last fiscal year revenue of $550 billion, Walmart employs approximately 2.3 million associates worldwide. We innovate to deliver a simple and seamless experience for our customers.
Our tech talent solves the biggest and most complex problems. They drive digital transformation where data and analytics are enabling us to better serve our customers and create a digital relationship with them.
As our customers evolve and adapt, we are taking it a few notches further here. We’re changing what customers can expect from the experience of shopping, from physical stores to mobile, social and even online.
We’re not just ready for the future of shopping, we’re creating it.
The Enterprise Item and Inventory Organization is responsible for the architecture, design and delivery of all subsystems that make up the Walmart Product Catalog ecosystem.
This catalog drives the e-commerce and stores business across US and international markets. The ecosystem is a large distributed platform built on technologies like Cassandra, Hadoop, Spark, Storm, Kafka, and Elasticsearch.
It scales to millions of transactions per hour and handles hundreds of millions of unique SKUs. It can be broadly grouped into the following:
A master data source for product information, images, videos, offerings and supply chain data.
Machine learning and data science: leveraging machine learning and data science techniques to address use cases such as classification, product matching, attribute extraction, title optimization, image correction, detection of offensive content, and quality assessment.
A large-scale distributed data processing pipeline orchestrating transactions across microservices that handle product data.
APIs and a data integration layer powering digital experiences, search and analytics.
Responsible for the development and maintenance of large-scale, cloud-based distributed data science pipelines
Responsible for the development and maintenance of cloud-based distributed machine learning platforms
Responsible for adapting machine learning algorithms (training / inference) to work in a distributed cloud-based environment
Work in an agile environment to deliver high quality software
Work with teams that are distributed across geographies
Promote and support company policies, procedures, mission, values, and standards of ethics and integrity
Bachelor’s degree with 6+ years’ or Master’s degree with 4+ years’ experience in Electrical / Electronics Engineering, Computer Science or a STEM discipline, with strong programming experience
Must have:
Strong knowledge of Python, including building data pipelines
Data structures and algorithms
Experience with development and deployment in the cloud
Experience designing REST services
Knowledge of RDBMS, MySQL, and microservices
Knowledge of Docker, Kubernetes, distributed systems, and distributed databases
A problem-solving mindset, with eagerness to learn
Good to have:
Knowledge of standard machine learning flows (train / test split, validation, testing, inference) and deep learning concepts (epochs / GPUs / backpropagation)
Familiarity with TensorFlow / PyTorch
Knowledge of Java
Knowledge of Hadoop, Hive, CI / CD, Git, and Kafka
Our ideal candidate will have the qualifications listed above, will have worked in geographically distributed teams, and will be able to work on multiple projects and assignments.
They will have experience managing stakeholders and communicating effectively, and the attitude to thrive in a fun, fast-paced environment.