Lead the design and implementation of data architecture and pipelines for AI projects. Ensure data quality and governance while providing technical leadership and collaboration across teams.
Our Purpose
Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we're helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.
Title and Summary
Lead Data Engineer
Lead Data Engineer - Foundry R&D
We are seeking a Lead Data Engineer to join Mastercard Foundry R&D. You will help shape our innovation roadmap by exploring new technologies and building scalable, data-driven prototypes and products. The ideal candidate is hands-on, curious, adaptable, and motivated to experiment and learn.
What You'll Do
* Drive Data Architecture: Own the data architecture and modeling strategy for AI projects. Define how data is stored, organized, and accessed. Select technologies, design schemas/formats, and ensure systems support scalable AI and analytics workloads.
* Build Scalable Data Pipelines: Lead development of robust ETL/ELT workflows and data models. Build pipelines that move large datasets with high reliability and low latency to support training and inference for AI and generative AI systems.
* Ensure Data Quality & Governance: Oversee data governance and compliance with internal standards and regulations. Implement data anonymization, quality checks, lineage, and controls for handling sensitive information.
* Provide Technical Leadership: Offer hands-on leadership across data engineering projects. Conduct code reviews, enforce best practices, and promote clean, well-tested code. Introduce improvements in development processes and tooling.
* Cross-Functional Collaboration: Work closely with engineers, scientists, and product stakeholders. Scope work, manage data deliverables in agile sprints, and ensure timely delivery of data components aligned with project milestones.
What You'll Bring
* Extensive Data Engineering Experience: 8-12+ years in data engineering or backend engineering, including senior/lead roles. Experience designing end-to-end data systems, solving scale/performance challenges, integrating diverse sources, and operating pipelines in production.
* Big Data & Cloud Expertise: Strong skills in Python and/or Java/Scala. Deep experience with Spark, Hadoop, Hive/Impala, and Airflow. Hands-on work with AWS, Azure, or GCP using cloud-native processing and storage services (e.g., S3, Glue, EMR, Data Factory). Ability to design scalable, cost-efficient workloads for experimental and variable R&D environments.
* AI/ML Data Lifecycle Knowledge: Understanding of data needs for machine learning: dataset preparation, feature/label management, and supporting real-time or batch training pipelines. Experience with feature stores or streaming data is useful.
* Leadership & Mentorship: Ability to translate ambiguous goals into clear plans, guide engineers, and lead technical execution.
* Problem-Solving Mindset: Approach issues systematically, using analysis and data to select scalable, maintainable solutions.
Required Skills
* Education & Background: Bachelor's degree in Computer Science, Engineering, or related field. 8-12+ years of proven experience architecting and operating production-grade data systems, especially those supporting analytics or ML workloads.
* Pipeline Development: Expert in ETL/ELT design and implementation, working with diverse data sources, transformations, and targets. Strong experience scheduling and orchestrating pipelines using Airflow or similar tools.
* Programming & Databases: Advanced Python and/or Scala/Java skills and strong software engineering fundamentals (version control, CI, code reviews). Excellent SQL abilities, including performance tuning on large datasets.
* Big Data Technologies: Hands-on Spark experience (RDDs/DataFrames, optimization). Familiar with Hadoop components (HDFS, YARN), Hive/Impala, and streaming systems like Kafka or Kinesis.
* Cloud Infrastructure: Experience deploying data systems on AWS/Azure/GCP. Familiar with cloud data lakes, warehouses (Redshift, BigQuery, Snowflake), and cloud-based processing engines (EMR, Dataproc, Glue, Synapse). Comfortable with Linux and shell scripting.
* Data Governance & Security: Knowledge of data privacy regulations, PII handling, access controls, encryption/masking, and data quality validation. Experience with metadata management or data cataloging tools is a plus.
* Collaboration & Agile Delivery: Strong communication skills and experience working with cross-functional teams. Ability to document designs clearly and deliver iteratively using agile practices.
Preferred Skills
* Advanced Cloud & Data Platform Expertise: Experience with AWS data engineering services, Databricks, and Lakehouse/Delta Lake architectures (including bronze/silver/gold layers).
* Modern Data Stack: Familiarity with dbt, Great Expectations, containerization (Docker/Kubernetes), and monitoring tools like Grafana or cloud-native monitoring.
* DevOps & CI/CD for Data: Experience implementing CI/CD pipelines for data workflows and using IaC tools like Terraform or CloudFormation. Knowledge of data versioning (e.g., Delta Lake time-travel) and supporting continuous delivery for ML systems.
* Continuous Learning: Motivation to explore emerging technologies, especially in AI and generative AI data workflows.
Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with inherent risk to the organization. Therefore, every person working for, or on behalf of, Mastercard is responsible for information security and must:
- Abide by Mastercard's security policies and practices;
- Ensure the confidentiality and integrity of the information being accessed;
- Report any suspected information security violation or breach; and
- Complete all periodic mandatory security trainings in accordance with Mastercard's guidelines.
Top Skills
Airflow
AWS
Azure
CloudFormation
Data Factory
Databricks
Docker
EMR
GCP
Glue
Hadoop
Hive
Impala
Java
Kafka
Kinesis
Kubernetes
Python
S3
Scala
Spark
SQL
Terraform
Mastercard Mumbai, Maharashtra, IND Office
Bandra Kurla Complex Road, Mumbai, Maharashtra, India, 400051