Job Description:
Title: Lead Data Engineer
DCF Level: L40
About the Role
We are seeking a highly skilled and delivery-focused Lead GCP Data Engineer to support the design, development, and implementation of next-generation enterprise data and AI platforms on Google Cloud Platform (GCP).
This role will work closely with Enterprise Architects, platform leaders, and cross-functional engineering teams to build scalable, reusable, and AI-ready data foundations that enable advanced analytics, intelligent automation, and enterprise AI adoption.
The ideal candidate combines strong hands-on expertise in cloud-native data engineering, modern data platform development, semantic data enablement, and scalable pipeline engineering with the ability to lead engineering teams and drive high-quality delivery across multiple initiatives.
This role holds a critical leadership position within the engineering organization, driving implementation excellence, mentoring teams, and operationalizing modern data architecture patterns.
Key Responsibilities
1. Enterprise Data Platform Engineering
- Design, develop, and optimize scalable cloud-native data platforms and pipelines on GCP.
- Implement robust batch, streaming, and event-driven data processing solutions supporting enterprise analytics and AI use cases.
- Collaborate with Enterprise Architects to translate target-state architecture into scalable engineering implementations.
- Contribute to modernization of legacy data ecosystems into reusable, governed, and AI-ready cloud platforms.
- Support implementation of scalable ingestion, transformation, serving, and orchestration frameworks.
2. Data Product Engineering
- Develop reusable and domain-oriented data products aligned with data mesh and data-as-a-product principles.
- Implement scalable and modular data pipelines supporting multiple downstream consumers including analytics, AI/ML, and operational applications.
- Contribute to implementation of:
  - Data contracts
  - Schema management
  - Metadata enrichment
  - Data quality frameworks
  - Reusable transformation patterns
- Enable discoverability, trust, and operational reliability of enterprise data assets.
3. Semantic Layer & Consumption Enablement
- Support implementation of semantic and business-consumption layers that simplify enterprise data access.
- Collaborate with analytics and BI teams to enable standardized business metrics, reusable dimensions, and governed KPI definitions.
- Contribute to semantic modeling and metadata integration initiatives supporting self-service analytics and AI consumption.
- Assist in improving enterprise data usability, consistency, and discoverability across platforms.
4. GCP-Native Engineering & Development
- Develop and optimize solutions leveraging GCP-native services and related tooling, including:
  - BigQuery
  - Dataflow
  - Dataproc
  - dbt
  - Pub/Sub
  - Cloud Storage
  - Cloud Composer (Airflow)
  - Cloud SQL
- Build scalable ETL/ELT frameworks and real-time streaming pipelines.
- Optimize data processing performance, reliability, scalability, and cost efficiency.
- Implement CI/CD pipelines and engineering automation for data platform delivery.
5. AI/ML & GenAI Data Enablement
- Build AI-ready data pipelines and scalable feature engineering workflows supporting enterprise AI initiatives.
- Support integration with:
  - Vertex AI
  - BigQuery ML
  - Vector databases
  - LangChain
  - Generative AI Studio
- Contribute to implementation of RAG architectures, semantic search, and AI-assisted data interaction patterns.
- Partner with AI/ML teams to operationalize scalable ML and GenAI workflows.
6. Engineering Leadership & Delivery Excellence
- Lead day-to-day engineering activities across multiple data engineering workstreams.
- Guide and mentor junior and mid-level data engineers on modern engineering best practices.
- Ensure adherence to coding standards, architecture guidelines, and operational best practices.
- Drive engineering quality through automated testing, observability, monitoring, and performance optimization.
- Collaborate with architects, product owners, analysts, and client stakeholders to ensure successful delivery outcomes.
7. Governance, Reliability & Observability
- Implement data governance, lineage, monitoring, and observability frameworks.
- Support enforcement of enterprise standards around security, reliability, scalability, and operational readiness.
- Contribute to platform monitoring, incident management, and continuous improvement initiatives.
- Ensure production readiness of pipelines and data services through robust testing and validation processes.
Technical Expertise Required
Area: Skills / Technologies
- Cloud Data Engineering: GCP, BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage, Cloud SQL
- Data Transformation: dbt, PySpark, SQL, ETL/ELT frameworks
- Streaming & Pipelines: Apache Beam, real-time processing, event-driven architectures
- Semantic Layer & Modeling: Semantic modeling concepts, Looker modeling, business metrics standardization
- AI/ML Enablement: Vertex AI, BigQuery ML, LangChain, vector databases, GenAI integration
- Orchestration & Automation: Cloud Composer (Airflow), CI/CD, Workflows
- Metadata & Governance: Data Catalog, lineage, metadata management, observability frameworks
- Programming: Python, SQL, PySpark
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, Information Systems, or a related field.
- 7+ years of experience in data engineering and cloud-native data platform development.
- 4+ years of hands-on experience delivering enterprise-scale solutions on GCP.
- Strong expertise in building scalable batch and streaming data pipelines.
- Experience working on modern enterprise data platforms supporting analytics, AI/ML, and GenAI use cases.
- Good understanding of semantic layer concepts, reusable data models, and governed data consumption patterns.
- Experience working within large-scale data modernization and cloud transformation initiatives.
- Strong problem-solving, debugging, and performance optimization skills.
- Proven ability to lead engineering teams and collaborate across architecture, product, and business functions.
- Excellent communication and stakeholder management skills.
- GCP certifications such as Professional Data Engineer preferred.
Location: DGS India - Mumbai - Thane Ashar IT Park
Brand: Merkle
Time Type: Full time
Contract Type: Permanent
