YipitData

Data Pipeline Engineer

Posted 22 Hours Ago
Remote
Hiring Remotely in India
Mid level
As a Data Engineer, you will build and maintain end-to-end data pipelines, support data modeling best practices, and solve complex data pipeline challenges using PySpark and SQL. You'll collaborate with stakeholders to ensure alignment with business needs and participate in training on cutting-edge data tools.

About YipitData:

YipitData is the leading market research and analytics firm for the disruptive economy and recently raised up to $475M from The Carlyle Group at a valuation over $1B.

We analyze billions of alternative data points every day to provide accurate, detailed insights on ridesharing, e-commerce marketplaces, payments, and more. Our on-demand insights team uses proprietary technology to identify, license, clean, and analyze the data many of the world’s largest investment funds and corporations depend on.

For three years and counting, we have been recognized as one of Inc’s Best Workplaces. We are a fast-growing technology company backed by The Carlyle Group and Norwest Venture Partners. Our offices are located in NYC, Austin, Miami, Denver, Mountain View, Seattle, Hong Kong, Shanghai, Beijing, Guangzhou, and Singapore. We cultivate a people-centric culture focused on mastery, ownership, and transparency.

Why You Should Apply NOW:

  • You’ll be working with many strategic engineering leaders within the company.
  • You’ll report directly to the Director of Data Engineering.
  • You’ll be one of the founding members of our Data Engineering team presence in India.
  • You’ll work with a global team spanning three continents.
  • You’ll be challenged with a wide range of big data problems.

About The Role:

We are seeking a highly skilled Data Pipeline Engineer to join our dynamic Data Engineering team. The ideal candidate has 3-5 years of data engineering experience, a solid understanding of Spark and SQL, and hands-on data pipeline experience. Hired individuals will play a crucial role in supporting our strategic pipelines and optimizing them for reliability, efficiency, and performance.

Additionally, Data Engineering serves as the gold standard for all other YipitData analyst teams, building and maintaining the core pipelines and tooling that power our products. This high-impact, high-visibility team is instrumental to the success of our rapidly growing business. This is a unique opportunity to be one of the first hires on this team’s presence in India.

This is a hybrid opportunity based in India. (Currently, the role is remote, but once we establish an office space in India, the role will become hybrid. The hybrid work schedule is flexible, meaning the team [in collaboration with the manager] will determine when they come into the office.)

During training and onboarding, we will expect several hours of overlap with US working hours. Afterward, standard IST working hours are permitted, with the exception of 1-2 days per week when you will join meetings with the US team.

As A Data Pipeline Engineer, You Will:

  • Report directly to the Director of Data Engineering, who will provide significant, hands-on training on cutting-edge data tools and techniques.
  • Build and maintain end-to-end data pipelines.
  • Help set best practices for our data modeling and pipeline builds.
  • Create documentation, architecture diagrams, and other training materials.
  • Become an expert at solving complex data pipeline issues using PySpark and SQL.
  • Collaborate with stakeholders to incorporate business logic into our central pipelines.
  • Deeply learn Databricks, Spark, and other internally developed ETL tooling.

You Are Likely To Succeed If:

  • You hold a Bachelor’s or Master’s degree in Computer Science, STEM, or a related technical discipline.
  • You have 3+ years of experience as a Data Engineer or in other technical functions.
  • You are excited about solving data challenges and learning new skills.
  • You have a strong understanding of working with data and building data pipelines.
  • You are comfortable working with large-scale datasets using PySpark, Delta, and Databricks.
  • You understand business needs and the rationale behind data transformations to ensure alignment with organizational goals and data strategy.
  • You are eager to constantly learn new technologies.
  • You are a self-starter who enjoys working collaboratively with stakeholders.
  • You have exceptional verbal and written communication skills.
  • Nice to have: Experience with Airflow, dbt, Snowflake, or equivalent.

What We Offer:

Our compensation package includes comprehensive benefits, perks, and a competitive salary: 

  • We care about your personal life and we mean it. We offer vacation time, parental leave, team events, learning reimbursement, and more!
  • Your growth at YipitData is determined by the impact that you are making, not by tenure, unnecessary facetime, or office politics. Everyone at YipitData is empowered to learn, self-improve, and master their skills in an environment focused on ownership, respect, and trust.

We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, marital status, disability, gender, gender identity or expression, or veteran status. We are proud to be an equal-opportunity employer.


Top Skills

Airflow
Databricks
dbt
PySpark
Snowflake
Spark
SQL


