Forbes Advisor

Data Research Engineer - Data Extraction

Posted 12 Hours Ago
Remote
Hiring Remotely in Mumbai, Maharashtra
Mid level
The Data Research Engineer will develop and oversee data extraction frameworks, ensuring data quality, implementing ETL processes, and collaborating with teams on effective data acquisition methods.
The summary above was generated by AI

Company Description

Forbes Advisor is a new initiative for consumers under the Forbes Marketplace umbrella that provides journalist- and expert-written insights, news and reviews on all things personal finance, health, business, and everyday life decisions. We do this by providing consumers with the knowledge and research they need to make informed decisions they can feel confident in, so they can get back to doing the things they care about most.

At Marketplace, our mission is to help readers turn their aspirations into reality. We arm people with trusted advice and guidance, so they can make informed decisions they feel confident in and get back to doing the things they care about most.

We are an experienced team of industry experts dedicated to helping readers make smart decisions and choose the right products with ease. Marketplace boasts decades of experience across dozens of geographies and teams, including Content, SEO, Business Intelligence, Finance, HR, Marketing, Production, Technology and Sales. The team brings rich industry knowledge to Marketplace’s global coverage of consumer credit, debt, health, home improvement, banking, investing, credit cards, small business, education, insurance, loans, real estate and travel.

The Data Extraction Team is a brand-new team that plays a crucial role in our organization by designing, implementing, and overseeing advanced web scraping frameworks. Its core function involves creating and refining tools and methodologies to efficiently gather precise and meaningful data from a diverse range of digital platforms. The team is also tasked with constructing robust data pipelines and implementing Extract, Transform, Load (ETL) processes, which transfer the harvested data into our data storage systems and keep it readily available for analysis and use.

A typical day in the life of a Data Research Engineer will involve acquiring and integrating data from various sources, developing and maintaining data processing workflows, and ensuring data quality and reliability. They collaborate with the team to identify effective data acquisition strategies and develop Python scripts for data extraction, transformation, and loading processes. They also contribute to data validation, cleansing, and quality checks. The Data Research Engineer stays updated with emerging data engineering technologies and best practices.

Job Description

Responsibilities:

• Develop methods and processes for data quality assurance (QA) to ensure accuracy, completeness, and integrity.

• Define and implement data validation rules and automated data quality checks.

• Perform data profiling and analysis to identify anomalies, outliers, and inconsistencies.

• Assist in acquiring and integrating data from various sources, including web crawling and API integration.

• Develop and maintain scripts in Python for data extraction, transformation, and loading (ETL) processes.

• Stay updated with emerging technologies and industry trends.

• Explore third-party technologies as alternatives to legacy approaches for efficient data pipelines.

• Contribute to cross-functional teams in understanding data requirements.

• Assume accountability for achieving development milestones.

• Prioritize tasks to ensure timely delivery, in a fast-paced environment with rapidly changing priorities.

• Collaborate with and assist fellow members of the Data Research Engineering Team as required.

• Make effective use of online resources such as Stack Overflow, ChatGPT, and Bard, while considering their capabilities and limitations.

Qualifications

Skills and Experience

• Bachelor's degree in Computer Science, Data Science, or a related field.

• Strong proficiency in Python programming for data extraction, transformation, and loading.

• Proficiency in SQL and data querying is a plus.

• Knowledge of Python modules such as Pandas, SQLAlchemy, gspread, PyDrive, BeautifulSoup, Selenium, scikit-learn, and Plotly.

• Knowledge of web crawling techniques and API integration.

• Knowledge of data quality assurance methodologies and techniques.

• Experience in AI/ML engineering and data extraction.

• Experience with LLMs and NLP frameworks (spaCy, NLTK, Hugging Face, etc.).

• Strong understanding of machine learning frameworks (TensorFlow, PyTorch).

• Experience with RESTful API design and integration.

• Design and build AI models using LLMs.

• Integrate LLM solutions with existing systems via APIs.

• Collaborate with the team to implement and optimize AI solutions.

• Monitor and improve model performance and accuracy.

• Familiarity with HTML, CSS, JavaScript.

• Familiarity with Agile development methodologies is a plus.

• Strong problem-solving and analytical skills with attention to detail.

• Creative and critical thinking.

• Ability to work collaboratively in a team environment.

• Good and effective communication skills.

• Experience with version control systems, such as Git, for collaborative development.

• Ability to thrive in a fast-paced environment with rapidly changing priorities.

• Comfortable with autonomy and ability to work independently.

Additional Information

Perks:

● Day off on the 3rd Friday of every month (one long weekend each month)

● Monthly Wellness Reimbursement Program to promote health and well-being

● Monthly Office Commutation Reimbursement Program

● Paid paternity and maternity leave

Top Skills

BeautifulSoup
CSS
gspread
HTML
JavaScript
Pandas
Plotly
PyDrive
Python
PyTorch
RESTful API
Selenium
scikit-learn
SQL
SQLAlchemy
TensorFlow
