Location: Navi Mumbai (CBD Belapur)
Work Mode: 5 Days Work From Office
Experience: 5 – 10 Years
Qualification: B.Tech (CS/AI/ML/DS/EE preferred)
Company Type: Startup (candidate must be willing to work in a startup environment)
We are looking for a highly skilled Senior AI Scientist who is a subject matter expert in advanced AI/ML model development and research. The ideal candidate has strong hands-on experience with LLMs, transformers, diffusion models, and multimodal architectures, and is comfortable leading research-driven projects in a fast-paced startup environment.
This role requires deep technical expertise, strong coding abilities, and the ability to review, validate, and debug complex ML codebases.
Key Responsibilities
1. Research & Innovation
Lead research in LLMs, diffusion models, transformers, and multimodal AI systems.
Explore state-of-the-art literature and implement new ideas, techniques, and architectures.
Develop prototypes and run experiments for AI model advancement.
Design, train, fine-tune, and optimize deep learning models.
Work with large and diverse datasets across text, image, and multimodal domains.
Deliver production-ready models with high performance and reliability.
Identify, debug, and resolve coding errors in ML pipelines and model scripts.
Maintain clean, high-quality research and production code.
Collaborate closely with engineering teams for smooth integration.
Perform model benchmarking and statistical evaluations.
Conduct ablation studies and document experiments.
Present insights and findings with clear technical reasoning.
Strong research experience in deep learning, NLP, CV, generative AI, or related fields.
Hands-on expertise with transformers, LLMs, diffusion models, and multimodal systems.
Proficiency in Python, PyTorch / TensorFlow, and modern ML toolkits.
Ability to debug and optimize large-scale ML systems.
Strong mathematical foundations and problem-solving skills.
Ability to thrive in a fast-paced startup environment.
Research publications, patents, or open-source contributions.
Experience with distributed training, GPU optimization, or cloud ML platforms (AWS/GCP/Azure).
Familiarity with reproducible research tools and ML experimentation frameworks.



