As an AI Engineer at Crop.photo, you'll build AI features for image generation, design scalable APIs, integrate models into production, and ensure high-quality visual outputs without relying on formal handoffs.
We’re Crop.photo, a high-velocity AI startup powering creative automation for global brands like Lacoste, Urban Outfitters, and AP News. We help brands produce high-quality visuals (images, ads, banners) at scale, and we’re building the core visual intelligence engine that makes that possible.
Our engineers don’t just write code. They frame product logic, shape UX behavior, and ship features. No PMs handing down tickets. No design handoffs. If you think like an owner and love combining deep ML logic with hard product edges, this role is for you. You’ll work on systems that transform and generate millions of visual assets for enterprises of all sizes.
What You’ll Do
- Build and own AI-backed features end to end, from ideation to production, including layout logic, smart cropping, visual enhancement, out-painting and GenAI workflows for background fills
- Design scalable APIs that wrap vision models like BiRefNet, YOLOv8, Grounding DINO, SAM, CLIP, ControlNet, etc., into batch and real-time pipelines.
- Write production-grade Python code to manipulate and transform image data using NumPy, OpenCV (cv2), PIL, and PyTorch.
- Handle pixel-level transformations — from custom masks and color space conversions to geometric warps and contour ops — with speed and precision.
- Integrate your models into our production web app (AWS-based Python/Java backend) and optimize them for latency, memory, and throughput
- Frame problems when specs are vague — you’ll help define what “good” looks like, and then build it
- Collaborate with product, UX, and other engineers without relying on formal handoffs — you own your domain
- 4–6 years of hands-on experience with vision and image generation models such as YOLO, Grounding DINO, SAM, CLIP, Stable Diffusion, VITON, or TryOnGAN — including experience with inpainting and outpainting workflows using Stable Diffusion pipelines (e.g., Diffusers, InvokeAI, or custom-built solutions)
- Strong hands-on knowledge of NumPy, OpenCV, PIL, PyTorch, and image visualization/debugging techniques.
- 2–3 years of experience working with popular LLM APIs such as OpenAI, Anthropic, and Gemini, and composing multi-modal pipelines with them
- Solid grasp of production model integration — model loading, GPU/CPU optimization, async inference, caching, and batch processing.
- Experience solving real-world visual problems like object detection, segmentation, composition, or enhancement.
- Ability to debug and diagnose visual output errors — e.g., weird segmentation artifacts, off-center crops, broken masks.
- Deep understanding of image processing in Python: array slicing, color formats, augmentation, geometric transforms, contour detection, etc.
- Experience building and deploying FastAPI services and containerizing them with Docker for AWS-based infra (ECS, EC2/GPU, Lambda).
- A customer-centric approach — you think about how your work affects end users and product experience, not just model performance
- A quest for high-quality deliverables — you write clean, tested code and debug edge cases until they’re truly fixed
- The ability to frame problems from scratch and work without strict handoffs — you build from a goal, not a ticket
- You’ve built systems — not just prototypes
- You care about both ML results and the system’s behavior in production
- You’re comfortable taking a rough business goal and shaping the technical path to get there
- You’re energized by product-focused AI work — things that users feel and rely on
- You’ve worked in or want to work in a startup-grade environment: messy, fast, and impactful
- Full autonomy over your problem space
- A builder-first, no-handoff culture
- Remote-first flexibility (India preferred)
- Base + Variable + meaningful equity
- A product shipping to some of the world’s most recognizable brands
Top Skills
AWS
CLIP
Docker
FastAPI
Grounding DINO
NumPy
OpenCV
PIL
Python
PyTorch
SAM
Stable Diffusion
YOLO