ZainTECH

Data & AI Operations Specialist

Posted 25 Days Ago
Remote
Hiring Remotely in India
Mid level
The Data & AI Operations Specialist leads technical operations for AI infrastructure, manages data pipelines, and oversees MLOps across multi-cloud environments, ensuring compliance and performance optimization.

The Data & AI Operations Specialist serves as the Level 3 technical lead for the Artificial Intelligence and Data Platform estate. You will be responsible for the architecture, engineering, and advanced troubleshooting of AI infrastructure, data pipelines, and MLOps lifecycles across a multi-cloud environment (Azure and OCI).

Responsibilities:

AI Infrastructure & Platform Engineering

  • Design & Architecture: Maintain the monitoring architecture for AI/ML platforms and configure advanced dashboards in Grafana and Azure Monitor.
  • Environment Governance: Manage Azure Machine Learning (AML) workspace configurations, compute targets, and Databricks cluster lifecycles (including runtime versions and platform patching).
  • Resource Optimization: Oversee GPU resource allocation, reserved capacity, and cost-performance optimization to align with FinOps goals.
  • Security Integration: Ensure all AI services use private endpoints, VNet integration, and RBAC controls to protect sensitive citizen data.

Data Pipeline & ETL Management

  • Pipeline Engineering: Own the design, optimization, and remediation of Azure Data Factory (ADF) and Synapse pipelines.
  • Advanced Troubleshooting: Resolve complex bottlenecks related to authentication failures, data format changes, and ETL performance.
  • SOP Leadership: Author step-by-step Standard Operating Procedures (SOPs) for the L1 NOC team to handle routine monitoring and first-line triage.

MLOps & Model Lifecycle

  • Automation: Implement CI/CD pipelines for model training, testing, and deployment to AML endpoints.
  • Model Reliability: Configure data drift detection thresholds and automated retraining triggers.
  • Recovery Operations: Develop self-healing scripts and automated recovery runbooks for critical AI workflows.

Governance & Compliance

  • Audit Management: Implement and maintain audit logging for all AI decisions and model outputs, ensuring logs flow to the SIEM/vSOC.
  • Regulatory Alignment: Conduct quarterly AI governance reviews to ensure compliance with NESA standards and data privacy guidelines.

Requirements:

  • AI/ML Platforms: Deep expertise in Azure Machine Learning and Databricks.
  • Data Integration: Proficiency in Azure Data Factory and Synapse.
  • Infrastructure-as-Code (IaC): Experience with Terraform or ARM Templates for reproducible deployments.
  • Observability: Ability to use Dynatrace, Grafana, and Azure Monitor for deep-tier diagnostics.
  • Containerization: Knowledge of AKS, Istio Service Mesh, and KEDA.
  • ITIL Mastery: Strong understanding of ITIL-aligned Incident, Change, and Problem management.
  • Security Mindset: Familiarity with NESA standards and UAE data residency requirements.
  • Technical Writing: Ability to draft complex SOPs and to deliver Root Cause Analysis (RCA) documents within 48 hours of an incident.
  • Certifications: Microsoft Azure Data Scientist Associate or Azure AI Engineer Associate is highly preferred.
