Datavail

Senior Associate Cloud SRE

Posted 13 Days Ago
Hybrid
Mumbai, Maharashtra, IND
Senior level
The Senior Associate Cloud SRE delivers tier two cloud operations support for AWS and Azure, focusing on operational excellence, automation, and service reliability while managing complex issues and collaborating with teams.
The summary above was generated by AI

Job Title: Senior Associate Cloud SRE 

Education: Any Graduate

Experience: 4–8 years

Location: Mumbai (Hybrid Model)

Employment Type: Full-time

 

Overview

We are seeking a Site Reliability Engineer to deliver tier two cloud operations managed services support across AWS and Azure environments. This role combines advanced troubleshooting and operational excellence with proactive reliability engineering, focusing on maintaining 24x7x365 service availability while continuously improving automation and operational efficiency across multi-cloud infrastructure.

 

Role Summary

As a Site Reliability Engineer supporting multi-cloud infrastructure (AWS and Azure), you will manage complex operational challenges and escalations while implementing reliability best practices across production systems. You will work collaboratively with customer teams and senior engineers to ensure system stability, automate operational workflows, and maintain comprehensive observability. This is a delivery-focused role requiring both advanced technical execution and operational ownership across cloud platforms.

 

Primary Responsibilities:

 

Tier 2 Multi-Cloud Operations & Managed Services:

AWS Operations:

  • Provide 24x7x365 tier two support and escalation handling for AWS environments

  • Execute complex operational tasks including:

      • Patching and managing Amazon Machine Images (AMIs)

      • Creating and configuring EC2 instances and RDS databases

      • Managing IAM roles, users, and policies

      • Configuring S3 bucket policies and Access Control Lists (ACLs)

      • Opening and managing network routes (VPC, subnets, security groups)

      • Restoring snapshots and database backups to lower environments

      • Increasing disk sizes (EBS volumes) and managing storage optimization

      • Implementing proper tagging for environment identification and cost allocation

      • Managing log archiving using CloudWatch Logs and S3

 

Azure Operations:

  • Provide equivalent tier two support for Azure cloud environments

  • Execute Azure-specific operational tasks including:

      • Managing and updating Azure Virtual Machine images

      • Creating and configuring Azure Virtual Machines and Azure SQL databases

      • Managing Azure Active Directory (AAD) identities, roles, and role-based access control (RBAC)

      • Configuring Azure Storage account policies and access controls

      • Managing Virtual Networks, Network Security Groups (NSGs), and route tables

      • Restoring VM snapshots and database backups to lower environments

      • Managing disk resizing and Azure Managed Disks optimization

      • Implementing Azure resource tagging and cost management

      • Managing log archiving using Azure Monitor and Log Analytics

 

Cross-Cloud Responsibilities:

  • Handle escalations from tier one support with deep technical analysis across both platforms

  • Provide root cause analysis for complex incidents in multi-cloud environments

  • Implement consistent operational standards across AWS and Azure

  • Support hybrid cloud connectivity and integration scenarios

 

Reliability & Incident Management:

  • Implement and maintain Service Level Indicators (SLIs) and Service Level Objectives (SLOs) across AWS and Azure in collaboration with senior engineers and customer stakeholders

  • Lead tier two incident response, performing advanced troubleshooting and resolution on both cloud platforms

  • Conduct thorough post-incident analysis with actionable remediation plans

  • Reduce reactive work by improving runbooks, alert configurations, and standard operating procedures for both clouds

  • Apply reliability engineering best practices with oversight and review

  • Mentor tier one engineers during incident response across multi-cloud scenarios

 

Automation & Infrastructure as Code:

  • Build and maintain CI/CD pipelines for infrastructure and application deployments on AWS and Azure

  • Automate complex operational tasks including patching, backups, and environment provisioning across both platforms

  • Develop infrastructure automation using Terraform for multi-cloud environments

  • Create sophisticated scripts and tooling to eliminate manual toil and improve operational efficiency

  • Implement Azure Resource Manager (ARM) templates or Bicep for Azure-specific automation

  • Follow established patterns and contribute continuous improvements

  • Document automation processes for knowledge sharing across cloud platforms

 

Containerization & Deployment:

  • Deploy and operate containerized workloads using Docker on AWS services (ECS, EKS) and Azure services (AKS, Azure Container Instances)

  • Support container reliability through proper health checks, autoscaling configurations, and resource management on both platforms

  • Implement safe deployment patterns (canary deployments, blue/green deployments) across AWS and Azure

  • Troubleshoot complex containerization and orchestration issues in multi-cloud Kubernetes environments

  • Follow and enhance established containerization standards across both cloud providers

Observability & Performance:

  • Configure and maintain comprehensive monitoring, logging, and alerting systems across AWS CloudWatch and Azure Monitor

  • Leverage observability data to identify issues and lead root cause analysis in multi-cloud environments

  • Contribute to performance tuning and cost optimization initiatives across both platforms

  • Ensure proper instrumentation and telemetry across AWS and Azure environments

  • Identify patterns and trends to prevent future incidents

  • Build custom dashboards and reports using CloudWatch, Azure Monitor, and third-party tools (Datadog, Grafana)

 

Collaboration & Customer Engagement:

  • Work closely with customer development and operations teams to improve system operability across cloud platforms

  • Participate in design reviews and reliability assessments for multi-cloud architectures

  • Communicate technical concepts, tradeoffs, and recommendations clearly to stakeholders

  • Provide regular operational updates and service reports covering both AWS and Azure

  • Act as technical liaison between customers and internal engineering teams

 

Required Qualifications & Experience:

  • 3–5 years of hands-on experience in DevOps, SRE, or production operations roles

  • Proven experience operating production systems in AWS OR Azure (deep expertise in one required)

  • Working knowledge or exposure to the secondary cloud platform (ability to learn and support)

  • Demonstrated experience managing containerized applications in production

  • Experience delivering managed services or supporting customer-facing infrastructure

  • Track record of handling complex technical escalations in cloud environments

Technical Skills - Primary Cloud Platform (AWS OR Azure):

 

For AWS-Primary Candidates:

  • AWS Services (Expert): Deep knowledge of EC2, RDS, S3, IAM, VPC, CloudWatch, Lambda, and related services

  • AWS Networking (Expert): Strong experience with VPCs, subnets, security groups, route tables, and VPN/Direct Connect

  • AWS Storage (Expert): Proficiency with EBS, S3, and backup/restore strategies

  • AWS Containers (Expert): Hands-on experience with ECS, EKS, or Fargate

  • Azure (Foundational): Basic understanding of Azure services with willingness to learn; exposure to Azure VMs, Storage, or networking is a plus

 

For Azure-Primary Candidates:

  • Azure Services (Expert): Deep knowledge of Azure VMs, Azure SQL, Storage Accounts, Azure AD, Virtual Networks, Azure Monitor

  • Azure Networking (Expert): Strong experience with VNets, NSGs, Application Gateway, Azure Firewall, and ExpressRoute

  • Azure Storage (Expert): Proficiency with Managed Disks, Blob Storage, and Azure Backup

  • Azure Containers (Expert): Hands-on experience with AKS (Azure Kubernetes Service) and Azure Container Instances

  • AWS (Foundational): Basic understanding of AWS services with willingness to learn; exposure to EC2, S3, or VPC is a plus

 

Technical Skills - Cross-Platform (All Candidates):

  • Infrastructure as Code: Proficiency with Terraform (preferred) or CloudFormation/ARM templates

  • CI/CD: Experience building and maintaining automated deployment pipelines (Azure DevOps, GitHub Actions, Jenkins, GitLab CI)

  • Scripting/Programming: Proficiency in Python, PowerShell, Bash, or similar languages

  • Containerization: Strong Docker skills and Kubernetes experience

  • Monitoring & Logging: Experience with cloud-native monitoring tools and/or third-party observability platforms (Datadog, Splunk, ELK, Grafana)

  • Version Control: Proficiency with Git and collaborative development workflows

  • Troubleshooting: Advanced diagnostic and problem-solving capabilities

 

Operational Capabilities:

  • Experience with 24x7 operations and tier two escalation support

  • Strong troubleshooting and root cause analysis skills

  • Understanding of networking concepts, security best practices, and compliance requirements

  • Familiarity with backup/restore procedures and disaster recovery planning

  • Ability to work under pressure during critical incidents

  • Experience coordinating across distributed teams

  • Willingness and ability to quickly learn the secondary cloud platform

 

Preferred Qualifications & Certifications:

  • AWS Certifications (for AWS-primary): Solutions Architect Associate, SysOps Administrator, or DevOps Engineer Professional

  • Azure Certifications (for Azure-primary): Azure Administrator Associate (AZ-104) or Azure Solutions Architect Expert (AZ-305)

  • Cloud-agnostic certifications (Terraform Associate, CKA, or SRE Foundation)

 

Additional Preferred Experience:

  • Any hands-on experience with both AWS and Azure (even if limited in one)

  • Experience with Kubernetes in production environments

  • Prior consulting or managed services provider experience

  • Experience with hybrid cloud or cloud migration projects

  • Experience with configuration management tools (Ansible, Chef, Puppet)

  • Knowledge of security and compliance frameworks (HIPAA, SOC 2, PCI-DSS)

  • Experience in high-traffic or mission-critical industries

  • Experience with cost optimization and FinOps practices

  • Multi-cloud architecture or implementation experience

About Us

Datavail is a leading provider of data management, application development, analytics, and cloud services, with more than 1,000 professionals helping clients build and manage applications and data via a world-class tech-enabled delivery platform and software solutions across all leading technologies. For more than 17 years, Datavail has worked with thousands of companies spanning different industries and sizes, and is an AWS Advanced Tier Consulting Partner, a Microsoft Solutions Partner for Data & AI and Digital & App Innovation (Azure), an Oracle Partner, and a MySQL Partner.

About the Team

Datavail’s Team of Cloud Experts Can Save You Time and Money
Our Cloud experts can overcome every obstacle in helping clients manage everything from databases, analytics, reporting, migrations, and upgrades to monitoring and overall data management.
You can free up your IT resources to focus on growing your business rather than fighting fires. Our Cloud experts can guide you through strategic initiatives or support routine database management.
Cloud Managed Services
Datavail’s business focuses on helping you use your data to drive business results through cost-saving services. The success of your business depends on how well you understand and manage your data. Our managed cloud services give you the power to unleash your organization’s potential. We provide comprehensive and technically advanced support for Cloud Operations to ensure that your infrastructure is safe, secure, and managed with the utmost level of care.
Our delivery performance in data management leads the industry. We offer highly trained Cloud administrators via a 24×7, always on, always available, global delivery model.
The combination of a proven delivery model and top-notch experience ensures that Datavail will remain the on-demand Cloud experts you desire. Datavail’s flexible and client-focused services always add value to your organization.

Top Skills

AWS
Azure
Azure Monitor
Bash
CI/CD
CloudWatch
Docker
Kubernetes
PowerShell
Python
Terraform
