At Mitratech, we are a team of technocrats focused on building world-class products that simplify operations in the Legal, Risk, Compliance, and HR functions of Fortune 100 companies. We are a close-knit, globally dispersed team that thrives in an ecosystem that supports individual excellence and takes pride in its diverse and inclusive work culture centered around great people practices, learning opportunities, and having fun! Our culture is the ideal blend of entrepreneurial spirit and enterprise investment, giving you the chance to move at a rapid pace while working with some of the most complex, leading-edge technologies available.
Given our continued growth, we always have room for more intellect, energy, and enthusiasm - join our global team and see why it's so special to be a part of Mitratech!
Job Overview:
We are looking for an experienced and passionate DevOps & SRE Manager to lead our DevOps and Site Reliability Engineering teams. The ideal candidate will be responsible for building and maintaining scalable, reliable, and high-performing infrastructure and operational processes. As a DevOps & SRE Manager, you will play a key role in ensuring our development, deployment, and operational practices align with industry standards while fostering a culture of automation and continuous improvement.
Key Responsibilities:
Leadership & Team Management:
- Lead, mentor, and develop a team of DevOps engineers and SREs to drive innovation and operational excellence.
- Build a collaborative and inclusive team culture focused on delivering high-quality services.
- Establish and track goals for your team to align with business objectives.
Infrastructure Automation & Scalability:
- Design, implement, and manage highly available and scalable cloud infrastructure (AWS, Azure, or OCI).
- Oversee the implementation of Infrastructure as Code (IaC) tools (e.g., Terraform, Bicep, Ansible) to automate provisioning and configuration.
- Identify and address bottlenecks in deployment pipelines and infrastructure performance.
Site Reliability Engineering:
- Lead efforts to define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
- Drive incident management processes to quickly detect, mitigate, and resolve issues, and ensure post-mortem analyses feed into continuous improvement.
- Optimize and enhance monitoring, logging, and alerting systems (e.g., Datadog, Splunk, Prometheus, Grafana, ELK stack).
Continuous Integration and Continuous Deployment (CI/CD):
- Establish and refine CI/CD pipelines to ensure smooth software releases with minimal or zero downtime.
- Collaborate with development teams to implement DevOps best practices and ensure code quality, security, and performance.
Security & Compliance:
- Implement and oversee security best practices in DevOps and operational workflows, including secrets management, vulnerability scans, and automated patching.
- Ensure compliance with relevant regulations and standards (e.g., GDPR, SOC 2, ISO 27001).
Collaboration & Communication:
- Work cross-functionally with product, engineering, and operations teams to ensure alignment on goals and priorities.
- Provide regular updates to stakeholders on system health, incidents, and improvement initiatives.
Cost Optimization:
- Analyze cloud and infrastructure costs, identify opportunities for savings, and implement cost optimization strategies.
- Manage budgets and vendor relationships for tools and services used by the team.
Qualifications:
Education:
- Bachelor’s degree in Computer Science, Engineering, or a related field. A Master’s degree is a plus.
Experience:
- Proven experience managing DevOps or SRE teams in fast-paced environments.
- Hands-on expertise in cloud platforms (AWS, Azure, OCI, GCP) and containerization technologies (Docker, Kubernetes).
- Deep understanding of the software development lifecycle (SDLC) and Agile practices.
- Track record of driving operational efficiency, incident resolution, and automation.
Technical Skills:
- Expertise in CI/CD tools (e.g., Jenkins, CircleCI, Azure DevOps).
- Experience operating Kubernetes platforms such as AKS, EKS, or similar.
- Experience using programming languages such as Python, Go, C#, Java, or similar.
- Experience designing tooling to simplify the operational management of SaaS/PaaS systems.
- Experience with monitoring and observability tools (e.g., Prometheus, Datadog, ELK Stack).
- Strong knowledge of infrastructure-as-code tools (e.g., Terraform, Bicep, CloudFormation).
Soft Skills:
- Excellent leadership and people management abilities.
- Strong problem-solving skills and attention to detail.
- Exceptional communication skills to collaborate across teams and with stakeholders.
We are an equal-opportunity employer that values diversity at all levels. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, national origin, age, sexual orientation, gender identity, disability, or veteran status.