Interface AI

DevOps Engineer III

Posted 4 Days Ago
Remote
Hiring Remotely in India
Senior level
The DevOps Engineer III will own and design infrastructure end-to-end, enable AI workloads, manage cloud-native environments, and mentor teams.
The summary above was generated by AI

Banking is being reimagined, and customers expect every interaction to be easy, personal, and instant.

We are building a universal banking assistant that millions of U.S. consumers can use to transact across all financial institutions and, over time, autonomously drive their financial goals. Powered by our proprietary BankGPT platform, this assistant is positioned to displace age-old legacy systems within financial institutions and own the end-to-end CX stack, unlocking a $200B opportunity and potentially replacing multiple publicly traded companies.

Ultimately, our mission is to drive financial well-being for millions of consumers.

With over two-thirds of Americans living paycheck to paycheck, 50% holding less than $500 in savings, and only 17% financially literate, we aim to put financial well-being on autopilot to help solve this problem.


Role – DevOps Engineer III 

Location: India (Remote)
Function: Engineering – Product Engineering
Level: Senior
Reports to: Engineering Manager – Product Engineering

About the Role

At interface.ai, we are building BankGPT – the world’s first AI-powered digital banking platform that leverages large language models, multi-agent orchestration, real-time streaming, and voice AI. To support this mission, we are seeking a DevOps Engineer III who will own infrastructure end-to-end, design systems from scratch, and enable highly resilient AI workloads at scale.

This is a senior, hands-on role that requires deep expertise in cloud-native DevOps, infrastructure automation, observability, and security. You’ll not only build and optimize systems but also influence best practices, mentor peers, and contribute to critical decision-making around platform reliability and scalability.


What You’ll Do

  • Infrastructure Ownership – Design, implement, and scale infra across AWS, GCP, or Azure; drive high availability, multi-AZ, and DR/BCP strategies.
  • Cloud-Native Enablement – Build and manage Kubernetes clusters (EKS/GKE), service mesh (Istio/Linkerd), and ingress controllers for secure and resilient workloads.
  • CI/CD & Automation – Architect CI/CD pipelines (ArgoCD/GitOps, Jenkins) and build custom deployment portals and automation tools to accelerate developer productivity.
  • AI/LLM Reliability – Define and track key metrics (latency, cost, throughput, containment) for AI/LLMs and agent workflows.
  • Observability & Tracing – Implement end-to-end tracing for multi-turn queries and real-time pipelines using OpenTelemetry, Prometheus, and Grafana.
  • Vector Databases – Manage and tune vector DBs (Pinecone, Weaviate, Milvus, etc.) for high concurrency, hybrid retrieval, reranking, and resilience.
  • Resilience & Scaling – Design autoscaling, failover, and health-check–based routing strategies for workloads like WebSockets, RAG pipelines, and voice (STT/TTS).
  • Scripting & Tooling – Write Bash/Python/Go scripts for operational tooling, log rotation, API integrations, and rollout automation.
  • Collaboration – Partner with AI and engineering teams to support complex workflows, while driving DevOps best practices across the organization.

 

What You’ll Bring

  • 5–8 years of core DevOps experience with a strong track record of building infra from scratch (not just maintaining existing systems).
  • Deep expertise in Docker, Kubernetes, Helm, and container orchestration.
  • Hands-on with Terraform, Crossplane, and declarative infra management.
  • Strong experience in CI/CD pipelines (ArgoCD, Jenkins, GitOps workflows) and building custom automation.
  • Proven ability to deploy AI/LLMs & agent workflows reliably in production.
  • Expertise in defining/tracking AI workflow metrics and observability of multi-turn queries.
  • Mandatory expertise with vector databases – tuning, scaling, and optimizing retrieval performance.
  • Proficiency in monitoring & logging tools (Prometheus, Grafana, OpenTelemetry, ELK/OpenSearch).
  • Familiarity with service mesh (Istio/Linkerd), networking, and multi-cluster workloads.
  • Proficiency in scripting/programming (Python, Bash, Go preferred).
  • Knowledge of security best practices in cloud environments (IAM, secrets, secure networking).

Bonus Points:

  • Experience working on AI-enabled or ML-integrated platforms
  • Understanding of compliance, security, and auditability requirements in regulated environments
  • Prior experience working in fast-paced, high-growth product teams

Why Join Us?

  • Work on the most impactful and revenue-generating platform in the organization
  • Build infrastructure for real-time intelligent agents—across both chat and voice modalities
  • Collaborate with top-tier AI researchers, designers, and engineers in a fast-moving, high-trust environment
  • Help shape the future of human-AI interaction in a mission-driven company at scale

Ready to lead with impact? Apply now.



At interface.ai, we are committed to providing an inclusive and welcoming environment for all employees and applicants. We celebrate diversity and believe it is critical to our success as a company. We do not discriminate on the basis of race, color, religion, national origin, age, sex, gender identity, gender expression, sexual orientation, marital status, veteran status, disability status, or any other legally protected status. All employment decisions at interface.ai are based on business needs, job requirements, and individual qualifications. We strive to create a culture that values and respects each person's unique perspective and contributions. We encourage all qualified individuals to apply for employment opportunities with interface.ai and are committed to ensuring that our hiring process is inclusive and accessible.

Top Skills

ArgoCD
AWS
Azure
Bash
Docker
ELK
GCP
Go
Grafana
Helm
Istio
Jenkins
Kubernetes
Linkerd
OpenTelemetry
Prometheus
Python
Terraform
