As a Sr Advanced Software Engineer at Honeywell, you will help develop and implement advanced software solutions for process automation. Your work will support the design, development, and deployment of technologies that improve operational efficiency and productivity.
You will report to the Engineering Manager and work from our Bangalore location on a hybrid schedule.
In this role, you will impact the automation of complex industrial processes by creating scalable, reliable, and efficient software systems that support Honeywell’s mission to deliver advanced technological solutions to our customers.
At Honeywell, our people leaders play a critical role in developing and supporting our employees to help them perform at their best and drive change across the company. Help build a strong, diverse team by recruiting talent, identifying and developing successors, driving retention and engagement, and fostering an inclusive culture.
KEY RESPONSIBILITIES
- Ensure the reliability, availability, and performance of the SaaS platform by managing production systems and driving improvements in uptime and resiliency.
- Monitor, analyze, and troubleshoot production environments, leveraging observability tools (logs, metrics, alerts) to proactively detect and resolve issues.
- Lead incident management and root cause analysis (RCA) activities, including participating in on-call rotations, driving postmortems, and implementing corrective actions to prevent recurrence.
- Design and implement monitoring, alerting, and SLO/SLI frameworks to continuously improve system reliability and user experience.
- Automate operational processes and reduce toil by developing scripts and infrastructure-as-code solutions using tools such as Terraform, Python, or Bash.
- Support and optimize cloud infrastructure (AWS/Azure) for scalability, performance, and cost efficiency, including compute, storage, networking, and load balancing components.
- Deploy, manage, and troubleshoot containerized applications using Docker and Kubernetes-based platforms.
- Collaborate with development, QA, and product teams to improve application reliability, release quality, and production readiness.
- Drive CI/CD pipeline improvements to ensure fast, stable, and secure deployments across environments.
- Support integrations between the Salesforce platform and cloud services (CSP/AWS), ensuring reliable data flow and API performance across distributed systems.
- Implement and maintain security best practices across applications and infrastructure, including:
  - Secure configurations
  - Data protection (encryption, secrets management)
  - Vulnerability remediation and compliance requirements
- Plan and execute disaster recovery (DR) and backup/restore strategies to ensure business continuity and system resilience.
- Conduct capacity planning and performance tuning to support growing workloads and enterprise-scale usage.
- Work in a global team environment, supporting deployments and operations across multiple regions and customer environments.
YOU MUST HAVE
- Strong experience in Site Reliability Engineering (SRE), Production Support, or Platform Engineering for large-scale, cloud-based SaaS applications.
- Hands-on experience with cloud platforms (AWS preferred, Azure acceptable) including provisioning, monitoring, and troubleshooting distributed systems.
- Experience managing and supporting high-availability, fault-tolerant systems with a strong focus on uptime, reliability, and scalability.
- Strong knowledge of containerization and orchestration technologies: Docker and Kubernetes (deployment, scaling, troubleshooting).
- Experience with CI/CD pipelines and DevOps practices using tools such as Jenkins, Bamboo, GitHub Actions, or Octopus.
- Strong scripting and automation skills using Terraform, Bash, Python, or PowerShell.
- Solid experience with Linux/Unix systems administration and understanding of networking fundamentals (TCP/IP, DNS, Load Balancing).
- Hands-on experience with monitoring and observability tools (e.g., Datadog, Prometheus, Grafana, Splunk, or similar).
- Strong experience with log analysis, metrics, alerting, and distributed tracing.
- Proven experience in incident management, root cause analysis (RCA), and postmortems for production issues.
- Experience managing SLIs, SLOs, and error budgets to drive system reliability.
- Ability to troubleshoot complex issues across applications, infrastructure, and network layers.
WE VALUE
- Experience working in a 24/7 production support model or on-call rotations.
- Exposure to SRE best practices (automation-first mindset, toil reduction).
- Experience collaborating with global teams (US, India, EMEA).
- Strong communication skills with the ability to guide developers on reliability improvements and production readiness.
- Experience with performance tuning, capacity planning, and cost optimization in cloud environments.
- BE/B.Tech or higher in Information Technology / Computer Science Engineering.
- Certification in cloud platforms and infrastructure (AWS/Azure).
- Certification in DevOps tools.



