Senior DevOps Engineer
Zoom · careers.zoom.com
Excited to grow your career?
We value our talented employees, and whenever possible strive to help one of our associates grow professionally before recruiting new talent to our open positions. If you think the open position you see is right for you, we encourage you to apply!
Our people make all the difference in our success.
What you can expect
We are looking for a highly skilled Senior DevOps Engineer with deep expertise in Kubernetes to join our infrastructure team. You will be responsible for designing, deploying, and managing high-availability Kubernetes clusters with more than 1,000 nodes in cloud and on-premise (datacenter) environments. You will work closely with cross-functional teams to enhance automation, improve observability, and ensure the high availability of mission-critical systems.
About the Team
With eight specialized departments, the engineering team functions as a highly collaborative, diverse powerhouse. Each department's mission is to deliver seamless and innovative communication solutions. We are a team of infrastructure engineers primarily focused on the implementation and management of Kubernetes clusters and related infrastructure across various cloud environments and colocation facilities. The key responsibilities of the team include, but are not limited to, maintaining high availability of the infrastructure and designing and developing automation to improve operational efficiency.
What we’re looking for
Have 5+ years of experience in DevOps or SRE roles
Be able to participate in on-call shifts and incident management, and to work after hours or on weekends for application releases and deployments
Show deep expertise in Kubernetes cluster deployment, upgrades, and migration from cloud to on-premise
Have experience in designing and implementing high-availability cloud (AWS, Azure, OCI) and on-premise environments while ensuring security, scalability, and reliability
Have in-depth knowledge of implementing and managing Kubernetes design patterns, including Istio, Operators, Kubernetes CNIs, API Gateways, autoscaling, and backup solutions
Show proficiency in Kubernetes infrastructure provisioning using Terraform and Ansible, and in application deployment using Helm, Kustomize, etc.
Have experience setting up monitoring, logging, and alerting to ensure Kubernetes cluster health and performance
Be able to troubleshoot Kubernetes issues, including container runtime (Docker, containerd), CNI, cluster add-on, and microservice-level issues
Automate pipelines, and design and develop Kubernetes Operators to streamline Kubernetes operations; deploy storage solutions (e.g., Ceph, EBS, NFS) for the Kubernetes platform