Senior Site Reliability Engineer
Job Description
Join Arista Networks as a Senior Site Reliability Engineer (SRE)
Are you a seasoned engineer with a passion for building and operating scalable, resilient systems? Arista Networks, a leader in data-driven, cloud networking solutions, is seeking a talented Senior Site Reliability Engineer (SRE) to join our dynamic team.
About Arista Networks: Innovating the Future of Networking
At Arista, we’re not just building networks; we’re building the future. We empower organizations with cutting-edge, data-driven solutions that span large data centers, campus environments, and beyond. Our culture thrives on innovation, embracing cloud computing, artificial intelligence, and software-defined networking to deliver unparalleled value to our clients. We have been recognized as a top employer and for Best Engineering Team, Best Company for Diversity, Compensation, and Work-Life Balance.
The Opportunity: Shape the Reliability of CloudVision
As a Senior SRE, you will play a pivotal role in ensuring the reliability, performance, and security of our global CloudVision service fleet. CloudVision is our enterprise network management and streaming telemetry SaaS offering, built on Kubernetes and deployed across global regions using Spinnaker for CI/CD.
What You’ll Do:
- Design, implement, and maintain the CI/CD lifecycle for CloudVision services.
- Drive operational efficiency through automation and proactive monitoring.
- Define and track key service indicators for effective capacity planning.
- Take ownership of disaster recovery and management strategies.
- Champion infrastructure and cloud-based application security design.
- Lead sustainable incident response and conduct blameless postmortems to improve service reliability.
- Participate in a globally distributed on-call rotation.
Our Tech Stack: Embrace Cutting-Edge Technologies
You’ll be working with a modern tech stack, including:
- GKE (Google Kubernetes Engine)
- HBase/Hadoop
- Elasticsearch
- ClickHouse
- Kafka-based distributed real-time stream processing
- TensorFlow
- Prometheus, Grafana, Loki
Qualifications: Your Expertise
We’re looking for individuals with:
- BS/MS degree in Computer Science or equivalent experience.
- 5+ years of software engineering experience.
- Experience developing or managing deployments of distributed database systems or scale-out applications for a SaaS environment.
- A strong passion for Site Reliability Engineering (SRE) principles.
Compensation and Benefits
The base pay range for this role is $101,000 to $161,000. The actual base pay offered will be based on a wide range of factors, including skills, qualifications, relevant experience, and work location. In addition to base pay, certain roles may be eligible for discretionary Arista bonuses and equity. We also offer a comprehensive benefits package, including medical, dental, vision, wellbeing, tax savings, and income protection. During the hiring process, our recruiting team can share more details specific to the role and location.
Join Our Inclusive Team
Arista Networks is an equal opportunity employer. We value diversity and are committed to creating an inclusive environment where everyone can thrive.
If you’re ready to make a significant impact on the future of networking, we encourage you to apply!