Site Reliability Engineer
Job Description
Shape the Future of Real-Time Connectivity at PubNub as a Site Reliability Engineer
Are you passionate about ensuring the reliability and performance of systems that power real-time experiences for millions worldwide? PubNub, the leading platform for real-time data streaming, is seeking a talented and motivated Site Reliability Engineer (SRE) to join our dynamic team. If you’re looking to expand your skills, contribute to a global platform, and thrive in a remote-first environment, this is your opportunity!
About PubNub: Pioneering Real-Time Innovation
PubNub empowers developers and businesses to build engaging real-time applications. Trusted by over 2,000 companies, including industry leaders like Verizon, Autodesk, and Dropbox, we provide a robust and scalable platform for chat, IoT, live updates, and interactive experiences. Join us and be part of a team that’s revolutionizing how the world connects and interacts digitally. We offer a collaborative and supportive remote-first work environment.
Your Mission as a Site Reliability Engineer
As an SRE at PubNub, you’ll play a critical role in maintaining and improving the availability, performance, and scalability of our global real-time data streaming network. You’ll evolve from executing operational tasks to actively shaping system design and driving reliability enhancements. You’ll collaborate with senior engineers and cross-functional teams, gaining invaluable hands-on experience with high-performance systems processing billions of requests per minute.
This role provides a fantastic opportunity to build a strong foundation in SRE principles across the entire technology stack, from infrastructure and automation to incident leadership and business impact analysis. You’ll learn to balance immediate operational needs with long-term strategic improvements, developing the holistic system thinking essential for successful Site Reliability Engineers.
Key Responsibilities
- Support the design, maintenance, and continuous improvement of highly available systems scaling to 100 million concurrent connections.
- Operate global infrastructure across 15+ data centers using Infrastructure as Code tools (Terraform, Kubernetes, ArgoCD).
- Utilize observability tools (VictoriaMetrics, Grafana, Loki) to monitor system health, identify performance bottlenecks, and recommend optimizations.
- Create and maintain technical documentation and runbooks, and contribute to knowledge sharing within the team.
- Collaborate with service architects and developers to evaluate and implement new technologies improving stability and performance.
- Participate in incident response efforts, conduct post-incident reviews, and perform root cause analysis.
- Develop automation scripts to reduce operational toil and improve infrastructure management efficiency.
What You Bring to the Table
We’re looking for someone with a passion for reliability and a desire to learn and grow.
Essential Qualifications:
- 1-4 years of Site Reliability Engineering experience (or equivalent in DevOps, Infrastructure Engineering, or Platform Engineering).
- Proficiency in container technology with Docker fundamentals and basic Kubernetes operations.
- Practical experience with at least one major cloud platform (AWS preferred; GCP or Azure acceptable).
- Basic understanding of monitoring and observability concepts with experience using tools like VictoriaMetrics, Grafana, and Loki.
- A commitment to continuous learning and professional development.
Bonus Points:
- Experience with Infrastructure as Code practices and CI/CD concepts (Terraform, ArgoCD, GitOps).
- Experience with incident response and troubleshooting.
- Automation experience using Python and/or Bash.
- Solid problem-solving skills and attention to detail.
- Collaborative mindset and understanding of business impact.
Why Choose PubNub?
At PubNub, you’ll be part of a company that’s powering the future of real-time communication. We offer a supportive, collaborative, and remote-first culture where you can:
- Make a meaningful impact on a global scale.
- Expand your technical skills and expertise.
- Contribute to innovative, high-impact technology.
- Enjoy competitive compensation and benefits, including a salary range of PLN 14,000 to 20,300 per month on a B2B contract.
We are an Equal Employment Opportunity (EEO) employer committed to diversity and inclusion. Join us and help create a more connected future!