Site Reliability Engineer – Core C++ Team

ClickHouse
📍 Canada 💼 full_time 📅 2 days ago

Job Description

Site Reliability Engineer (SRE) – ClickHouse Core

About ClickHouse: Fueling Fast Analytics at Scale

Join the forefront of high-performance data analysis. Since 2009, ClickHouse has been synonymous with speed in the world of OLAP databases. Our open-source, column-oriented database system lets users generate real-time analytical reports directly via SQL, and it handles massive data volumes, enabling critical insights for leading enterprises globally, including Lyft, Sony, IBM, GitLab, and HubSpot. Whether self-hosted from the open-source release or run through our managed ClickHouse Cloud service on platforms like AWS, GCP, and Azure, ClickHouse delivers speed and efficiency at scale.

Note: This position is fully remote and open to candidates based in any country where ClickHouse has a hiring presence.

The Opportunity: Shape Reliability for the Core Database

At ClickHouse, ensuring our customers experience reliable, secure, and lightning-fast service is paramount. We are expanding our dedicated Site Reliability Engineering (SRE) team focused on the core database engine: ClickHouse Core. As one of the foundational members of this critical team, you will play a pivotal role in shaping our approach to operational excellence. You will be instrumental in building and refining the processes that guarantee the reliability, availability, scalability, and peak performance of the ClickHouse database engine itself. This isn’t just about monitoring; you’ll collaborate closely with engineering teams (Control Plane, Dataplane, Security) and operations to implement best practices, own key operational areas like incident management, escalation response, and post-mortem analysis, and drive continuous improvement. This is your chance to make a tangible, high-impact contribution to our elastic, high-performance ClickHouse Cloud offering, which is built for limitless scale.

What You’ll Be Doing: Key Responsibilities

You will be at the heart of ensuring ClickHouse Core runs smoothly and reliably for users worldwide. Your responsibilities will include:

  • Continuously drive initiatives to improve the reliability and performance of the ClickHouse core database engine.
  • Develop and enhance comprehensive monitoring and alerting systems to proactively identify and prevent production issues before they impact customers.
  • Dive deep into complex customer-reported problems and production incidents within ClickHouse Core: perform root cause analysis, submit fixes, report issues, and propose fundamental improvements.
  • Refine and optimize incident response processes and post-mortem analysis frameworks for ClickHouse Core outages, collaborating with Support and Cloud teams for customer communication.
  • Plan, champion, and execute Chaos Engineering experiments across engineering teams to build more resilient systems.
  • Manage and evolve on-call processes, establishing best practices for coordinating escalation management to quickly resolve issues and minimize customer impact.

What You’ll Bring: Skills & Experience

To succeed in this high-impact role, you should have:

  • Bachelor’s or Master’s degree in Computer Science or a related technical field.
  • At least 8 years of progressive experience in Reliability Engineering, Quality Assurance, or a customer-facing engineering role with a strong operational focus.
  • Proven hands-on experience operating ClickHouse or other complex SQL databases in a production environment.
  • Excellent understanding of distributed database internals and SQL, with specific knowledge of ClickHouse being a significant advantage.
  • Solid scripting skills (Shell or Python) and the ability to read and understand C++ code used in database systems.
  • Practical knowledge of major cloud platforms such as AWS, Azure, or GCP.
  • A reputation as a strong problem-solver with exceptional production debugging capabilities.
  • Ability to thrive and contribute effectively in a fast-paced, globally distributed team environment.
  • High degree of responsibility, ownership, and accountability for your work and the systems you support.
  • Outstanding communication and collaboration skills.

Why Join the ClickHouse Team?

Be part of a dynamic, global team shaping the future of fast analytics. We are a fully distributed company operating in over 20 countries, offering a truly flexible work environment. This position is remote-friendly, open in any country where ClickHouse has a hiring presence.

As one of our early team members, you’ll have a significant voice in defining our evolving company culture. We offer competitive compensation, including employee equity via stock options, contributions towards healthcare benefits, flexible time off, and a home office setup stipend for remote employees. We also value in-person connection and offer opportunities for global team gatherings.

Compensation & Benefits

Compensation for this role is competitive and varies based on location, experience, qualifications, and business needs. Specific ranges for the United States are discussed during the hiring process. For any compensation-related inquiries, please contact us at paytransparency@clickhouse.com.

Equal Opportunity & Privacy

ClickHouse is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate based on race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. For details on our Privacy Statement, please refer to the provided link.
