Site Reliability Engineer

Iodine Software

📍 USA 💼 Full-time

📅 Posted 4 days ago

🧠 AWS, Kubernetes, Python

Job Description

Ready to Engineer the Future of Healthcare?

Join us in transforming patient care through innovative clinical AI. At Iodine, your work as a Site Reliability Engineer (SRE) won’t just build infrastructure – it will directly impact healthcare outcomes, enabling smarter processes and empowering clinicians.

Who We Are: Revolutionizing Healthcare with AI

Recognized as a premier workplace in Austin, Iodine is a collaborative, innovation-driven enterprise AI company. We’re challenging conventional approaches to healthcare by automating complex clinical workflows, generating actionable insights, and empowering intelligent care delivery. Our groundbreaking Cognitive ML engine, fueled by one of the industry’s largest clinical datasets, provides real-time, predictive insights that dramatically enhance patient care management.

The Opportunity: Site Reliability Engineer – AWS

We are seeking a highly skilled and passionate Site Reliability Engineer with deep expertise in AWS Cloud. In this critical role, you will be instrumental in designing, building, and optimizing our cutting-edge cloud platform and infrastructure. You’ll leverage your extensive AWS knowledge across compute, storage, databases, networking, and security, with a keen focus on cost optimization and operational excellence. You’ll drive the evolution of our cloud strategy, design massively scalable solutions, and ensure the reliability, security, and efficiency of our platform as we grow.

Shape Our Cloud Future

Implement our cloud strategy and roadmap, focusing on driving scalability, reliability, security, and cost efficiency. Lead cloud adoption initiatives and champion governance, architectural best practices, and modernization efforts.

Build & Automate for Scale

Develop robust Infrastructure as Code (IaC) using tools like Terraform or AWS CDK to achieve fully automated provisioning and deployment. Own and enhance infrastructure CI/CD pipelines utilizing GitLab, Ansible (AWX), Argo CD, and Helm. Design and implement self-healing, fault-tolerant architectures.

Ensure Operational Excellence

Optimize infrastructure monitoring and observability using tools such as Prometheus, Grafana, Loki, Tempo, Mimir, AWS CloudWatch, AWS CloudTrail, and New Relic. Conduct regular system maintenance, including OS patching and upgrades for core services like RDS and Kubernetes (EKS) clusters. Troubleshoot and resolve infrastructure and application issues, collaborating across teams and supporting customer escalations. Lead incident management processes, define SLOs/SLIs, and conduct post-mortems for continuous improvement.

Champion Security & Reliability

Embed stringent cloud security best practices across all solutions, focusing on IAM policies, VPC security, encryption, and compliance standards like SOC 2 and HIPAA. Implement least privilege access, network segmentation, and automated security controls. Collaborate closely with InfoSec on threat detection, logging, and security monitoring using AWS GuardDuty, Security Hub, and CloudTrail. Design and execute multi-region Disaster Recovery (DR) and backup strategies.

Drive Cost Efficiency (FinOps)

Proactively monitor and optimize AWS infrastructure costs using tools like AWS Cost Explorer and Trusted Advisor, leveraging Savings Plans/Reserved Instances. Foster a FinOps culture, promoting cost-aware design and deployment. Implement auto-scaling, rightsizing, and storage lifecycle policies to reduce expenses.

Collaborate and Architect

Participate in key architectural discussions with product engineering teams to ensure new and existing services adhere to best practices for scalability, cost-efficiency, and operational excellence. Collaborate with software developers to optimize application performance and cloud-native designs. Design and build highly available, scalable, and fault-tolerant AWS architectures utilizing a wide array of services (EC2, S3, RDS, DocumentDB, Lambda, EKS, Secrets Manager, SSM, API Gateway, CloudFront) and related technologies (HashiCorp Terraform, Vault, Consul, Ansible).

What You’ll Bring: Skills & Experience

5+ years of hands-on experience in SRE or DevOps roles, specifically within the AWS ecosystem.
Deep practical expertise with core AWS services including EC2, S3, Lambda, EKS, VPC, IAM, Secrets Manager, and SSM, alongside technologies like HashiCorp Vault and Consul.
Proven ability to implement significant cost optimization techniques in AWS (e.g., autoscaling, right-sizing, RIs/Savings Plans).
Strong command of Infrastructure as Code (IaC) using Terraform, CloudFormation, or AWS CDK.
Proficiency in Linux administration and scripting (Python, Bash).
Extensive experience with containerization and orchestration, particularly Kubernetes (EKS) and Docker.
Solid understanding and practical application of cloud security concepts (IAM, security groups, WAF, CloudTrail) and compliance frameworks.
Experience with modern monitoring and observability tools (Prometheus, Grafana, CloudWatch, Loki, New Relic).
Familiarity with code review processes for infrastructure code quality.
Excellent collaboration, documentation, and communication skills.
Ability to travel to headquarters for mandatory meetings/onboarding.

Bonus Points (Nice to Have):

Relevant AWS Certifications (e.g., Professional level).
Experience managing multi-account AWS environments or AWS Control Tower.
Familiarity with service meshes (Istio, Linkerd) or API gateways.
Experience with network security appliances (FortiGate) or advanced AWS networking (Transit Gateway, Direct Connect).
Background in database administration (PostgreSQL, MySQL, DocumentDB, NoSQL).
Experience with resilience testing or chaos engineering.

Why This Role is Unique:

Influence and lead the AWS cloud strategy and architecture for a rapidly growing healthcare tech company.
Work on cutting-edge cloud technologies in a high-impact environment.
Drive innovation and best practices in SRE and cloud engineering.
Make a tangible difference by optimizing our cloud costs and operational efficiency.
Collaborate closely with talented engineering and product teams.

Why Join Iodine? Our Mission & Culture

Become part of a dedicated, passionate, and ambitious team with a proven track record. At Iodine, you’ll contribute directly to improving healthcare processes through technology, allowing hospitals to focus on what matters most: patient care. We offer a unique opportunity to join a close-knit team during an exciting growth phase. Discover more about our culture on Built In Austin and our website.

Comprehensive Benefits & Perks:

We invest in our employees’ well-being and professional growth:

Health & Wellness: Fully covered medical, vision, and dental for employees, with generous dependent coverage; telehealth; HSAs/FSAs; life, AD&D, and disability insurance.
Financial Future: Competitive 401(k) with significant company match.
Added Protection: Optional supplemental insurance (Life, Accident, Critical Illness, Hospital Indemnity).
Work-Life Support: Pet Insurance, Legal & ID Protection, Employee Assistance Program (EAP).
Professional Development: Annual education allowance.
Home Office & Wellness Support: Annual wellness reimbursement, monthly phone/internet reimbursement, one-time equipment allowance.
