Site Reliability Engineer

Iodine Software
📍 USA 💼 Full-time 📅 Posted 4 days ago

Job Description

Ready to Engineer the Future of Healthcare?

Join us in transforming patient care through innovative clinical AI. At Iodine, your work as a Site Reliability Engineer (SRE) won’t just build infrastructure – it will directly impact healthcare outcomes, enabling smarter processes and empowering clinicians.

Who We Are: Revolutionizing Healthcare with AI

Recognized as a premier workplace in Austin, Iodine is a collaborative, innovation-driven enterprise AI company. We’re challenging conventional approaches to healthcare by automating complex clinical workflows, generating actionable insights, and empowering intelligent care delivery. Our groundbreaking Cognitive ML engine, fueled by one of the industry’s largest clinical datasets, provides real-time, predictive insights that dramatically enhance patient care management.

The Opportunity: Site Reliability Engineer – AWS

We are seeking a highly skilled and passionate Site Reliability Engineer with deep expertise in AWS Cloud. In this critical role, you will be instrumental in designing, building, and optimizing our cutting-edge cloud platform and infrastructure. You’ll leverage your extensive AWS knowledge across compute, storage, databases, networking, and security, with a keen focus on cost optimization and operational excellence. You’ll drive the evolution of our cloud strategy, design massively scalable solutions, and ensure the reliability, security, and efficiency of our platform as we grow.

Shape Our Cloud Future

Implement our cloud strategy and roadmap, focusing on driving scalability, reliability, security, and cost efficiency. Lead cloud adoption initiatives and champion governance, architectural best practices, and modernization efforts.

Build & Automate for Scale

Develop robust Infrastructure as Code (IaC) using tools like Terraform or AWS CDK to achieve fully automated provisioning and deployment. Own and enhance infrastructure CI/CD pipelines utilizing GitLab, Ansible (AWX), Argo CD, and Helm. Design and implement self-healing, fault-tolerant architectures.

Ensure Operational Excellence

Optimize infrastructure monitoring and observability using tools such as Prometheus, Grafana, Loki, Tempo, Mimir, AWS CloudWatch, AWS CloudTrail, and New Relic. Conduct regular system maintenance, including OS patching and upgrades for core services like RDS and Kubernetes (EKS) clusters. Troubleshoot and resolve infrastructure and application issues, collaborating across teams and supporting customer escalations. Lead incident management processes, define SLOs/SLIs, and conduct post-mortems for continuous improvement.

Champion Security & Reliability

Embed stringent cloud security best practices across all solutions, focusing on IAM policies, VPC security, encryption, and compliance standards like SOC 2 and HIPAA. Implement least privilege access, network segmentation, and automated security controls. Collaborate closely with InfoSec on threat detection, logging, and security monitoring using AWS GuardDuty, Security Hub, and CloudTrail. Design and execute multi-region Disaster Recovery (DR) and backup strategies.

Drive Cost Efficiency (FinOps)

Proactively monitor and optimize AWS infrastructure costs using tools like AWS Cost Explorer and Trusted Advisor, and take advantage of Savings Plans and Reserved Instances. Foster a FinOps culture, promoting cost-aware design and deployment. Implement autoscaling, right-sizing, and storage lifecycle policies to reduce expenses.

Collaborate and Architect

Participate in key architectural discussions with product engineering teams to ensure new and existing services adhere to best practices for scalability, cost-efficiency, and operational excellence. Collaborate with software developers to optimize application performance and cloud-native designs. Design and build highly available, scalable, and fault-tolerant AWS architectures utilizing a wide array of services (EC2, S3, RDS, DocumentDB, Lambda, EKS, Secrets Manager, SSM, API Gateway, CloudFront) and related technologies (HashiCorp Terraform, Vault, and Consul, as well as Ansible).

What You’ll Bring: Skills & Experience

  • 5+ years of hands-on experience in SRE or DevOps roles, specifically within the AWS ecosystem.
  • Deep practical expertise with core AWS services including EC2, S3, Lambda, EKS, VPC, IAM, Secrets Manager, and SSM, alongside technologies like HashiCorp Vault and Consul.
  • Proven ability to implement significant cost optimization techniques in AWS (e.g., autoscaling, right-sizing, RIs/Savings Plans).
  • Strong command of Infrastructure as Code (IaC) using Terraform, CloudFormation, or AWS CDK.
  • Proficiency in Linux administration and scripting (Python, Bash).
  • Extensive experience with containerization and orchestration, particularly Kubernetes (EKS) and Docker.
  • Solid understanding and practical application of cloud security concepts (IAM, security groups, WAF, CloudTrail) and compliance frameworks.
  • Experience with modern monitoring and observability tools (Prometheus, Grafana, CloudWatch, Loki, New Relic).
  • Familiarity with code review processes for infrastructure code quality.
  • Excellent collaboration, documentation, and communication skills.
  • Ability to travel to headquarters for mandatory meetings/onboarding.

Bonus Points (Nice to Have):

  • Relevant AWS Certifications (e.g., Professional level).
  • Experience managing multi-account AWS environments or AWS Control Tower.
  • Familiarity with service meshes (Istio, Linkerd) or API gateways.
  • Experience with network security appliances (FortiGate) or advanced AWS networking (Transit Gateway, Direct Connect).
  • Background in database administration (PostgreSQL, MySQL, DocumentDB, NoSQL).
  • Experience with resilience testing or chaos engineering.

Why This Role is Unique:

  • Influence and lead the AWS cloud strategy and architecture for a rapidly growing healthcare tech company.
  • Work on cutting-edge cloud technologies in a high-impact environment.
  • Drive innovation and best practices in SRE and cloud engineering.
  • Make a tangible difference by optimizing our cloud costs and operational efficiency.
  • Collaborate closely with talented engineering and product teams.

Why Join Iodine? Our Mission & Culture

Become part of a dedicated, passionate, and ambitious team with a proven track record. At Iodine, you’ll contribute directly to improving healthcare processes through technology, allowing hospitals to focus on what matters most: patient care. We offer a unique opportunity to join a close-knit team during an exciting growth phase. Discover more about our culture on Built In Austin and our website.

Comprehensive Benefits & Perks:

We invest in our employees’ well-being and professional growth:

  • Health & Wellness: Fully covered medical, vision, and dental for employees, with generous dependent coverage; telehealth; HSAs/FSAs; life, AD&D, and disability insurance.
  • Financial Future: Competitive 401(k) with significant company match.
  • Added Protection: Optional supplemental insurance (Life, Accident, Critical Illness, Hospital Indemnity).
  • Work-Life Support: Pet Insurance, Legal & ID Protection, Employee Assistance Program (EAP).
  • Professional Development: Annual education allowance.
  • Home Office & Wellness Support: Annual wellness reimbursement, monthly phone/internet reimbursement, one-time equipment allowance.
