Great Question

Site Reliability Engineer

Posted 12 Days Ago

Remote

Hiring Remotely in United States

Senior level

As a Site Reliability Engineer, you will own platform health, improve observability, manage cloud costs, and enhance developer experience in a dynamic startup environment.

The summary above was generated by AI

🚀 About Us

We’re a product-focused startup with a tight-knit team of 14 engineers building tools that help teams make better decisions through great research. We're pragmatic, fast-moving, and obsessed with product quality.

As we grow, our infrastructure needs to grow with us. That means better observability, stronger systems, faster deploys—and smarter decisions about cloud spend. We’re hiring someone who can take ownership of this and lay the foundation for long-term platform health.

🎯 What You’ll Do

You’ll be the first dedicated DevOps/Infra hire with end-to-end ownership of platform health, reliability, and scalability. You’ll partner directly with our engineering team to improve our systems, reduce toil, and make infra a product in its own right.

Your scope will include:

Observability, Reliability, Availability
- Define and maintain service SLOs, dashboards, and alerts
- Improve incident detection and response
- Lead incident postmortems, share learnings, and manage follow-up actions

Infrastructure
- Maintain and improve Terraform-managed infrastructure
- Lead our migration of staging infrastructure to AWS
- Optimize our use of tools like Datadog, Sentry, and others

Capacity Planning & Performance Optimization
- Identify current and potential future bottlenecks
- Collaborate with engineers to fine-tune application and infrastructure performance
- Implement automated and semi-automated scaling strategies to handle growth and evolving workloads

Developer Experience & CI/CD
- Increase pipeline reliability and performance
- Design & implement load testing strategies as we scale

Security & Compliance
- Work with the CTO to own and implement SOC 2 compliance protocols and requirements
- Help foster a security-first culture by promoting best practices and secure-by-default tooling
- Implement guardrails and additional security tools as needed

Cloud Cost Management
- Monitor and optimize cloud spend
- Build visibility and tooling to help teams make cost-aware decisions

💡 You Might Be a Great Fit If You...

- Have 4–8+ years of experience in DevOps, SRE, or Infrastructure roles
- Have hands-on AWS experience (EC2, RDS, VPCs, etc.)
- Are confident with Terraform, GitHub Actions, Docker, and PostgreSQL
- Have a track record of improving observability and reducing incident response times
- Have worked in high-autonomy, high-ownership environments
- Are cost-conscious and can identify waste in infra and cloud spend
- Love building high-leverage tools for engineers and treating infra as a product

📈 Growth Path

This is a foundational hire. Today, the role is fully IC, but there’s clear runway to grow into:

- Platform leadership (tech lead or manager)
- Head of Infra/SRE if we expand the team
- Principal engineer focused on scale, reliability, and platform strategy

You’ll have support and visibility from leadership, and the freedom to chart your path as the company grows.

⚙️ Our Stack

Cloud: AWS
Infra-as-code: Terraform
CI/CD: GitHub Actions
Containers: Docker, lightweight Kubernetes
Monitoring: Datadog, Sentry
Database: PostgreSQL, Redis
App: Rails, React, Sidekiq

✨ Why This Role?

Impact: You’ll shape the systems and culture of how we build and run software.
Trust: High autonomy and low process—make smart decisions, move fast.
People: No egos, just a team that values thoughtfulness, speed, and care.
Growth: Opportunity to grow with the company in whichever direction excites you.

Top Skills

- AWS
- Docker
- GitHub Actions
- Postgres
- Ruby on Rails
- React
- Redis
- Sidekiq
- Terraform
