The Site Reliability Engineer will ensure system reliability and performance, design scalable architectures, improve CI/CD pipelines, maintain infrastructure, and lead incident response efforts.
About PushPress
PushPress is building the Intelligent Industry Ledger for boutique fitness.
We’re transforming how boutique gyms operate — and how the entire $100B fitness industry connects, transacts, and grows. Trusted by 5,000+ gyms and 500,000+ members, PushPress processes over $500M annually and is backed by Altos Ventures and Mucker Capital.
We're evolving from a traditional business system of record into an AI-powered Industry Ledger — an intelligent infrastructure layer that brings order to a highly fragmented boutique fitness industry. By unifying disconnected operators, workflows, and data into a single platform, we’re enabling faster decisions, new business models, cross-gym collaboration, and network effects that increase the value of every studio in our client base.
We’re a global team of builders, operators, and fitness fanatics on a mission to level the playing field for fitness entrepreneurs. If you're ready to help reshape an industry — let’s talk.
About the Role
We're seeking a Site Reliability Engineer to own the reliability and performance of systems that power 5,000+ gyms daily, process a billion dollars in payments annually, and handle 5 million class check-ins every month. This is a critical role in which you'll be responsible for infrastructure that directly impacts thousands of businesses and millions of their members. You'll work with modern technologies including AWS, Kubernetes, ArgoCD, GitHub Actions, and Terraform to build and maintain highly available, scalable systems. You'll join during a high-growth phase and have significant influence over our reliability practices, infrastructure architecture, and operational excellence standards. Our ideal candidate has a strong ownership mindset, works well across functions, adapts quickly, and thinks beyond the conventional boundaries of traditional SRE work.
What You'll Do
- Ensure the reliability, performance, and availability of PushPress's production systems.
- Design and implement scalable, fault-tolerant, and efficient architectures on AWS using Kubernetes and Terraform.
- Own and continuously improve our CI/CD pipeline using GitHub Actions and ArgoCD with the goal of fast, reliable, and secure deployments.
- Maintain and optimize our developer and test infrastructure to enable efficient software development and testing processes.
- Develop comprehensive monitoring, logging, and alerting systems to proactively identify and resolve issues before they impact our customers.
- Lead incident response efforts and conduct thorough post-mortems to prevent future occurrences.
- Partner with engineering teams to build reliability into new features and services from day one.
- Continuously optimize our infrastructure costs while maintaining high performance and reliability at scale.
What You Need
- A minimum of 3 years of experience in Site Reliability Engineering, designing and managing large-scale, distributed systems on AWS.
- Proficiency in one or more programming languages, such as Python, Go, or JavaScript.
- Deep knowledge of Kubernetes, Terraform, and GitOps practices with ArgoCD.
- Experience building and maintaining CI/CD pipelines using GitHub Actions or similar tools.
- Strong infrastructure as code experience with Terraform in production environments.
- Experience with modern observability tools like Datadog, Prometheus, or similar monitoring platforms.
- Familiarity with containerization technologies like Docker and container orchestration at scale.
- Understanding of high-volume payment processing systems and their reliability requirements is a plus.
- Excellent problem-solving, communication, and collaboration skills with the ability to work effectively across teams.
PushPress is dedicated to fostering an inclusive and dynamic workplace. We’re all about leveling up, and that means we don’t tolerate any form of discrimination or harassment. We’re committed to providing equal opportunities, regardless of race, color, religion, sex, sexual orientation, gender identity or expression, pregnancy, age, national origin, disability, genetic information, veteran status, or any other legally protected characteristic.
At PushPress, we’re dedicated to helping both our technology and our team reach peak performance. Whether it’s with your proactive approach, eye for detail, or drive to make a meaningful impact, we’d love to hear from you. At PushPress, we’re all about pushing boundaries and achieving new personal bests—come join us and be part of our fitness-tech journey!
Top Skills
ArgoCD
AWS
Datadog
Docker
GitHub Actions
Go
JavaScript
Kubernetes
Prometheus
Python
Terraform