Cloudbeds

Site Reliability Engineer

Reposted 5 Days Ago

Remote

28 Locations

Junior

As a Site Reliability Engineer, you will ensure system reliability, collaborate on scalable solutions, manage Kubernetes clusters, and improve monitoring systems in a remote team.

The summary above was generated by AI

What Makes Us Unique

At Cloudbeds, we're not just building software, we’re transforming hospitality. Our intelligently designed platform powers properties across 150 countries, processing billions in bookings annually. From independent properties to hotel groups, we help hoteliers transform operations and uplevel their commercial strategy through a unified platform that integrates with hundreds of partners. And we do it with a completely remote team. Imagine working alongside global innovators to build AI-powered solutions that solve hoteliers' biggest challenges. Since our founding in 2012, we've become the World's Best Hotel PMS Solutions Provider and landed on Deloitte's Technology Fast 500 again in 2024 – but we're just getting started.

We are seeking a talented and motivated Site Reliability Engineer (SRE) to join our growing team. As an SRE at Cloudbeds, you will be responsible for ensuring the reliability, availability, and performance of our systems and applications. You will collaborate with cross-functional teams to design and implement scalable and resilient solutions, leveraging automation and best practices in site reliability engineering. You will have endless opportunities for architecture design and implementation within AWS cloud infrastructure, in a largely bottom-up team culture that values healthy debate. As an SRE, you will help us provide the highest-quality full-stack management solution for hotels, B&Bs, hostels, and vacation rentals all over the world.

Location: Serbia

What You Will Do:

Design and implement reliable, scalable, and efficient systems to meet the needs of the organization.
Maintain and support highly loaded Kubernetes (EKS) clusters and infrastructure-related components.
Develop and continuously improve product monitoring and logging systems based on the Prometheus, Datadog, and Loki stacks.
Respond to and resolve incidents, ensuring minimal impact on services.
Collaborate with development teams to establish Service Level Objectives (SLOs) and ensure systems meet or exceed reliability targets. Optimize system performance and troubleshoot issues as they arise.
Support development teams by sharing SRE best practices and expertise, and assist with environment and application configuration from a resiliency perspective.
Collaborate with security teams to implement and maintain security best practices.
Support the release process via CI/CD pipelines.
Automate the platform with infrastructure-as-code and configuration management.
Maintain clear and comprehensive documentation for systems, processes, and procedures. Share knowledge with team members to enhance overall understanding.
Participate in the on-call rotation for production environment outages.

You’ll Succeed With:

2+ years of experience as a DevOps Engineer or SRE, working with AWS.
Exceptional skills in Linux system administration.
2+ years of strong experience with Kubernetes, Docker, and Helm charts.
Experience implementing and scaling Amazon Elastic Kubernetes Service (EKS) platforms.
Strong experience with application containerization methodologies and delivery.
Strong experience with monitoring, logging, and alerting technologies (any of ELK, Datadog, Loki, AWS CloudWatch).
Experience with infrastructure-as-code methodologies such as Terraform.
Experience designing, building, and supporting CI/CD pipelines (GitHub Actions, Bitbucket Pipelines, and ArgoCD).
Experience with web application servers (NGINX, Ingress controllers, traffic load balancing), databases (MySQL, PostgreSQL, Aurora), cache technologies (any of Redis, Memcached), and queue technologies (SQS).
Ability to write Bash/Python scripts.
Good networking skills.
Good written and verbal communication in English.
Good team player qualities.
Ability to work remotely and manage your own time in a global team.
Bachelor’s degree in Computer Science or related field, or equivalent experience.

Nice to Haves:

Advanced experience with Database Administration (Aurora, MySQL, PostgreSQL).
Experience working in a Scrum team using Jira and providing L3/L4 support.
Experience working in a PCI-compliant environment.
Experience working with Kong API Gateway.

What to Expect - Your Journey with Us

Behind Cloudbeds' revolutionary technology is a team redefining what's possible in hospitality. We're 650+ employees across 40+ countries, bringing together elite engineers, AI architects, world-class designers, and hospitality veterans to solve challenges others haven't dared to tackle. Our diverse team speaks 30+ languages, but we all share one language: a passion for innovation and travel. From pioneering breakthroughs in machine learning to revolutionizing how hotels operate, we're not just watching the future of hospitality unfold – we're coding it, designing it, writing it, and shipping it. If you're ready to work alongside some of the brightest minds in tech who are obsessed with using AI to transform a trillion-dollar industry, this is your chance to be part of something extraordinary.

Learn more online at cloudbeds.com

Company Awards to Check Out!

Best All-In-One Hotel Management System | HotelTechAwards (2025)
Overall 10 Best Places to Work | HotelTechAwards (2025)
Most Loved Workplace® Certified (2024)
Top 10 People’s Choice (2024)
Deloitte Technology Fast 500 (2024)

Discover our Benefits:

Remote First, Remote Always
PTO in accordance with local labor requirements
Two corporate apartments available to team members free of charge (San Diego & São Paulo)
Full Paid Parental Leave
Home office stipend based on country of residency
Professional development courses in Cloudbeds University
Access to professional therapy and coaching
Access to professional development, including manager training, upskilling, and knowledge transfer

Everyone is Welcome - A Culture of Inclusion

Cloudbeds is proud to be an Equal Opportunity Employer that celebrates the diversity in our global team! We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.

Cloudbeds is committed to the full inclusion of all qualified individuals. As part of this commitment, Cloudbeds will ensure that persons with disabilities are provided reasonable accommodations in the hiring process. We encourage deaf, hard of hearing, deaf-blind, and deaf-disabled individuals to apply. If reasonable accommodation is needed to participate in the job application or interview process or to perform essential job functions, please contact our HR team by phone at 858-201-7832 or via email at [email protected]. Cloudbeds will provide an American Sign Language (ASL) interpreter where needed as a reasonable accommodation for the hiring processes.

To all Staffing and Recruiting Agencies: Our Careers Site is only for individuals seeking a job at Cloudbeds. Staffing, recruiting agencies, and individuals being represented by an agency are not authorized to use this site or to submit applications, and any such submissions will be considered unsolicited. Cloudbeds does not accept unsolicited resumes or applications from agencies. Please do not forward resumes to our jobs alias, Cloudbeds employees, or any other company location. Cloudbeds is not responsible for any fees related to unsolicited resumes/applications.

Top Skills

ArgoCD

Aurora

AWS

Bash

Bitbucket

CI/CD

Datadog

Docker

GitHub Actions

Helm

Kubernetes

Loki

Memcached

MySQL

NGINX

Postgres

Prometheus

Python

Redis

SQS

Terraform
