Cribl

Principal Site Reliability Engineer

Reposted 10 Days Ago

Remote

Hiring Remotely in United States

240K-400K

Expert/Leader

The Principal Site Reliability Engineer will enhance observability systems, monitor production health, drive improvements, and automate processes in a collaborative, remote environment.

The summary above was generated by AI

Cribl does differently.

What does that mean? It means we are a serious company that doesn’t take itself too seriously, and we’re looking for people who love to get stuff done and laugh a bit along the way. We’re growing rapidly and looking for collaborative, curious, and motivated team members who are passionate about putting customers first. As a remote-first company, we believe in empowering our employees to do their best work, wherever they are.

As the data engine for IT and Security, Cribl is trusted by many of the biggest names in the most demanding industries to solve their most pressing data needs. Ready to do the best work of your career? Join the herd and unlock your opportunity.

Why You’ll Love This Role

Cribl Inc. is seeking a Principal Site Reliability Engineer to join our mission to unlock the value of all observability data. Cribl gives users a new level of observability, intelligence, and control over their real-time data. You will join a team of software engineers and SREs who are committed to shipping only high-quality software and enjoying all the goat gifs the internet has to offer. This role is remote, and you will be part of the engineering organization, where you will contribute to our efforts to envision, create, deploy, test, and ship Cribl products.

We are looking for Site Reliability Engineers at all levels at Cribl who enjoy being in the thick of it. Our SREs are involved from conception through design, development, and testing, all the way to production.

If reliability is your passion, if you have always had strong opinions on how to make things better and if you have the desire to build consensus around ideas, let's talk!

As An Active Member Of Our Team, You Will...

Chart the future of Cribl’s observability and reliability systems and practices
Conceptualize and direct the evolution of our reliability metrics, programs, and processes based on the state of the art and industry best practices
Engage with Product and Engineering teams to improve service delivery and reliability across the entire software lifecycle
Measure and monitor all production systems with an eye towards availability, latency and overall system health
Uncover risks and seek out the sources of errors and instability in our production systems
Advocate for engineering-wide improvements in reliability and observability, and promote antifragility
Identify and drive down toil with creative innovation and automation
Participate in on-call

If You Got It - We Want It

Extensive experience with enterprise-scale continuous delivery environments
10+ years of experience in a DevOps or SRE role
Development with JavaScript/Node.js/TypeScript in a Linux/Mac environment
Experience with IaC tools like Terraform (preferred) or similar
Experience with sustainable incident response in a blameless environment
Knowledge of cloud platforms (AWS preferred) and container and orchestration technologies
Experience with APM and observability tools such as New Relic, Splunk, CloudWatch, Prometheus, Grafana/Kibana, and Sentry
Deep understanding of SRE practices, such as SLOs, Error Budgets, PRRs, Problem Management
Comfortable with a high level of autonomy and working with a distributed team
Knowledge of cloud and application security
Strong knowledge of cloud design patterns for scale, data management, resiliency, etc.
Passion for software quality and craft

Salary Range ($240,000 - $400,000)

The salary for this role is dependent on geographic location. The salary offered within the range described will be based on the individual candidate’s job-related knowledge, skills, and experience. In addition to a competitive salary, Cribl also offers a generous benefits package which includes health, dental, vision, short-term disability, and life insurance, paid holidays and paid time off, a fertility treatment benefit, 401(k), equity, and eligibility for a discretionary company-wide bonus.

Bring Your Whole Self

Diversity drives innovation, enables better decisions to support our customers, and inspires change for the better. We’re building a culture where differences are valued and welcomed, and we work together to bring out the best in each other. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other applicable legally protected characteristics in the location in which the candidate is applying.

Interested in joining the Cribl herd? Learn more about the smartest, funniest, most passionate goats you’ll ever meet at cribl.io/about-us.

Top Skills

AWS, CloudWatch, Grafana, JavaScript, Kibana, New Relic, Node.js, Prometheus, Sentry, Splunk, Terraform, TypeScript
