Abnormal Security

Senior Software Engineer - Full-Stack - Platform & Infrastructure (Site Reliability)

Reposted 17 Days Ago

Remote

Hiring Remotely in USA

176K-207K Annually

Senior level

Lead initiatives for reliability and operational excellence, mentor engineers, and define goals to improve system reliability and productivity.

The summary above was generated by AI

About the Role

Abnormal Security is looking for a Senior Software Engineer - Site Reliability to join our Infrastructure team. In this role, you will be responsible for the reliability, scalability, and operational excellence of our systems and services. You will lead initiatives to improve the operational maturity of both SRE-managed services and critical product systems, driving change across the organization in support of stable operations.

As a senior member of the team, you will independently define and execute quarterly goals, create forward-looking roadmaps, and own cross-functional projects aligned with company-level objectives. You will serve as a key advocate for reliability, providing technical leadership, deep analysis, and mentorship while embedding with product teams as needed to improve service ownership and incident response practices.

The ideal candidate:

Has strong technical depth in distributed systems and operational excellence
Possesses a product-focused mindset with the ability to translate business needs into reliability goals
Is a strong communicator and mentor, able to influence both within the SRE team and across engineering
Has demonstrated experience leading broad technical initiatives across teams and systems

What You Will Do

Own the operational maturity of services in the SRE software stack, driving architectural and tooling improvements
Proactively partner with product teams to embed SRE best practices and support services with operational challenges
Independently define and drive quarterly goals for the SRE team with measurable impact on system reliability and developer productivity
Design and maintain systems that promote observability, automated recovery, scalability, and resilience
Lead incident reviews and root cause analyses; ensure follow-up actions are implemented and shared across teams
Collaborate with engineering leadership to shape the team roadmap and contribute to company-wide reliability goals
Mentor other engineers and drive adoption of SRE principles throughout the engineering organization

Must Have

8+ years of experience in infrastructure, DevOps, or Site Reliability Engineering roles
Deep knowledge of production-grade distributed systems and cloud-native architectures
Demonstrated experience managing service availability, latency, and incident response in production environments
Strong programming skills in Python, Go, or similar languages
Experience with Kubernetes, Terraform, and observability tools (e.g., Prometheus, Grafana, Datadog)
Proven ability to lead complex, multi-team initiatives and influence system design for reliability

Nice To Have

Prior experience embedding with product engineering teams to support operational goals
Familiarity with AWS and multi-cloud environments (e.g., Azure, GCP)
Experience in regulated environments or with FedRAMP-compliant systems
Contributions to open-source SRE tooling or community knowledge sharing

At Abnormal AI, certain roles are eligible for a bonus, restricted stock units (RSUs), and benefits. Individual compensation packages are based on factors unique to each candidate, including their skills, experience, qualifications and other job-related reasons.

Base pay range:

$176,000 to $207,050 USD

San Francisco/New York Base pay range:

$195,000 to $230,000 USD

Abnormal AI is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, protected veteran status or other characteristics protected by law.

Top Skills

AWS

Azure

Datadog

GCP

Grafana

Kubernetes

Prometheus

Python

Terraform
