Socure

Staff Software Engineer-SRE

Reposted 20 Days Ago

Remote

Hiring Remotely in USA

180K-215K Annually

Senior level

Lead the architecture and development of Entity Resolution APIs; design batch and streaming data pipelines; collaborate on integrated machine learning models and high-performance distributed systems.

The summary above was generated by AI

Why Socure?

At Socure, we’re on a mission—to verify 100% of good identities in real time and eliminate identity fraud from the internet.

Using predictive analytics and advanced machine learning trained on billions of signals to power RiskOS™, Socure has created the most accurate identity verification and fraud prevention platform in the world. Trusted by thousands of leading organizations—from top banks and fintechs to government agencies—we solve real, high-impact problems at scale. Come join us!

Job Overview

We are looking for a Site Reliability Engineer (SRE) to support our Identity Graph initiatives.

Identity Graph Intelligence at Socure builds and maintains the core layer that connects and resolves identities across billions of data points. Our work powers Socure’s industry-leading identity verification and fraud prevention solutions by creating a unified, accurate, and real-time view of individuals. The team focuses on scalability, reliability, and advanced data engineering to support mission-critical applications for our customers.

At Socure, you’ll join a high-performing engineering team dedicated to driving the reliability, scalability, and performance of our systems. You will collaborate cross-functionally with software engineers, technical support, and security teams to build and maintain robust, automated, and resilient infrastructure powering our critical applications. SREs play an essential role in architectural decision-making, incident response, and promoting a culture of continuous improvement, automation, and operational excellence.

What you'll do:

Design, build, and maintain scalable infrastructure to support high availability and performance.
Develop tools and automation to eliminate manual operations and increase system reliability.
Monitor production systems, respond to incidents, conduct root cause analyses, and lead post-mortem reviews.
Collaborate with development and platform teams to implement best practices for deployment, observability, and reliability.
Drive incident management and participate in an on-call rotation to ensure 24/7 availability for mission-critical platforms.
Establish and improve SLAs, SLOs, and SLIs to track and enhance system reliability and performance.
Champion a culture of continuous improvement, resilience, and automation across engineering and operations.
Build and maintain CI/CD pipelines and infrastructure-as-code to streamline deployments and accelerate development cycles.
Develop monitoring dashboards and alerts using tools such as Prometheus, Grafana, Datadog, or Splunk.
Support security and compliance efforts by implementing infrastructure hardening and best practices aligned with frameworks (SOC 2, ISO 27001).
Mentor junior engineers and act as a technical resource for improving reliability within cross-functional teams.

What you bring:

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
8+ years of experience in software development, site reliability engineering, DevOps, or infrastructure engineering—preferably in high-scale, high-availability environments.
Proficiency in programming/scripting languages (Python, Java, Go, Bash, or Terraform).
Proven hands-on experience building tools and automation for infrastructure and operations.
Deep understanding of microservices architecture, RESTful APIs, and cloud platforms such as AWS.
Expertise in Kubernetes, Docker, and container orchestration in production environments.
Experience with observability tools (Prometheus, Grafana, Datadog, ELK stack).
Strong knowledge of distributed systems, performance optimization, and operational excellence.
Experience with SQL and NoSQL databases, caching layers, and troubleshooting complex production issues.
Familiarity with CI/CD pipelines (Jenkins, GitHub Actions, ArgoCD, CircleCI) and infrastructure-as-code (Terraform).
Solid understanding of networking, security, and compliance frameworks.
Excellent communication and collaboration skills; able to work effectively across engineering, operations, and support teams.
Strong problem-solving skills, detail orientation, and a proactive mindset for continual reliability improvement.

Socure is an equal opportunity employer and values diversity of all kinds at our company. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Top Skills

AWS

Elasticsearch

Java

Kafka

Kubernetes

Python

Scala

Spark

SQS

Vespa
