ONE (one.app)

Site Reliability Engineer

Reposted 3 Days Ago

Remote

Hiring Remotely in United States

170K-210K Annually

Expert/Leader

The Site Reliability Engineer will ensure service reliability and availability, support incidents, mentor engineers, and improve infrastructure and processes.

About OnePay

OnePay is a consumer financial services app with an exceedingly simple mission: to help people achieve financial progress.

Tens of millions of Americans today are unbanked or underbanked, meaning they don’t have enough money in savings to cover a minor emergency. They pay too much in fees, don’t have access to credit at affordable rates, and have little ability to grow their wealth. OnePay’s vision is to create a single app for consumers to save, spend, borrow, and grow their money, bringing our mission to life with simple and accessible banking, credit, and payments products that deliver a best-in-class experience to millions of customers. Our products include:

Checking and high-yield savings accounts
Domestic and international peer-to-peer payments
Credit Builder and credit score monitoring
Digital wallet / contactless payment solutions
Buy-now-pay-later installment loans at Walmart

Why do we have a right to win? We have the backing of Walmart (the Fortune 1 company) and Ribbit Capital (a preeminent fintech investor), are deeply embedded with the distribution of the world’s largest omnichannel retailer, and have an industry-leading multi-product value proposition, all in addition to having some of the best people and talent in the industry.

There’s never been a better time to build a category-defining business and there has rarely been a team better positioned for the opportunity. Join us!

The Role

As a Site Reliability Engineer (SRE) at OnePay, your mandate is to ensure the availability and reliability of our most critical services, and ensure that they meet the requirements of our customers. Our SRE team at OnePay is growing, so you’ll be a crucial early member to help establish the team, processes, and best practices. Success in this role looks like collaborating with other teams to build and run sustainable production systems that can evolve and adapt to the changes in our fast-paced environment.

This role is responsible for:

Working proactively with engineering teams to help them set SLOs and implement best practices for logging and telemetry collection.
Designing, implementing, and maintaining the tools and systems that support service reliability, monitoring, and alerting.
Participating in a 12x7 on-call rotation supporting the health of our services.
Driving the incident management process and supporting a blameless post-mortem culture.
Participating in application design consulting and capacity planning.
Defining and formalizing SRE practices and helping guide the overall reliability engineering direction.
Providing mentorship both formally and informally to engineers at OnePay.
Continuously optimizing systems and workflows by improving architecture, infrastructure, automation, CI/CD, and observability.
Combining software and systems knowledge to engineer high-volume distributed systems in a reliable, scalable, and fault-tolerant manner.

You Bring

10+ years of relevant industry experience with a focus on distributed cloud-native systems design, observability, operation, maintenance, and troubleshooting.
5+ years of operational experience with an observability platform like Datadog, Splunk, Prometheus/Grafana, or AppDynamics.
Fluency in one or more programming languages (e.g. Python, TypeScript, Go).
A strong conviction in software development best practices, including version control, automated testing, and continuous integration and delivery.
You're self-motivated, inquisitive, and always looking to learn new technologies.
You’re a great teammate who communicates clearly and transparently.
The Triple H Factor: Humble, Hungry and Honest.
An act-like-an-owner mentality. We have a bias toward taking action.

What We Offer

Competitive base salary, stock options, and health benefits from Day 1
401(k) plan with company match
Remote-friendly (US), flexible time off (FTO), and opportunities for growth
A high-growth, mission-driven, inclusive culture where your work has real impact

Standard Interview Process

Initial Interview with Talent Partner
Technical or Hiring Manager Interview
Team Interview
Executive Interview
Offer!

Equal Employment Opportunity

To build technology and products that are used and loved by people and solve real-world problems, we need to build a team with many different perspectives and experiences. We are an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. We encourage candidates from all backgrounds to apply. Applicants in need of special assistance or accommodation during the interview process or in accessing our website may contact us at [email protected].

Top Skills

AppDynamics
Datadog
Grafana
Prometheus
Python
Splunk
TypeScript
