Daxko

Site Reliability Engineering Manager

Posted Yesterday

Remote

Hiring Remotely in Birmingham, AL

163K-211K Annually

Mid level

As a Site Reliability Engineering Manager, you'll manage production assets, lead team efforts on performance, uptime, compliance, and customer service, while overseeing budget responsibilities and team training.

The summary above was generated by AI

Company Description

Daxko powers wellness to improve lives. Every day our team members focus their passion and expertise on helping health & wellness facilities operate efficiently and engage their members.

Whether it's a neighborhood yoga studio, a national franchise with locations in every city, a YMCA or JCC, or any type of organization in between, we build solutions that make every aspect of running and being a member of a health and wellness organization easier and more delightful.

Job Description

As a Site Reliability Engineering Manager, you will manage all production assets for each product. Your responsibilities include patching, upgrading, and deploying new servers; organizing the team's workload; and supporting engineering efforts, compliance, uptime, and performance monitoring. You'll be responsible for prioritizing, organizing, and leading your team's execution of all work. You'll also assess operational capabilities and performance to ensure the on-time delivery of quality products and services to all customers, both internal and external.

As a leader, you will:

Set and help the team understand performance targets and goals
Evaluate and provide real-time feedback on performance
Train and/or ensure that the team is properly trained for their specific roles
Coordinate on-call rotation
Coordinate training for staff
Assist in resolving emergencies, such as infrastructure or software outages
Manage headcount and make staffing decisions related to new hires and terminations

In your day-to-day, you will:

Oversee progress in achieving operational/production goals and objectives, especially with respect to quality, cost, and customer service.
Take responsibility for uptime, data accuracy, and integrity.
Interact with Engineering Leads to ensure alignment between teams.
Maintain business continuity for all production assets.
Ensure proper planning and prioritization using agile practices.
Ensure operations are in full compliance with all company and regulatory requirements.
Be a technical escalation point for your team.
Provide weekly reports on system availability, response, and capacity.
Manage on-call rotation among team members.
Have budget responsibilities, including ensuring fiscal responsibility for hosting and software licensing.

Qualifications

Bachelor's degree (technical discipline preferred) or equivalent experience
Three (3) to five (5) years of experience managing globally distributed team members
Three (3) to five (5) years of experience in a site reliability engineering capacity
Solid foundation in the following technologies:
- Linux
- Web Servers (NGINX / PHP / Traefik / F5)
- Virtualization Technologies (VMware)
- Cloud Platforms (AWS, Azure)
- Containerization Systems (Docker, Kubernetes, Dynos)
- Caching and Messaging Technologies (Redis / RabbitMQ)
Strong security mindset and experience implementing security controls
Excellent organizational skills and attention to detail.
Excellent time management skills with a proven ability to meet deadlines.
Strong analytical and problem-solving skills.
Strong supervisory and leadership skills.
Ability to prioritize tasks and to delegate them when appropriate.

Bonus points for:

Strong observability experience with monitoring technologies (OpenTelemetry, Instana, LogicMonitor, PagerDuty, OpsGenie), including creating custom checks and managing alert profiles and escalation policies
Experience with tooling (GitLab CI, Jenkins, Chef, Terraform, Elasticsearch, Kubernetes, Rancher)
Scripting experience with the following languages: Ruby, Python, Bash
Experience with SOC, PCI, and GDPR standards and regulations
Experience working tickets and managing priorities within issue tracking systems (Atlassian Suite, etc.)
Experience developing or supporting Java, PHP, or Node.js applications
Experience automating repetitive tasks

Additional Information

The salary range for this role is $163,000 - $211,000 per year. Where you fall within the compensation range is based on how you demonstrate the attributes and competencies required for the role. We mostly reserve the upper half of our compensation bands for internal growth. In addition to base salary, we offer a comprehensive benefits package, performance-based incentives, and opportunities for growth.

Daxko is dedicated to pursuing and hiring a diverse workforce. We are committed to diversity in the broadest sense, including thought and perspective, age, ability, nationality, ethnicity, orientation, and gender. The skills, perspectives, ideas, and experiences of all of our team members contribute to the vitality and success of our purpose and values.

We truly care for our team members, and this is reflected in our offices, benefits, and great perks. These perks are only for our full-time team members. Some of our favorites include:

🏝 Flexible paid time off
⚕️ Affordable health, dental, and vision insurance options
💪 Monthly fitness reimbursement
🤑 401(k) matching
🍼 New-Parent Paid Leave
👖 Casual work environments
🏡 Remote work

All your information will be kept confidential according to EEO guidelines.

Top Skills

AWS, Azure, Bash, Chef, Docker, Dynos, Elasticsearch, GitLab CI, Instana, Java, Jenkins, Kubernetes, Linux, LogicMonitor, NGINX, Node.js, OpenTelemetry, Opsgenie, PagerDuty, PHP, Python, RabbitMQ, Redis, Ruby, Terraform, Traefik, VMware
