Hewlett Packard Enterprise

SRE Tech Lead

Posted 16 Days Ago

3 Locations

148K-341K Annually

Senior level

Seeking a Senior SRE to ensure system reliability, scalability, and performance. Responsibilities include automation, incident response, and mentorship of junior staff.

This role has been designed as ‘Hybrid’ with an expectation that you will work on average 2 days per week from an HPE office.

Who We Are:

Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today’s complex world. Our culture thrives on finding new and better ways to accelerate what’s next. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.

Job Description:

About the Role

We are seeking a highly skilled Senior Site Reliability Engineer (SRE) to join our team and drive our technical agenda. You will play a key role in ensuring the reliability, scalability, and performance of our systems and those of our customers. As a Senior SRE, you will be responsible for influencing the design, implementation, and maintenance of robust infrastructure, automating operational tasks, and enhancing system observability. You will work closely with development, operations, and security teams to create resilient, high-performing systems that support business growth.

Key Responsibilities

System Reliability & Performance

Advocate for highly available, scalable, and resilient systems in cloud or hybrid environments. Work with development and support teams to improve implementations, delivering a better customer experience and lower operating costs.

Define and manage Service Level Objectives (SLOs), Service Level Agreements (SLAs), and Service Level Indicators (SLIs) to ensure system reliability.

Proactively identify performance bottlenecks and implement optimizations to improve system efficiency.

Automation & Infrastructure as Code (IaC)

Automate repetitive tasks and operational workflows, and drive automation across teams, to reduce toil and improve system efficiency.

Incident Response

Audit, verify, and improve incident response procedures, including runbooks and post-incident reviews.

Security & Compliance

Collaborate with security teams to ensure compliance with best practices in cloud security, access control, and vulnerability management.

Collaboration & Leadership

Mentor junior SREs and software engineers, fostering a culture of reliability and operational excellence.

Work closely with development teams to build resilient applications with best-in-class reliability and performance.

Advocate for SRE best practices across the organization, promoting a culture of shared responsibility for system reliability.

Qualifications & Skills

Required:

12+ years of experience in Site Reliability Engineering, DevOps, Infrastructure Engineering, Operations, or Software Engineering.

Experience with cloud platforms (AWS, Azure), hypervisors (VMware, KVM), and containers and container orchestration (Docker, Kubernetes).

Proficiency in programming/scripting languages such as Python, Go, or Bash.

Hands-on experience with monitoring & logging tools (Prometheus, Grafana, ELK stack, OpsRamp).

Solid understanding of networking, security best practices, and Linux systems administration.

Strong problem-solving skills and ability to troubleshoot complex distributed systems.

Excellent communication skills and ability to work in a collaborative, distributed and multi-cultural team environment.

Preferred:

Experience with distributed systems, microservices architectures, and chaos engineering.

Familiarity with machine learning-based anomaly detection for observability.

Contributions to open-source projects or active participation in the SRE/DevOps community.

Additional Skills:

Cloud Architectures, Cross Domain Knowledge, Design Thinking, Development Fundamentals, DevOps, Distributed Computing, Microservices Fluency, Full Stack Development, Release Management, Security-First Mindset, User Experience (UX)

What We Can Offer You:

Health & Wellbeing

We strive to provide our team members and their loved ones with a comprehensive suite of benefits that supports their physical, financial and emotional wellbeing.

Personal & Professional Development

We also invest in your career because the better you are, the better we all are. We have specific programs tailored to helping you reach any career goals you have, whether you want to become a knowledge expert in your field or apply your skills to another division.

Unconditional Inclusion

We are unconditionally inclusive in the way we work and celebrate individual uniqueness. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good.

Let's Stay Connected:

Follow @HPECareers on Instagram to see the latest on people, culture and tech at HPE.

#unitedstates #hybridcloud

Job:

Engineering

Job Level:

TCP_05

States with Pay Range Requirement

The expected salary/wage range for a U.S.-based hire filling this position is provided below. Actual offer may vary from this range based upon geographic location, work experience, education/training, and/or skill level. If this is a sales role, then the listed salary range reflects combined base salary and target-level sales compensation pay. If this is a non-sales role, then the listed salary range reflects base salary only. Variable incentives may also be offered. Information about employee benefits offered can be found at https://myhperewards.com/main/new-hire-enrollment.html.

USD Annual Salary: $148,000.00 - $340,500.00

HPE is an Equal Employment Opportunity/Veterans/Disabled/LGBT employer. We do not discriminate on the basis of race, gender, or any other protected category, and all decisions we make are made on the basis of qualifications, merit, and business need. Our goal is to be one global team that is representative of our customers, in an inclusive environment where we can continue to innovate and grow together.

Hewlett Packard Enterprise is EEO Protected Veteran/Individual with Disabilities.

HPE will comply with all applicable laws related to employer use of arrest and conviction records, including laws requiring employers to consider for employment qualified applicants with criminal histories.

Top Skills

AWS

Azure

Bash

Docker

ELK Stack

Grafana

Kubernetes

KVM

OpsRamp

Prometheus

Python

VMware
