Razer

Site Reliability Engineer

Reposted 3 Days Ago
In-Office or Remote
Hiring Remotely in South
Mid level
Seeking a Site Reliability Engineer to design and manage AWS infrastructure, implement IaC, enhance reliability, and improve monitoring systems.
The summary above was generated by AI

Joining Razer will place you on a global mission to revolutionize the way the world games. Razer is a place to do great work, offering you the opportunity to make an impact globally while working with a team spread across 5 continents. Razer is also a great place to work, providing the unique, gamer-centric #LifeAtRazer experience that will accelerate your growth, both personally and professionally.

Job Responsibilities:

Job Description Summary
We are seeking a skilled and driven Site Reliability Engineer (SRE) to join our growing infrastructure and platform engineering team. The ideal candidate will have hands-on experience in Amazon Web Services (AWS), strong troubleshooting capabilities, and a passion for building scalable, observable, and resilient systems using modern Infrastructure as Code (IaC) and automation tools.


Job Description

REQUIREMENTS:

  • Bachelor’s degree in Computer Science, Software Engineering, Information Technology, or a related field.

  • Minimum 3 years of experience in SRE, DevOps, cloud infrastructure, or system administration roles.

  • Hands-on expertise with AWS Cloud Services, including:

      • Compute & Containerization: EC2, Lambda, ECS, EKS, Auto Scaling

      • Networking: Load Balancers, VPC, Route 53, Security Groups, Firewalls

      • Storage & Databases: RDS, ElastiCache, Athena, S3

      • Messaging: SQS, SES

  • Deep understanding of Infrastructure as Code (IaC) tools such as Terraform and CloudFormation.

  • Proficiency in at least one programming/scripting language: Python, Node.js, Bash, Ruby, or related.

  • Experience operating and troubleshooting across Linux, Windows, and container-based environments.

  • Strong understanding of distributed systems, cloud networking (routers, switches), firewalls, DNS, and HTTP/TLS.

  • Experience implementing monitoring and alerting systems and working with incident management processes.

  • Experience with zero-downtime deployments, such as blue/green or canary deployments.

  • Familiarity with cost optimization and right-sizing AWS resources.

  • Exposure to multi-region, multi-account AWS architecture.

  • Understanding of API gateways or edge networking (e.g., Akamai, CloudFront).

JOB DESCRIPTION:

  • Design, develop, and maintain Infrastructure as Code (IaC) using tools like Terraform or AWS CloudFormation, leveraging AI coding assistants to accelerate development and enforce best practices.

  • Implement and operate reliable, scalable cloud infrastructure primarily on AWS (e.g., EC2, ECS, RDS, S3, Lambda, ElastiCache, SQS, SES, Auto Scaling, Load Balancers).

  • Lead and participate in architecture reviews focusing on reliability, scalability, security, performance, and the cost-efficiency of infrastructure.

  • Develop and manage robust monitoring, alerting, and logging solutions (e.g., CloudWatch, Prometheus, Grafana, ELK), incorporating AIOps tools for predictive alerting, anomaly detection, and reducing alert fatigue.

  • Perform incident management, postmortems, root cause analysis, and implement continuous improvement strategies, utilizing AI-driven analytics to rapidly summarize logs and traces during outages.

  • Collaborate with software engineering teams to improve CI/CD pipelines, deployment automation, release management, and the deployment lifecycles of machine learning models.

  • Automate infrastructure operations, reduce manual toil, and improve reliability using scripting (Python, Bash, Node.js, or Ruby) and AI-powered workflow automation.

  • Maintain and troubleshoot environments involving web servers, databases, firewalls, DNS, load balancers, and networking.

  • Ensure systems are compliant with security standards, including patching, hardening, secure access policies, and data privacy constraints specific to AI training data.

  • Provide on-call support and participate in incident rotations.

  • Monitor and maintain service-level objectives (SLOs), SLAs, and error budgets to ensure reliability targets are met.

  • Provide support and resolution for assigned incidents and tickets.

Pre-Requisites:

Razer is proud to be an Equal Opportunity Employer. We believe that diverse teams drive better ideas, better products, and a stronger culture. We are committed to providing an inclusive, respectful, and fair workplace for every employee across all the countries we operate in. We do not discriminate on the basis of race, ethnicity, colour, nationality, ancestry, religion, age, sex, sexual orientation, gender identity or expression, disability, marital status, or any other characteristic protected under local laws. Where needed, we provide reasonable accommodations - including for disability or religious practices - to ensure every team member can perform and contribute at their best.

Are you game?
