Cordial

Senior Site Reliability Engineer

Reposted 9 Days Ago
Remote
Hiring Remotely in US
135K-170K
Senior level
As a Senior Site Reliability Engineer at Cordial, you will monitor, develop and scale the platform, ensuring optimal performance and reliability, while collaborating with DevOps and Product teams.
The summary above was generated by AI

ABOUT CORDIAL

We founded Cordial in 2014 on the belief that there should be more humanity and empathy in marketing—both in how brands communicate with their customers and in how technology companies work with brands. We built our company and platform purposefully, driven by a desire to inspire more thoughtful communication and to create experiences that feel more personal and human—for consumers, for the people at the companies we work with, and for Cordial employees. Today, brands like PacSun, Revolve, Abercrombie & Fitch, Realtor.com, L.L. Bean and Forbes rely on Cordial to drive revenue growth by sending a better message.

We chose the name Cordial to symbolize how we empower our clients to communicate with their customers, as well as how we do business: with transparency, collaboration, and trust. We're building a passionate team of individuals willing to learn, grow, and be thoughtfully challenged every day to continuously improve our product, company, and culture.

OUR VALUES

  • Communicate better than the rest
  • Own it, every time
  • Solve client problems tenaciously 
  • Make Waves

POSITION SUMMARY

We are looking for a motivated and talented Site Reliability Engineer to help us monitor, develop, and scale the Cordial platform. Our goal is to provide our clients with a delightful experience in their day-to-day interaction with the platform and to create trust that the expected jobs and background processes will run without issue. You will work with our DevOps and Product teams to ensure that bugs are squashed, performance is optimized, and blind spots are revealed through comprehensive monitoring.

YOU WILL

  • Utilize your knowledge of Web, App, Network, Server, Storage and Security technologies to administer, monitor and troubleshoot application and network components in our cloud-based environment. (We are AWS hosted and make extensive use of Kubernetes, Consul, and Vault clusters.)
  • Help design, author, deploy, and monitor manifests for our multiple Kubernetes clusters, Helm charts/repos, and service mesh configurations.
  • Actively contribute to platform Infrastructure Design and Implementation discussions
  • Use your software engineering skills to trace/debug code and identify root causes of production data corruption and/or performance issues.
  • Provide production support for the Product Development teams
  • Participate in an on-call rotation
  • Work with the team to develop and deploy monitoring and alerting architecture, and implement monitoring/logging solutions
  • Troubleshoot complex issues in a timely manner as necessary to maintain the performance and stability of our Production Application environment
  • Help build out SLOs and document and monitor SLAs

ABOUT YOU

  • 5+ years of UNIX/Linux systems and network administration (DNS, IPsec, VPN, load balancing, process tracing)
  • Experience with AWS (we use EC2, EKS)
  • Experience deploying and/or maintaining Kubernetes/EKS clusters
  • Hands-on experience writing & maintaining custom Helm charts
  • Experience working with one or more service meshes (AWS App Mesh, Istio, Linkerd)
  • Experience with monitoring, logging and alerting tools
  • Previous experience in an SRE and/or DevOps role
  • Development experience in PHP
  • Extensive experience with Docker/containers & Kubernetes
  • Experience with HashiCorp products such as Consul and Vault
  • Comfortable working in a globally distributed team across time zones
  • Strong teamwork and communication skills
  • A genuine desire to learn new technologies and grow
  • Fluent in verbal and written English
  • Experience with large-scale distributed systems
  • Proficiency in infrastructure as code (IaC) tools (e.g., Terraform, CloudFormation)
  • Understanding of observability principles and tools (e.g., Prometheus, Grafana, ELK stack, distributed tracing)
  • Familiarity with CI/CD pipelines (e.g., Jenkins, GitLab CI, ArgoCD)
  • A strong grasp of networking fundamentals
  • Knowledge of security best practices in a cloud environment

COMPENSATION & BENEFITS

$140,000.00-$180,000.00 annually. The compensation range may be adjusted based on experience and location. In combination with base salary, Cordial's compensation package includes equity and bonus, a robust benefit plan (medical/dental/vision/life), 401k match, and flexible time off. Additionally, we offer perks such as monthly wellness and cell phone stipends and yearly reimbursements for childcare and continuing education. We pride ourselves on maintaining a healthy work/life balance, a strong dedication to DE&I efforts, and an overall respectful and open culture!

Cordial is proud to be an equal opportunity employer that celebrates diversity and is committed to creating an inclusive workplace with equal opportunity for all applicants. Our goal is to recruit the most talented people from a diverse candidate pool regardless of race, color, ancestry, national origin, religion, disability status, sex (including pregnancy), age, gender, gender identity or expression, sexual orientation, marital status, veteran status, or any other characteristic protected by law.

Cordial is committed to working with and providing access and reasonable accommodation to applicants with mental and/or physical disabilities. If you require an accommodation, please reach out to your recruiter once you've begun the interview process. All requests for accommodations are treated discreetly and confidentially, as practical and permitted by law.

Top Skills

ArgoCD
AWS
CloudFormation
Consul
Docker
ELK Stack
GitLab CI
Grafana
Helm
Jenkins
Kubernetes
PHP
Prometheus
Terraform
Vault
