Articul8 AI

Senior Site Reliability Engineer (SRE) - (Brazil)

Remote

2 Locations

Senior level

The role involves ensuring system reliability for a GenAI SaaS, automating infrastructure, implementing monitoring solutions, and leading incident response efforts.

About Us

Articul8 AI is at the forefront of Generative AI innovation, delivering cutting-edge SaaS products that transform how businesses operate. Our platform empowers organizations to leverage the power of artificial intelligence in a reliable, scalable, and secure environment.

Position Overview

We are seeking an experienced Site Reliability Engineer (SRE) to join our team and help ensure the reliability, performance, and scalability of our GenAI SaaS platform. As an SRE, you will bridge the gap between development and operations, implementing automation and best practices to maintain our service reliability objectives while supporting rapid innovation.

Key Responsibilities

Architect and maintain scalable, highly available infrastructure for our GenAI platform.
Design and implement robust monitoring, alerting, and observability solutions to proactively ensure system health and performance.
Automate deployment, scaling, and management of our cloud-native infrastructure, reducing toil and improving efficiency.
Define, measure, and improve Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to deliver outstanding service quality.
Participate in on-call rotations and provide rapid response to production incidents, minimizing downtime and user impact.
Collaborate closely with development teams to build reliable, scalable, and efficient systems for complex AI workloads.
Lead incident response efforts, conduct thorough post-mortems, and champion continuous improvement initiatives.
Optimize infrastructure for performance, scalability, and cost-effectiveness, especially for high-demand AI workloads.
Implement and enforce security best practices across all systems and environments.
Create and maintain comprehensive documentation, including runbooks and knowledge base articles, to foster a culture of shared knowledge.

Qualifications

Required

Bachelor's degree in Computer Science, Engineering, or related field, or equivalent practical experience
5+ years of experience in DevOps, SRE, or similar roles
Strong experience with cloud platforms (AWS, GCP, or Azure)
Proficiency in at least one programming/scripting language (Python, Go, Bash, etc.)
Hands-on experience with infrastructure as code tools (Terraform, CloudFormation, etc.)
Solid background in containerization technologies (Docker, Kubernetes)
Proven experience with monitoring and observability tools (Prometheus, Grafana, ELK stack, etc.)
Strong understanding of CI/CD pipelines and automation
Exceptional problem-solving skills and the ability to troubleshoot complex systems

Preferred

Experience supporting AI/ML systems in production
Knowledge of GPU infrastructure management and optimization
Familiarity with distributed systems and high-performance computing
Experience with database systems (SQL and NoSQL)
Certifications in cloud platforms (AWS, GCP, Azure)
Experience with chaos engineering and resilience testing
Knowledge of security best practices and compliance requirements

Ready to shape the future of resilient software systems? Apply now and help drive the reliability of tomorrow’s AI at Articul8 AI!

Top Skills

AWS
Azure
Bash
CloudFormation
Docker
ELK Stack
GCP
Grafana
Kubernetes
Prometheus
Python
Terraform
