Imubit

Sr. Site Reliability Engineer

Reposted 22 Days Ago
In-Office or Remote
Hiring Remotely in Houston, TX
Mid level
The Site Reliability Engineer designs and maintains cloud infrastructure at Imubit, optimizing deployment processes, managing incidents, and collaborating with teams to enhance system reliability and performance.
The summary above was generated by AI

TL;DR:

Imubit is looking for a Sr. Site Reliability Engineer to help disrupt the refining and chemical industries with breakthrough machine learning technologies.


About us:

At Imubit, we’re not just optimizing industrial processes - we’re redefining how entire industries operate. As the pioneer of Closed Loop AI Optimization (AIO), we are leading the charge in transforming refining, chemical, cement, and mineral mining plants with AI-driven automation. Our Optimizing Brain™ Solution puts the power of AI directly into the hands of engineers, enabling them to build and deploy their own multipurpose models to unlock new levels of efficiency, profitability, and sustainability.

Seven of the top ten U.S. refiners trust Imubit, with our solution deployed in over 90 high-value applications, delivering real-time process optimization that drives margins, reduces emissions, and builds the AI-savvy workforce of the future. Co-founded by a Google Fellow and award-winning data scientist, Imubit brings together domain experts from industry leaders like Exxon, Shell, Holcim, and FLSmidth. Backed by tier-1 venture capital firms such as Insight Partners and Alpha Wave, we are setting a new standard for industrial AI.

Our mission is simple but bold: helping the world’s leading industrial companies solve their most complex challenges, maximize long-term profitability, and future-proof their operations in an era of rapid change. If you’re ready to push boundaries and shape the future of industrial intelligence, now’s the time to join us.


We are looking for:

You, a top-notch Sr. Site Reliability Engineer, who will design and support Imubit’s cloud infrastructure. As part of this, you will work to optimize deployment processes and keep systems running. You will work with a variety of cloud technologies, automation, and infrastructure-as-code. Additionally, our SREs keep an ever-watchful eye on our system's capacity and performance. Much of our time is spent optimizing existing systems, building infrastructure, and reducing repetitive work through automation.

You will also play a critical role in incident management, swiftly identifying and resolving issues to minimize downtime and ensure seamless operations. Collaboration is key in this role, as you will work closely with software developers, DevOps engineers, and other stakeholders to implement robust solutions and drive continuous improvement. As a proactive member of our team, you will stay updated with the latest industry trends and best practices, applying this knowledge to enhance our infrastructure's resilience and scalability. Your contributions will directly impact the reliability and efficiency of our services, making you an integral part of our success.


In this position, you will:

  • Design, deploy, and maintain Imubit’s cloud infrastructure to provide high uptime, scalability, and security.
  • Leverage public cloud services and tools to improve the efficiency and reliability of our services and workflows.
  • Architect and manage cross-cloud network infrastructure (e.g. subnets, routing tables, IPSec VPNs, Transit Gateways, firewall rules).
  • Engage in and improve the whole lifecycle of services, from inception and design, through deployment, operation, and refinement.
  • Participate in infrastructure on-call rotation and respond in a timely manner.
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
 
Minimum Qualifications:
  • 5+ years owning and managing production cloud infrastructure, with a strong focus on AWS services (EKS, RDS, IAM, S3, VPC, Lambda, CloudWatch)
  • Deep, hands-on expertise operating Kubernetes clusters in production, including k8s and GitOps tooling (e.g. ArgoCD, Helm, Kustomize, and Karpenter)
  • Proven experience architecting, automating, and maintaining infrastructure using Terraform and modern IaC practices
  • Strong experience with observability stacks (e.g. Grafana, Prometheus, Loki, New Relic, Splunk); able to define and instrument actionable SLOs and alerts
  • Proficiency in Python, Go, or equivalent programming languages for automation and tooling; able to read, debug, and suggest improvements through code review
  • Expertise in developing CI/CD pipelines and managing version control systems (e.g. GitLab, GitHub) to support modern DevOps workflows
  • Experience managing production databases (e.g. PostgreSQL, CloudNativePG), including configuration, monitoring, and performance optimization
  • Demonstrated ownership of critical systems and a track record of improving uptime, scalability, and performance through automation and systematic problem-solving
  • Experience applying security best practices and managing secrets using tools such as HashiCorp Vault and AWS Secrets Manager to ensure secure, compliant infrastructure
  • Excellent communication and collaboration skills with the ability to mentor peers and influence architecture decisions
 
Preferred Qualifications:
  • B.Sc. / B.Eng. / B.Tech. in Computer Science or equivalent
  • 8+ years of total industry experience, including at least 3 years at companies with fewer than 300 employees
  • Production experience configuring and operating multi-region architectures, with a focus on failover and disaster recovery


Imubit provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability or genetics. In addition to federal law requirements, Imubit complies with applicable state and local laws governing nondiscrimination in employment in every location in which the company has facilities. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.


Imubit does not accept, retain, or respond to unsolicited CVs or phone calls, nor does it respond to any third party representing job seekers.


No visa sponsorship is available for this position.


Top Skills

Ansible
AWS
AWS Secrets Manager
GCP
Git
Go
Grafana
HashiCorp Vault
Kubernetes
New Relic
Postgres
Prometheus
Python
Splunk
Terraform
