TL;DR:
Imubit is looking for a Sr. Site Reliability Engineer to help disrupt the refining and chemical industries with breakthrough machine learning technologies.
About us:
At Imubit, we’re not just optimizing industrial processes; we’re redefining how entire industries operate. As the pioneer of Closed Loop AI Optimization (AIO), we are leading the charge in transforming refining, chemical, cement, and mineral mining plants with AI-driven automation. Our Optimizing Brain™ Solution puts the power of AI directly into the hands of engineers, enabling them to build and deploy their own multipurpose models to unlock new levels of efficiency, profitability, and sustainability.
Seven of the top ten U.S. refiners trust Imubit, with our solution deployed in over 90 high-value applications, delivering real-time process optimization that drives margins, reduces emissions, and builds the AI-savvy workforce of the future. Co-founded by a Google Fellow and award-winning data scientist, Imubit brings together domain experts from industry leaders like Exxon, Shell, Holcim, and FLSmidth. Backed by tier-1 venture capital firms such as Insight Partners and Alpha Wave, we are setting a new standard for industrial AI.
Our mission is simple but bold: helping the world’s leading industrial companies solve their most complex challenges, maximize long-term profitability, and future-proof their operations in an era of rapid change. If you’re ready to push boundaries and shape the future of industrial intelligence, now’s the time to join us.
We are looking for:
You: a top-notch Sr. Site Reliability Engineer who will design and support Imubit’s cloud infrastructure. You will optimize deployment processes and keep systems running, working with a variety of cloud technologies, automation, and infrastructure-as-code. Our SREs also keep a close watch on our systems’ capacity and performance. Much of our time is spent optimizing existing systems, building infrastructure, and reducing repetitive work through automation.
You will also play a critical role in incident management, swiftly identifying and resolving issues to minimize downtime and ensure seamless operations. Collaboration is key in this role, as you will work closely with software developers, DevOps engineers, and other stakeholders to implement robust solutions and drive continuous improvement. As a proactive member of our team, you will stay updated with the latest industry trends and best practices, applying this knowledge to enhance our infrastructure's resilience and scalability. Your contributions will directly impact the reliability and efficiency of our services, making you an integral part of our success.
In this position, you will:
- Design, deploy, and maintain Imubit’s cloud infrastructure to provide high uptime, scalability, and security.
- Leverage public cloud services and tools to improve the efficiency and reliability of our services and workflows.
- Architect and manage cross-cloud network infrastructure (e.g. subnets, routing tables, IPSec VPNs, Transit Gateways, firewall rules).
- Engage in and improve the whole lifecycle of services, from inception and design, through deployment, operation, and refinement.
- Participate in the infrastructure on-call rotation and respond to incidents in a timely manner.
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
Qualifications:
- 5+ years owning and managing production cloud infrastructure, with a strong focus on AWS services (EKS, RDS, IAM, S3, VPC, Lambda, CloudWatch)
- Deep, hands-on expertise operating Kubernetes clusters in production, including k8s and GitOps tooling (e.g. ArgoCD, Helm, Kustomize, and Karpenter)
- Proven experience architecting, automating, and maintaining infrastructure using Terraform and modern IaC practices
- Strong experience with observability stacks (e.g. Grafana, Prometheus, Loki, New Relic, Splunk); able to define and instrument actionable SLOs and alerts
- Proficiency in Python, Go, or equivalent programming languages for automation and tooling; able to read, debug, and suggest improvements through code review
- Expertise in developing CI/CD pipelines and managing version control systems (e.g. GitLab, GitHub) to support modern DevOps workflows
- Experience managing production databases (e.g. PostgreSQL, CloudNativePG), including configuration, monitoring, and performance optimization
- Demonstrated ownership of critical systems and a track record of improving uptime, scalability, and performance through automation and systematic problem-solving
- Experience applying security best practices and managing secrets using tools such as HashiCorp Vault and AWS Secrets Manager to ensure secure, compliant infrastructure
- Excellent communication and collaboration skills with the ability to mentor peers and influence architecture decisions
- B.Sc. / B.Eng. / B.Tech. in Computer Science or equivalent
- 8+ years of total industry experience, including at least 3 years at companies with fewer than 300 employees
- Production experience configuring and operating multi-region architectures, with a focus on failover and disaster recovery
Imubit provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability or genetics. In addition to federal law requirements, Imubit complies with applicable state and local laws governing nondiscrimination in employment in every location in which the company has facilities. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.
Imubit does not accept, retain, or respond to unsolicited CVs or phone calls, nor does it respond to any third party representing job seekers.
No visa sponsorship is available for this position.