Hatch

Cloud Infrastructure Engineer

Reposted 4 Days Ago
Remote
Hiring Remotely in United States
Senior level
The role involves building scalable cloud infrastructure for AI products, managing DevOps practices, and collaborating with ML teams to optimize workflows.
The summary above was generated by AI

Cloud Infrastructure Engineer

MUST BE BASED IN NYC, No Relocation

Hybrid in SOHO

Not able to sponsor

About the Role

We’re looking for a Senior DevOps Engineer to join Hatch’s high-impact engineering team. This is a senior-level role focused on building resilient, secure, and scalable infrastructure to support both our core platform and AI-powered product lines. You'll partner with engineers, ML practitioners, and product leaders to ensure our systems can scale with the speed of our ambitions.

About Hatch
Hatch is a fast-moving team of builders solving real-world business problems with AI. We move quickly, take ownership, and care deeply about delivering outcomes. Our engineering culture prioritizes operational rigor, clean architecture, and velocity without compromising reliability. If you're energized by scale, speed, and owning infrastructure that powers AI workflows end-to-end, this is a role for you.

What You’ll Do
Infrastructure at Scale
•Evolve our cloud infrastructure (AWS & GCP) using infrastructure-as-code tools like Terraform or Ansible.
•Implement systems that support the compute-heavy and storage-intensive needs of machine learning and data processing pipelines.
•Manage scalable, secure, and cost-efficient environments across dev, staging, and production.
•Participate in an on-call rotation.


ML Platform Support
•Collaborate with ML engineers to productionize models and manage workflows across training, testing, and deployment stages.
•Implement infrastructure to support versioning, orchestration, and monitoring of ML models in production (e.g. using tools like Kubeflow, SageMaker, Vertex AI, or custom pipelines).
•Optimize data pipelines and model serving infrastructure for low-latency and high-throughput performance.


Reliability & Observability
•Drive the strategy for observability, logging, and alerting across distributed systems.
•Lead incident response, root cause analysis, and system hardening for long-term resiliency.
•Implement best practices for infrastructure security, container hardening, and network architecture.


Platform Enablement
•Partner with engineering teams to bake DevOps best practices into the development lifecycle.
•Build tooling and automation that improves developer velocity, release stability, and system transparency.


What We’re Looking For
•3+ years of experience in DevOps, SRE, or platform engineering roles in high-growth environments.
•3+ years of experience with AWS infrastructure and services, including networking, IAM, ECS/EKS, and serverless computing.
•Strong experience with infrastructure-as-code (Terraform, Ansible) and CI/CD tooling (GitHub Actions, ArgoCD, etc.).
•Experience supporting machine learning teams or MLOps platforms (e.g. model training pipelines, feature stores, model registry, online inference).
•Strong knowledge of container orchestration (Kubernetes preferred) and observability stacks (Prometheus, Grafana, Sentry, Datadog, New Relic, etc.).
•Proven ability to participate in architectural conversations and contribute to large-scale infrastructure improvements.
•A bias toward simplicity, security, and reliability: you know when to build fast and when to build right.
•Familiarity with at least one programming language, such as Python, Go, Erlang, or Rust.
•Exposure to agentic programming workflows.
•RHCE, RHCSA, or equivalent certifications preferred.


Why You Should Join
•Work at the intersection of infrastructure and machine learning at a company building real AI products with urgency and purpose.
•Join a culture that expects technical leadership, fast decision-making, and relentless curiosity.
•Partner with high-caliber engineers and product leaders in a tight-knit, fast-executing environment.

Top Skills

Ansible
ArgoCD
AWS
CI/CD
Datadog
Erlang
GCP
GitHub Actions
Go
Grafana
Kubeflow
Kubernetes
New Relic
Prometheus
Python
Rust
SageMaker
Sentry
Terraform
Vertex AI


