Tenstorrent Inc.

Staff Cloud Software Engineer, Cloud Infrastructure

Posted 17 Days Ago
Remote
Hiring Remotely in United States
100K-500K
Senior level
Design and implement distributed systems for AI computing, collaborating across the application life cycle while ensuring effective deployment and operations in cloud environments.
The summary above was generated by AI

Tenstorrent is leading the industry in cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

This Staff Cloud Software position is looking to bring new specialized expertise into the team in the area of distributed high-performance and AI computing, especially in Kubernetes-based cloud native environments. You will be driving design, implementation, and integration of systems to support scaling compute capabilities seamlessly from single-host systems into exaflop-scale clusters.

This role is hybrid, based out of Santa Clara, CA or Austin, TX.

We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.


Responsibilities:

  • Design and drive implementation of distributed systems for AI computing applications in Cloud and novel supercomputing cluster environments
  • Hands-on software development, testing, integration, operations, and support
  • Closely collaborate with the team through the full stack and life cycle of AI data center applications, from data center design and rollout to MLOps
  • Operate within on-premises data centers and public cloud environments
  • Drive projects through their whole software development lifecycle, on both the technical and non-technical sides
  • Collaborate with both highly technical and non-technical stakeholders from differing backgrounds, communicating highly complex topics to diverse audiences
  • Continuously improve engineering practices through code reviews and adoption of relevant techniques and technologies


Experience & Qualifications:

  • 10+ years of hands-on software engineering experience working with distributed systems in Cloud and/or HPC environments
  • 5+ years of experience working with clustered (multi-host) AI hardware and applications for training and inference
  • 5+ years of experience with Kubernetes clusters, including cluster and application deployment (e.g., CNI, CSI, Helm), operations, and development of extensions (e.g., Device plugins, Operators)
  • Strong working knowledge of Python and Go
  • Infrastructure as Code as a first-class citizen (e.g. Ansible)
  • Strong Git, GitOps, and CI/CD experience
  • Familiarity with performance requirement implications of AI/ML workloads, both inference and training
  • Familiarity with virtualization technologies and platforms
  • Hands-on experience with MLOps concepts and frameworks for end-to-end model training pipelines
  • Strong understanding of networking concepts – experience with network hardware configuration and management is a plus
  • Familiarity with security implications of multi-tenant environments at the hardware, software, and networking levels
  • Familiarity with observability, monitoring and alerting tools (e.g., Grafana, Prometheus, Loki)
  • Agile / lean software project management experience
  • Strong programming skills with years of experience in various programming languages; familiarity with both object-oriented and functional programming
  • REST API development and integration experience – full-stack web development experience is a plus


Compensation for all engineers at Tenstorrent ranges from $100k - $500k including base and variable compensation targets. Experience, skills, education, background and location all impact the actual offer made.

Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.

Due to U.S. Export Control laws and regulations, Tenstorrent is required to ensure compliance with licensing regulations when transferring technology to nationals of certain countries that have had licensing conditions set by the U.S. government.

Our engineering positions and certain engineering support positions require access to information, systems, or technologies that are subject to U.S. Export Control laws and regulations. Please note that citizenship/permanent residency, asylee, and refugee information and/or documentation will be required and considered as Tenstorrent moves through the employment process.

If a U.S. export license is required, employment will not begin until a license with acceptable conditions is granted by the U.S. government. If a U.S. export license with acceptable conditions is not granted by the U.S. government, then the offer of employment will be rescinded.

Top Skills

Ansible
CI/CD
Git
GitOps
Go
Grafana
Kubernetes
MLOps
Prometheus
Python
