Tenstorrent Inc.

Staff Engineer, HPC Systems Software

Posted 17 Days Ago
In-Office or Remote
3 Locations
100K-500K Annually
Mid level
The Staff Engineer will design and maintain automated OS deployment pipelines, manage configuration management, troubleshoot OS issues, and collaborate with hardware teams for HPC systems.
The summary above was generated by AI

Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch, and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

We are seeking an HPC Systems Engineer to architect and maintain the operating system foundation that powers our global hardware design infrastructure. You'll own bare-metal provisioning pipelines, configuration-as-code systems, and OS lifecycle management across hundreds of compute nodes, ensuring hardware engineers have consistent, performant, and reliable systems. This role requires deep Linux expertise, automation mastery, and the ability to solve complex infrastructure problems at scale in a rapidly evolving startup environment.

This role is hybrid, based out of Austin, TX; Santa Clara, CA; or Toronto, ON.

We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.


Who You Are

  • Design and maintain automated OS deployment pipelines for bare-metal HPC clusters globally.
  • Manage large-scale configuration management using Ansible to ensure consistency across compute infrastructure.
  • Deploy and lifecycle manage RHEL and Ubuntu systems across diverse hardware platforms.
  • Implement infrastructure-as-code for repeatable, version-controlled system configurations.
  • Troubleshoot OS-level issues, optimize kernel parameters, and resolve system performance bottlenecks.
  • Collaborate with hardware design teams to standardize system configurations, toolchains, and development environments.
  • Build automation and tooling to streamline provisioning, patching, and system updates at scale.

What You Bring

  • Experienced in RHEL and Ubuntu administration in HPC or large-scale compute environments.
  • Highly skilled in Ansible for automation and configuration management across hundreds of nodes.
  • Proficient with bare-metal provisioning systems (MAAS, Foreman, Cobbler, Warewulf, or similar).
  • Deep understanding of Linux system internals, networking, kernel tuning, and performance troubleshooting.
  • Familiar with HPC cluster architecture, workflows, and infrastructure-as-code practices.
  • Capable of diagnosing and resolving complex infrastructure issues independently in fast-paced environments.

Nice to Have

  • Hands-on experience with IBM Spectrum LSF or similar HPC workload managers.
  • Integration with commercial HPC storage platforms (Pure Storage, Weka, NetApp, DDN, Vast Data).
  • Deep exposure to EDA tools and hardware design workflows in semiconductor development.
  • Container technologies (Docker, Singularity, Podman) for reproducible compute environments.
  • Cluster monitoring and observability at scale using Prometheus, Grafana, and custom tooling.
  • Advanced provisioning techniques including PXE boot, kickstart, cloud-init, and BMC/IPMI integration.
  • Security hardening and compliance frameworks for multi-tenant HPC environments.
  • Python and bash scripting for production-level infrastructure automation.

Compensation for all engineers at Tenstorrent ranges from $100k - $500k including base and variable compensation targets. Experience, skills, education, background and location all impact the actual offer made.

Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.

This offer of employment is contingent upon the applicant being eligible to access U.S. export-controlled technology. Due to U.S. export laws, including those codified in the U.S. Export Administration Regulations (EAR), the Company is required to ensure compliance with these laws when transferring technology to nationals of certain countries (such as EAR Country Groups D:1, E1, and E2). These requirements apply to persons located in the U.S. and all countries outside the U.S. As the position offered will have direct and/or indirect access to information, systems, or technologies subject to these laws, the offer may be contingent upon your citizenship/permanent residency status or ability to obtain prior license approval from the U.S. Commerce Department or applicable federal agency. If employment is not possible due to U.S. export laws, any offer of employment will be rescinded.

Top Skills

Ansible
Bash
Cobbler
Docker
Foreman
Grafana
Linux
MAAS
Prometheus
Python
RHEL
Ubuntu
Warewulf
