NVIDIA

Senior Infrastructure and Build Systems Engineer

Reposted 2 Days Ago
In-Office or Remote
Hiring Remotely in Santa Clara, CA
152K-288K Annually
Senior level
The role involves designing and maintaining CI/CD pipelines, building infrastructure, and ensuring security compliance for deep learning models, collaborating with teams to integrate workflows effectively.
The summary above was generated by AI

We are now seeking a Senior Infrastructure and Build Systems Engineer for the NVIDIA AI TensorRT-LLM team. This is a unique opportunity to take full ownership of the critical systems that power our engineering innovation. You and the team will be responsible for the entire infrastructure/DevOps landscape, from our CI/CD pipelines to our build systems to product security, driving efficiency and reliability across the organization. You will work with autonomy to design and implement the best solutions and collaborate with external partners to achieve our goals. If you're passionate about infrastructure, automation, observability, and compliance, we want you with us at one of the most innovative companies in the world!

What you'll be doing:

  • Build and maintain, from first principles, the infrastructure needed to deliver TensorRT LLM

  • Maintain CI/CD pipelines that automate the build, test, and deployment process, and improve on their bottlenecks. Manage tools and automate redundant manual workflows with GitHub Actions, GitLab, Terraform, etc.

  • Enable scanning for, and handling of, security CVEs in infrastructure components

  • Improve the modularity of our build systems using CMake

  • Use AI to help build automated triaging workflows

  • Collaborate extensively with cross-functional teams to integrate pipelines from deep learning frameworks and components, which is essential to the seamless deployment and inference of deep learning models on our platform

What we need to see:

  • Master's degree or equivalent experience

  • 3+ years of experience in Computer Science, computer architecture, or related field

  • Ability to work in a fast-paced, agile team environment

  • Excellent Bash, CI/CD, Python programming and software design skills, including debugging, performance analysis, and test design.

  • Experience with CMake.

  • Background in security best practices for releasing libraries.

  • Experience in administering, monitoring, and deploying systems and services on GitHub and cloud platforms. Support other technical teams in monitoring operating efficiencies of the platform, and responding as needs arise.

  • Highly skilled in Kubernetes and Docker/containerd. Automation expert with hands-on skills in frameworks like Ansible and Terraform. Experience with AWS, Azure, or GCP.

Ways to stand out from the crowd:

  • Experience contributing to a large open-source deep learning community: use of GitHub, bug tracking, branching and merging code, OSS licensing issues, handling patches, etc.

  • Experience in defining and leading the DevOps strategy (design patterns, reliability and scaling) for a team or organization.

  • Experience driving efficiencies in software architecture, creating metrics, implementing infrastructure as code and other automation improvements.

  • Deep understanding of test automation infrastructure, frameworks, and test analysis.

  • Excellent problem-solving abilities spanning multiple areas of software (storage systems, kernels, and containers), as well as the ability to collaborate within an agile team environment to prioritize deep learning-specific features and capabilities within Triton Inference Server, employing advanced troubleshooting and debugging techniques to resolve complex technical issues.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most experienced and hard-working people in the world working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you. Come help us build the real-time, efficient computing platform driving our success in the dynamic and quickly growing fields of Deep Learning and Artificial Intelligence.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until February 6, 2026.

This posting is for an existing vacancy. 

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Top Skills

AI
Ansible
AWS
Azure
Bash
CI/CD
CMake
Docker
GCP
Git
Kubernetes
Python
TensorRT
Terraform
