NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI - the next era of computing - with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, we are increasingly known as “The AI Computing Company.” We're looking to grow our company and establish teams with the most thoughtful people in the world.
NVIDIA HGX, MGX and DGX systems deliver the world's leading solutions for enterprise AI infrastructure at scale. With their end-to-end performance and flexibility, these systems enable researchers and scientists to combine simulation, data analytics, and AI to drive scientific progress on the most powerful end-to-end AI supercomputing platforms. Are you ready to change the next generation of computing? Join us at the forefront of technological advancement.
What you’ll be doing:
Lead and drive system bring-up for GPU-centric server platforms in factory and data center environments.
Design and implement end-to-end factory workflows, including firmware flashing sequences, security provisioning, and deployment of software mitigations.
Collaborate cross-functionally with data center architects, ODMs, and OEMs to define factory and data center requirements that ensure efficient and reliable production ramp.
Champion reliability, debuggability and optimization in firmware, diagnostic and deployment tool design.
Use AI tools to automate workflows and improve existing automation.
Troubleshoot at the speed of light, working closely with system bring-up teams on next-generation AI systems to debug and resolve issues during bring-up and deployment.
What we need to see:
10+ years of experience in data center firmware/platform software development.
BS, MS, or PhD in EE, CS, or related technical field (or equivalent experience).
Deep, hands-on expertise in working with ODMs/CSPs, firmware update design, and out-of-band management.
Proven track record of architecting and developing server firmware and diagnostic solutions for large-scale data center deployments.
Solid knowledge of hardware interfaces (USB, SMBus/I2C, PCIe) and protocols such as Redfish, MCTP, and PLDM.
Solid experience debugging servers during early bring-up.
Advanced skills in C/C++ and Python, with a hands-on approach to coding and debugging during hardware bring-up.
Strong communicator, excellent collaborator, and committed team player.
Self-starter with a problem-solving mindset who thrives in a fast-paced, complex technical environment.
Ways to stand out from the crowd:
Hands-on experience with ODMs/CSPs during system bring-up and volume deployment.
Deep familiarity with x86 or ARM system architecture.
Strong networking expertise with high-speed NICs, including bring-up and configuration in factory environment.
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. If you're creative and autonomous, we want to hear from you!
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5. You will also be eligible for equity and benefits.