Groq

Senior Staff Software Engineer, High Performance Inference System

Reposted 8 Days Ago
Remote
4 Locations
Entry level
As a Software Engineer, develop real-time distributed compute frameworks for ultra-low latency AI inference. Collaborate on hardware-software optimization and ensure mission-critical reliability.
The summary above was generated by AI

About Groq

Groq delivers fast, efficient AI inference. Our LPU-based system powers GroqCloud™, giving businesses and developers the speed and scale they need. Headquartered in Silicon Valley, we are on a mission to make high performance AI compute more accessible and affordable. When real-time AI is within reach, anything is possible. Build fast.

Senior Staff Software Engineer - High Performance Inference System

Team Missions and Mandates:

  • Build and operate real-time, distributed compute frameworks and runtimes to deliver planet-scale inference for LLMs and advanced AI applications at ultra-low latency, optimized for heterogeneous hardware and dynamic global workloads.
  • Develop deterministic, low-overhead hardware abstractions for thousands of synchronously coordinated GroqChips across a software-scheduled interconnection network. Prioritize fault tolerance, real-time diagnostics, ultra-low-latency execution, and mission-critical reliability.
  • Future-proof Groq's software stack for next-gen silicon, innovative multi-chip topologies, emerging form factors, and heterogeneous co-processors (e.g., FPGAs).
  • Foster collaboration across cloud, compiler, infra, data centers, and hardware teams to align engineering efforts, enable seamless integrations, and drive progress toward shared goals.
  • Reduce operational overhead, improve SLOs, and make tokens go brrrrrrrrrrr—positioning Groq as the largest inference powerhouse on earth. 🚀

Your code will run at the edge of physics—every clock cycle saved reduces latency for millions of users and extends Groq's lead in the AI compute race.

Signs We Look For:

  • You consistently ship high-impact, production-ready code while collaborating effectively with cross-functional teams.
  • You possess deep expertise in computer architecture, operating systems, algorithms, hardware-software interfaces, and parallel/distributed computing.
  • You've mastered system-level programming (C++, Rust, or similar) with emphasis on low-level optimizations and hardware-aware design.
  • You excel at profiling and optimizing systems for latency, throughput, and efficiency, with zero tolerance for wasted cycles or resources.
  • You're committed to automated testing and CI/CD pipelines, believing that "untested code is broken code."
  • You're deeply curious about system internals—from kernel-level interactions to hardware dependencies—and fearless enough to solve problems across abstraction layers down to the PCB traces.
  • You make pragmatic technical judgments, balancing short-term velocity with long-term system health.
  • You write empathetic, maintainable code with strong version control and modular design, prioritizing readability and usability for future teammates.
  • Nice to have: Experience shipping complex projects in fast-paced environments while maintaining team alignment and stakeholder support.
  • Nice to have: Expertise operating large-scale distributed systems for high-traffic internet services.
  • Nice to have: Experience deploying and optimizing machine learning (ML) or high-performance computing (HPC) workloads in production.
  • Nice to have: Hands-on optimization of performance-critical applications using GPUs, FPGAs, or ASICs (e.g., memory management, kernel optimization).
  • Nice to have: Familiarity with ML frameworks (e.g., PyTorch) and compiler tooling (e.g., MLIR) for AI/ML workflow integration.

Early-career engineers: If you've built high-performance systems in school/research and want to dive into cutting-edge hardware/software engineering, we'll invest in your growth to scale with the team.

The Ideal Candidate:

  • Initiates (without derailing): Spots opportunities to solve problems or improve processes—while staying aligned with team priorities.
  • Builds stuff that actually ships: Values "code in prod" over "perfect slides." Delivers real value instead of polishing whiteboard ideas.
  • Is a craftsmanship junkie: Always asks, "How can we make this better?" and loves diving into details.
  • Plays to win (together): Believes winning means everyone wins. Aligns goals with teammates and customers because no one succeeds alone.
  • Owns it from whiteboard to watts: Takes full responsibility for debugging, deploying, and celebrating with users (or fixing it again). Keeps code fast and scalable, and owns the outcomes.

This isn't your typical corporate job—it's a mission to redefine AI compute. If you're the kind of engineer who reads ISCA papers for fun and thinks "I can make that faster," this is your playground.

ISCA Papers to Read:

  • To learn more about how Groq achieves strong scaling via a software-scheduled interconnection network to maximize inference speed and throughput: A Software-defined Tensor Streaming Multiprocessor for Large-scale Machine Learning
  • To learn more about how the exotic, deterministic architecture of each individual chip unlocks massive compute bandwidth on legacy 14nm processes: Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads 

Logistical Requirements:

  • Authorized to work in Canada or the United States
  • Available to work American Eastern or Pacific hours


Location: Some roles may require being located near or on our primary sites, as indicated in the job description.  

At Groq: Our goal is to hire and promote an exceptional workforce as diverse as the global populations we serve. Groq is an equal opportunity employer committed to diversity, inclusion, and belonging in all aspects of our organization. We value and celebrate diversity in thought, beliefs, talent, expression, and backgrounds. We know that our individual differences make us better.


Groq is an Equal Opportunity Employer that is committed to inclusion and diversity. Qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, gender, sexual orientation, gender identity, disability or protected veteran status.  We also take affirmative action to offer employment opportunities to minorities, women, individuals with disabilities, and protected veterans.

Groq is committed to working with qualified individuals with physical or mental disabilities. Applicants who would like to contact us regarding the accessibility of our website or who need special assistance or a reasonable accommodation for any part of the application or hiring process may contact us at:  [email protected].  This contact information is for accommodation requests only.  Evaluation of requests for reasonable accommodations will be determined on a case-by-case basis.

Top Skills

C++
FPGAs
MLIR
PyTorch
Rust
