NVIDIA

Senior Applied Research Scientist, Multimodal Retrieval

Posted 12 Days Ago

In-Office or Remote

2 Locations

224K-357K

Expert/Leader

As a Senior Applied Research Scientist, you will develop and deploy deep learning models for multimodal retrieval, build pipelines, mentor team members, and publish research.

The summary above was generated by AI

NVIDIA’s Retriever team is seeking a Senior Applied Research Scientist with experience researching, developing, and deploying deep learning models at scale across a range of modalities. You’ll join a team of Applied Research Scientists, Machine Learning and MLOps Engineers working on the next generation of retrieval pipelines for RAG, with a focus on the ingestion of modalities beyond text.

At NVIDIA we’re building the framework upon which production RAG systems are based. We have contributed to top research models in the text embedding space, topping the MTEB and ViDoRe V1/V2 leaderboards, and have developed commercially viable versions of these models for use in production systems by our customers. Come be a part of our world-class team building the future of Retrieval.

What you’ll be doing:

Working with our team of researchers to develop efficient and performant models and pipelines that extract text content from images, video, audio and other modalities.
Building vision pipelines for document ingestion, including page layout analysis, object detection, and OCR.
Exploring and crafting datasets, metrics, experiments, and validation scripts to develop standard methodologies for research. These methodologies will offer customers clear guidance on which models and pipelines to apply in specific contexts.
Helping ML Engineers scale pipelines to production capability through the development of NVIDIA Inference Microservices (NIMs) and blueprints which demonstrate how to deploy NIMs in a pipeline effectively.
Writing papers, blog posts, documentation and trainings that help customers understand and take advantage of our research.
Keeping up to date with the latest developments in Retrieval across academia and industry.

What we need to see:

Candidates with a Master's, Ph.D. or equivalent experience in retrieval or multimodal research are preferred, along with a track record of publication in leading conferences like CVPR, ICCV, ECCV, KDD, etc.
Hands-on experience developing computer vision models and pipelines, with preference for document-focused tasks such as layout analysis, table or figure detection, and OCR. Competitive results in computer vision competitions on Kaggle or similar platforms are a plus.
An understanding of the state of the art in retrieval research, with a focus on multimodal content retrieval.
10+ years of experience developing multimodal systems across a range of models and platforms. Information retrieval experience is a big plus.
Knowledge of best practices in batching, streaming, and scaling of ingestion pipelines to support real-world applications.
Excellent Python programming skills and a strong understanding of the Python deep learning ecosystem (PyTorch, TensorFlow, MXNet, etc.).
An ability to share and communicate your ideas clearly through blog posts, papers, kernels, GitHub, etc.
Strong communication and interpersonal skills are essential, as well as the capability to collaborate within a dynamic, distributed team. A history of mentoring junior engineers and interns is a plus.

Location is flexible and the team works remotely, centered on NA/EU time zones.

GPU computing is the most productive and pervasive platform for deep learning and AI. It begins with the most advanced GPUs and the systems and software we build on top of them. We integrate and optimize every deep learning framework. We work with most major technology providers and support a broad range of Fortune 500 companies in their machine and deep learning needs. With deep learning, we can teach AI to do almost anything. New internet services, like Google Assistant, have learned speech from sound and provide a more natural way to access information. Self-driving cars use deep learning to recognize the space the car inhabits, the lanes in which it drives, and the objects to avoid. In healthcare, neural networks trained with millions of medical images can find clues in MRIs that until now could only be found through invasive biopsies. In recommendation systems, we learn how to understand users' desires and serve them what they truly are looking for. These are just a few examples. AI will spur a wave of social progress unmatched since the Industrial Revolution.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD - 356,500 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until September 2, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Top Skills

MXNet

Python

PyTorch

TensorFlow
