Thomson Reuters

Applied Scientist I

Posted 6 Days Ago

Hybrid

2 Locations

Junior

As an Applied Scientist I, you will design and conduct evaluations of LLMs, build automated evaluation tools, collaborate with teams, and prototype new metrics.

The summary above was generated by AI

Are you passionate about advancing the science of evaluating large language models and intelligent agents? Join Thomson Reuters Labs, where we experiment, build, and deliver cutting-edge AI systems that empower professionals worldwide.

Our flagship AI assistant, CoCounsel, helps legal, tax, and business professionals work smarter. We’re expanding our LLM Evaluation team, focused on developing automated, scalable, and trustworthy evaluation frameworks that measure model reasoning, reliability, and alignment.

What We Do

At Thomson Reuters Labs, we blend applied research with real-world impact. Our scientists work on projects spanning LLM reasoning, benchmarking, grounding, and agentic behavior—all aimed at ensuring our AI systems are effective, explainable, and robust.

We believe that rigorous evaluation is the foundation of responsible AI. This role offers the opportunity to push the boundaries of auto-evaluation, LLM-as-a-judge, and agentic evaluation methodologies, influencing how AI systems are measured and improved at scale.

About the Role

As an Applied Scientist I, you will:

Design and Conduct Evaluations: Develop and execute evaluation pipelines for LLMs and agentic systems, assessing reasoning, factual accuracy, and alignment.
Automate and Scale: Build tools and frameworks for automatic evaluation, including synthetic dataset creation, LLM-as-a-judge workflows, and continuous benchmarking systems.
Collaborate and Translate: Partner with applied scientists, ML engineers, and product managers to translate evaluation results into model improvements and product insights.
Research and Experiment: Prototype new evaluation metrics, contribute to internal reports, and support publications or presentations on evaluation methods.
Champion Best Practices: Promote reproducibility, transparency, and ethical AI evaluation within the team and broader organization.

About You

You’re a great fit for this role if your background includes:

PhD in Computer Science, Artificial Intelligence, Machine Learning, or a related field (exceptional Master’s candidates with equivalent experience will be considered).
Research or hands-on experience with large language models, NLP evaluation, or agent-based AI systems.
Strong understanding of LLM performance measurement, prompt evaluation, and reliability testing.
Proficiency in Python and familiarity with ML libraries such as PyTorch, Transformers, and LangChain.
Comfort with experimental design, data analysis, and communicating technical findings clearly.

Preferred Qualifications

Experience with LLM evaluation frameworks (e.g., OpenAI Evals, HELM, LM Evaluation Harness, or custom auto-eval tools).
Familiarity with retrieval-augmented generation (RAG), tool-using agents, or agentic evaluation methodologies.
Experience in cloud-based ML development (AWS, Azure, or GCP).
Record of publications or preprints in top-tier venues (e.g., NeurIPS, ACL, EMNLP, ICLR) or equivalent research contributions.
Interest in Responsible AI, fairness, and interpretability research.

What’s in it For You?

Hybrid Work Model: We’ve adopted a flexible hybrid working environment (2-3 days a week in the office depending on the role) for our office-based roles while delivering a seamless experience that is digitally and physically connected.
Flexibility & Work-Life Balance: Flex My Way is a set of supportive workplace policies designed to help manage personal and professional responsibilities, whether caring for family, giving back to the community, or finding time to refresh and reset. This builds upon our flexible work arrangements, including work from anywhere for up to 8 weeks per year, empowering employees to achieve a better work-life balance.
Career Development and Growth: By fostering a culture of continuous learning and skill development, we prepare our talent to tackle tomorrow’s challenges and deliver real-world solutions. Our Grow My Way programming and skills-first approach ensures you have the tools and knowledge to grow, lead, and thrive in an AI-enabled future.
Industry Competitive Benefits: We offer comprehensive benefit plans that include flexible vacation, two company-wide Mental Health Days off, access to the Headspace app, retirement savings, tuition reimbursement, employee incentive programs, and resources for mental, physical, and financial wellbeing.
Culture: Globally recognized, award-winning reputation for inclusion and belonging, flexibility, work-life balance, and more. We live by our values: Obsess over our Customers, Compete to Win, Challenge (Y)our Thinking, Act Fast / Learn Fast, and Stronger Together.
Social Impact: Make an impact in your community with our Social Impact Institute. We offer employees two paid volunteer days off annually and opportunities to get involved with pro-bono consulting projects and Environmental, Social, and Governance (ESG) initiatives.
Making a Real-World Impact: We are one of the few companies globally that helps its customers pursue justice, truth, and transparency. Together, with the professionals and institutions we serve, we help uphold the rule of law, turn the wheels of commerce, catch bad actors, report the facts, and provide trusted, unbiased information to people all over the world.

DISCLAIMER

The above information in this description has been designed to indicate the general nature and level of work performed by employees within this classification. It is not designed to contain or be interpreted as a comprehensive inventory of all duties, responsibilities, and qualifications required of employees assigned to this job.

About Us

Thomson Reuters informs the way forward by bringing together the trusted content and technology that people and organizations need to make the right decisions. We serve professionals across legal, tax, accounting, compliance, government, and media. Our products combine highly specialized software and insights to empower professionals with the data, intelligence, and solutions needed to make informed decisions, and to help institutions in their pursuit of justice, truth, and transparency. Reuters, part of Thomson Reuters, is a world-leading provider of trusted journalism and news.

We are powered by the talents of 26,000 employees across more than 70 countries, where everyone has a chance to contribute and grow professionally in flexible work environments. At a time when objectivity, accuracy, fairness, and transparency are under attack, we consider it our duty to pursue them. Sound exciting? Join us and help shape the industries that move society forward.

As a global business, we rely on the unique backgrounds, perspectives, and experiences of all employees to deliver on our business goals. To ensure we can do that, we seek talented, qualified employees in all our operations around the world regardless of race, color, sex/gender, including pregnancy, gender identity and expression, national origin, religion, sexual orientation, disability, age, marital status, citizen status, veteran status, or any other protected classification under applicable law. Thomson Reuters is proud to be an Equal Employment Opportunity Employer providing a drug-free workplace.

We also make reasonable accommodations for qualified individuals with disabilities and for sincerely held religious beliefs in accordance with applicable law. More information on requesting an accommodation here.

Learn more on how to protect yourself from fraudulent job postings here.

More information about Thomson Reuters can be found on thomsonreuters.com.

Top Skills

Langchain

Python

PyTorch

Transformers

What you need to know about the Charlotte Tech Scene

Ranked among the hottest tech cities in 2024 by CompTIA, Charlotte is quickly cementing its place as a major U.S. tech hub. Home to more than 90,000 tech workers, the city’s ecosystem is primed for continued growth, fueled by billions in annual funding from heavyweights like Microsoft and RevTech Labs, which has created thousands of fintech jobs and made the city a go-to for tech pros looking for their next big opportunity.

Key Facts About Charlotte Tech

Number of Tech Workers: 90,859; 6.5% of overall workforce (2024 CompTIA survey)
Major Tech Employers: Lowe’s, Bank of America, TIAA, Microsoft, Honeywell
Key Industries: Fintech, artificial intelligence, cybersecurity, cloud computing, e-commerce
Funding Landscape: $3.1 billion in venture capital funding in 2024 (CED)
Notable Investors: Microsoft, Google, Falfurrias Management Partners, RevTech Labs Foundation
Research Centers and Universities: University of North Carolina at Charlotte, Northeastern University, North Carolina Research Campus