Tavus

Senior Data Engineer

Posted Yesterday
In-Office or Remote
2 Locations
150K-200K Annually
Senior level
As a Senior Data Engineer, you will own the data strategy, build and optimize data pipelines, and ensure high-quality datasets for AI models, collaborating closely with ML engineers.
The summary above was generated by AI
About Us

Tavus is a research lab pioneering human computing. We’re building AI Humans: a new interface that closes the gap between people and machines, free from the friction of today’s systems. Our real-time human simulation models let machines see, hear, respond, and even look real—enabling meaningful, face-to-face conversations. AI Humans combine the emotional intelligence of humans with the reach and reliability of machines, making them capable, trusted agents available 24/7, in every language, on our terms.

Imagine a therapist anyone can afford. A personal trainer that adapts to your schedule. A fleet of medical assistants that can give every patient the attention they need. With Tavus, individuals, enterprises, and developers can all build AI Humans to connect, understand, and act with empathy at scale.

We’re a Series A company backed by world-class investors including Sequoia Capital, Y Combinator, and Scale Venture Partners.

Be part of shaping a future where humans and machines truly understand each other.

The Role

Data is the foundation of everything we build. We’re looking for a Senior Data Engineer who goes beyond building pipelines and cleaning datasets. You’ll own our entire data strategy, from sourcing and curating to structuring and optimizing, ensuring our models and products are powered by the highest-quality data possible. You’re a true master of your craft: data sourcing, formatting, labeling, cleaning, and making use of our internal data.

Your Mission 🚀
  • Be a data guru – You anticipate the data needs not just for today, but for the future. You know how to curate diverse, high-quality datasets to ensure AI models reach their full potential.

  • Influence AI model training – Your data work will directly impact AI model performance, efficiency, and inference accuracy. You will collaborate closely with ML engineers to optimize datasets for maximum AI effectiveness.

  • Own, build, and scale the data pipeline – You will be highly involved in data sourcing, and you will expand and own the curation, filtering, and preprocessing pipelines across a variety of data modalities.

  • Be a data hunter – Web scraping, third-party deals, unconventional sources—you’ll find, collect, and curate the best multimodal data (text, video, images) to power our models. Manage large-scale data procurement to ensure our models train on the highest quality information.

  • Be a video data craftsman – We’re building something truly unique based on a blend of video and audio data. Throwing data at the problem is not a solution here, but you should be up for the challenge of making it work! You will own this challenge and ensure that our video and audio datasets are structured for AI success. You will help us truly flesh out the capabilities of our SOTA models!

  • Optimize labeling & automation – You will own the data labeling process and build automated workflows to make cleaning, labeling, and structuring data as efficient as possible. Work closely with our data annotation teams to ensure high-quality labeled data for ML models.

  • Turn internal data into gold – Our own platform is a goldmine of insights—help us unlock and use it to drive smarter decisions and supercharge growth.

  • Speed + precision – Move fast, but don’t break data. Every pipeline, dataset, and workflow should be tight, efficient, and built to last.

What We’re Looking For 🔥
  • You don’t just maintain – you build. From zero to fully running pipelines, you make things happen. You can take charge of how we use internal data to make smarter decisions.

  • Extreme ownership – You own data strategy end-to-end, proactively working out what data we need, where to get it, and how to structure it for AI impact.

  • Strategic mindset – You think beyond pipelines—you anticipate data needs before they arise and help shape AI development at Tavus.

  • Previous work with LLMs or multimodal data is a big plus. You know how to source, structure, and optimize data for real AI impact.

  • Automation expert – You know how to automate data cleaning, structuring, and labeling workflows for efficiency and scale.

  • ML-first mindset – You understand that better data = better models and structure datasets to maximize AI model accuracy.

  • Fast, but flawless. Speed matters, but so does accuracy. You balance both.

  • You don’t follow best practices – you create them. A lot of what we’re doing is new, so you set the standard for how data should be done.

  • Technical expertise – You have strong experience with Python, SQL, and large-scale data processing tools.

Bonus Points if:
  • You have previous experience with LLMs or multimodal data. You know how to source, structure, and optimize data for real AI impact.

  • You have experience with in-house video data collection and relevant studio setups. You know best practices for multimodal video and audio data collection.

Benefits & Culture

When you join Tavus, you’re joining a diverse and supportive team. Our work is driven by our people, and our success is shared by all. This position has a flexible work schedule, unlimited PTO, competitive healthcare, and gear stipends, as well as plenty of fun. At the end of the day, we want Tavus to be a place for you to learn, directly drive impact, and work with a team you love.

To learn more about our team culture and benefits, check out our hiring page.

Tavus is growing fast, and we’d like you to grow with us. If you’re excited to get your hands dirty and help make machines more human, drop your resume and we’ll be in touch.

We are not looking for cultural fits, we are looking for culture creators. Diversity is what drives our success – it’s at the core of how we hire, communicate, and work. We are inclusive to all and combine our diverse backgrounds, skill sets, and perspectives to build the best experiences for our clients.

Top Skills

Large-Scale Data Processing Tools
Python
SQL
