Zyte

Machine Learning Engineer - Web Data Quality

Posted 24 Days Ago
In-Office or Remote
Hiring Remotely in Rio de Janeiro, Rio, Rio de Janeiro
Mid level
You will design and implement AI systems to assess and improve the quality of web datasets, collaborating with data, product, and engineering teams.
The summary above was generated by AI

At Zyte, we make the world’s web data accessible to everyone. Our technology powers data extraction at scale, helping businesses and researchers unlock the full potential of the web.

We’re a remote-first, multicultural team of engineers, data scientists, and innovators who believe in curiosity, collaboration, and continuous learning. If you’re passionate about building reliable AI systems and improving the quality of web data, we’d love to hear from you.

About the Role

As a Machine Learning Engineer (Web Data Quality), you’ll design and implement intelligent systems that automatically detect, measure, and improve the quality of large-scale web datasets. You’ll work at the intersection of data science, AI, and distributed systems, collaborating closely with product, engineering, and data teams to make data accuracy measurable, scalable, and actionable.


What You’ll Do
  • Develop and deploy ML models for anomaly detection, schema drift, and content validation
  • Build and improve data quality pipelines leveraging modern data and MLOps tools
  • Design and optimize embeddings and GenAI models to enhance data consistency
  • Collaborate with engineers to integrate AI systems into production workflows
  • Conduct experiments, evaluate performance, and iterate for continuous improvement
  • Stay up to date on AI/ML and GenAI research to guide innovation within Zyte
Required
  • 3+ years of experience in Machine Learning / Data Science / AI Engineering
  • Strong Python skills and experience with ML frameworks (PyTorch, TensorFlow, scikit-learn)
  • Experience with data validation, anomaly detection, or data quality systems
  • Familiarity with data pipelines (Airflow, Spark, or similar)
  • Understanding of model evaluation, metrics, and deployment best practices
  • Excellent problem-solving, communication, and collaboration skills
Preferred
  • Experience with LangChain, LlamaIndex, or GenAI model orchestration
  • Familiarity with data labeling tools and active learning approaches
  • Contributions to open-source or public ML projects
  • Experience working in a remote, cross-functional team environment

Benefits
  • 35 days of paid time off
  • Health & wellness support
  • Inclusive and supportive team environment
  • Opportunities to attend conferences and meet team members from across the globe
  • Work with cutting-edge open source technologies and tools

Top Skills

Airflow
Python
PyTorch
Scikit-Learn
Spark
TensorFlow
