Senior Machine Learning Engineer

Troveo · San Francisco, CA · Hybrid · Active · Ashby

Job facts

Field	Value
Company	Troveo
Title	Senior Machine Learning Engineer
Normalized title	-
Department / team	Engineering / Engineering
Location	San Francisco, CA, United States
Work model	Hybrid / Hybrid
Employment type	Full Time
Salary	-
Status	active
ATS provider	Ashby
Posted / first seen	— / 2026-05-29
Changed / last seen	2026-05-29 / 2026-06-06

Related slices

Page	What it contains
Company jobs	Active postings from Troveo.
Company breakdowns	Role, location, ATS, and work model facets for this company.
ATS provider jobs	Active postings observed through Ashby.
Provider filtered search	The same provider as a filtered job collection.
City jobs	Active postings in San Francisco.
Department jobs	Active postings in Engineering.
Work model jobs	Active Hybrid postings.
Lifecycle events	Open, update, close, and reopen events for this posting.
Original posting	Canonical source or apply URL captured from the ATS.

Linked records

Company	Troveo
Source	93c186f0-3c42-4b57-837c-455d4bd3a89c
ATS provider	Ashby

Description

About Troveo

Troveo is building the next-generation data platform to train AI video models. Troveo offers the world’s largest library of AI video training data, featuring millions of hours of licensed video content. Our end-to-end data pipeline connects creators, rights holders, and AI research labs, enabling scalable, compliant, and innovative uses of video for AI application and model development. We are an early-stage, high-growth venture backed by forward-thinking investors, and we are seeking an innovative, strategic engineer to help us scale.

Role Overview

The Senior Machine Learning Engineer will play a central role in designing, building, and optimizing large-scale machine learning pipelines for AI video model training. You’ll work across the full ML lifecycle, from structuring massive datasets to training, evaluating, and deploying models in production. This is a hands-on, high-impact role for an engineer who thrives on scale, autonomy, and cross-functional collaboration. You will combine deep technical expertise with strong communication and business acumen, translating model behavior into measurable costs, performance targets, and real-world outcomes.

Key Responsibilities: Data Curation & Indexing Pipelines

Architect and implement large-scale pipelines for video ingestion, metadata extraction, and indexing using vector databases and embedding models to enable fast, semantic retrieval.
Design annotation workflows integrating active learning, weak supervision, and human-in-the-loop systems to curate high-quality labeled datasets for video models.
Contribute to optimizing data partitioning, sharding, and caching strategies to handle petabyte-scale video corpora, ensuring low-latency search and robust data lineage.

Key Responsibilities: Model Training & Evaluation

Develop and fine-tune multimodal models (e.g., CLIP variants, transformer-based encoders) for video embeddings, scene segmentation, and relevance ranking using PyTorch and Hugging Face.
Build evaluation frameworks with metrics like NDCG, mAP, and annotation consistency scores to iteratively improve search accuracy and annotation efficiency.
Deploy models via containerized services with A/B testing and monitoring for drift detection in production search and annotation pipelines.
Collaborate with Product and Operations teams to translate ML performance into business insights and cost implications.

Key Responsibilities: Infrastructure & Optimization

Scale ML infrastructure on AWS, leveraging multi-GPU clusters and distributed training to accelerate embedding computation and indexing jobs.
Implement testing and deployment processes across large distributed systems.
Fine-tune OSS models. Working knowledge of training large models is a plus.
Implement automated CI/CD for model versioning, hyperparameter tuning, and resource orchestration to minimize compute costs and maximize GPU utilization.
Profile and tune systems for bottlenecks in vector similarity search, batch annotation, and real-time querying.

Key Responsibilities: Cross-Functional Collaboration

Partner with product, research, and data teams to align ML outputs with business KPIs, such as search latency targets and annotation throughput.
Translate technical trade-offs (e.g., recall vs. precision in embeddings) into actionable insights for stakeholders, fostering adoption in video discovery features.
Work closely with data engineers, research scientists, and product teams to align model performance with strategic business goals.
Communicate technical concepts clearly to both technical and non-technical stakeholders.
Take ownership of project outcomes in a fast-paced startup environment.

Qualifications & Experience

6+ years in ML engineering, with a focus on information retrieval, embedding systems, or data annotation pipelines.
Proven track record building scalable indexing and search infrastructure, including vector stores and similarity search algorithms.
Expertise in Python and PyTorch for core model development; hands-on experience with Hugging Face Transformers for multimodal embeddings and fine-tuning.
Working experience with video, computer vision, and multimodal LLMs.
Hands-on experience deploying models in production environments and measuring model accuracy.
Proficiency in MLOps tools (e.g., MLflow, Weights & Biases) for experimentation, versioning, and deployment.
Hands-on experience with production ML deployment, evaluation metrics for retrieval/annotation tasks, and cost-optimized scaling on cloud platforms like AWS.
Strong analytical skills for dissecting performance in large distributed systems; familiarity with multi-GPU training and vector databases preferred.
Excellent communication skills to bridge technical depth with strategic priorities in collaborative settings.

Nice to Have

Prior experience training video models or working with video-based datasets.
Demonstrated expertise in GPU optimization and large-scale compute performance tuning.
A blend of startup agility and big tech rigor.
Contributions to open source development and projects.
Experience working with search ranking algorithms.

Location & Compensation

Location: Strong preference for candidates based in the San Francisco Bay Area.
Compensation: $200,000 – $400,000 base salary + equity.

Why Join Troveo?

Work at the cutting edge of AI, video, and large-scale data infrastructure.
Build systems that directly power the next generation of AI video models.
Collaborate with a world-class team of engineers, researchers, and industry experts.
High autonomy, high impact: your work will shape the foundation of our platform.
Competitive compensation with meaningful equity upside.

Full job record

Job ID	fee8fa150535a8b8c439ae166e3c3b71d88a936a
Org ID	68f50b99-fa05-4850-801e-5d8d61cd5c4f
Source ID	93c186f0-3c42-4b57-837c-455d4bd3a89c
Board ID	93c186f0-3c42-4b57-837c-455d4bd3a89c
Provider	ashby
Provider Job Key	2ca418dd-b8cb-465b-8d0a-7d4f12efbb27
Title	Senior Machine Learning Engineer
Normalized Title	—
Status	active
Active	yes
Location Text	San Francisco, CA
Department	Engineering
Team	Engineering
Employment Type	full_time
Workplace Type	hybrid
Remote Policy	hybrid
Country	United States
Region	CA
City	San Francisco
Salary Raw	—
Salary Min	—
Salary Max	—
Salary Currency	—
Salary Period	—
Source URL	https://jobs.ashbyhq.com/troveo/2ca418dd-b8cb-465b-8d0a-7d4f12efbb27
Apply URL	https://jobs.ashbyhq.com/troveo/2ca418dd-b8cb-465b-8d0a-7d4f12efbb27/application
First Seen At	2026-05-29 06:11:56Z
Last Seen At	2026-06-06 09:22:08Z
Last Checked At	2026-06-06 09:22:08Z
Last Changed At	2026-05-29 06:11:56Z
Inactive At	—
Source Posted At	—
Source Updated At	—
Raw Payload Uri	s3://job-postings-prod-raw-590183727216/raw/provider=ashby/board=troveo/date=2026-06-06/2026-06-06T09-22-01-490Z-ea178dc5048a10a6ebdcd1876cc1cea57bc9e62c80af4dee43f05e4c234d145f.json

Event Fields

{
  "content_hash": "6cd1100d18bbae85fc33822c44c480981413c4d574ed9e6b37dc320854c9bd09",
  "source_hash": "a5c4e429652400a821ecbd4e8da092aef557aadc901c6e9811891bb1aae472a5",
  "last_changed_at": "2026-05-29T06:11:56.690Z",
  "active_status": "active"
}

Parsed Structured

{
  "language": "en",
  "location": {
    "raw": "San Francisco, CA",
    "city": "San Francisco",
    "region": "CA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.9
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-06T09:22:08.539Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "San Francisco, CA",
      "city": "San Francisco",
      "region": "CA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.9
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": "hybrid",
  "salary_period": null,
  "workplace_type": "hybrid",
  "salary_currency": null
}

Extensions

{}

Native Structured

{
  "id": "2ca418dd-b8cb-465b-8d0a-7d4f12efbb27",
  "team": "Engineering",
  "title": "Senior Machine Learning Engineer",
  "jobUrl": "https://jobs.ashbyhq.com/troveo/2ca418dd-b8cb-465b-8d0a-7d4f12efbb27",
  "address": null,
  "applyUrl": "https://jobs.ashbyhq.com/troveo/2ca418dd-b8cb-465b-8d0a-7d4f12efbb27/application",
  "isListed": true,
  "isRemote": false,
  "location": "San Francisco, CA",
  "updatedAt": null,
  "apiVersion": "ashby-non-user-graphql-v1",
  "department": "Engineering",
  "publishedAt": null,
  "workplaceType": "Hybrid",
  "employmentType": "FullTime",
  "secondaryLocations": []
}

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/fee8fa150535a8b8c439ae166e3c3b71d88a936a?include=description

GET https://api.bluedoor.sh/job-postings/v1/orgs/68f50b99-fa05-4850-801e-5d8d61cd5c4f

GET https://api.bluedoor.sh/job-postings/v1/sources/93c186f0-3c42-4b57-837c-455d4bd3a89c

GET https://api.bluedoor.sh/job-postings/v1/jobs/fee8fa150535a8b8c439ae166e3c3b71d88a936a/events
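
The four requests above could be issued from a script along the following lines. This is a minimal sketch using only the Python standard library. The base URL, paths, IDs, and the `include=description` parameter come from the requests listed above; the helper names `build_urls` and `fetch_json` are hypothetical, and the `Authorization: Bearer` header is an assumption, since this page does not say how the API key is passed. Check the API docs before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://api.bluedoor.sh/job-postings/v1"

# IDs copied from the "Full job record" table on this page.
JOB_ID = "fee8fa150535a8b8c439ae166e3c3b71d88a936a"
ORG_ID = "68f50b99-fa05-4850-801e-5d8d61cd5c4f"
SOURCE_ID = "93c186f0-3c42-4b57-837c-455d4bd3a89c"


def build_urls(job_id, org_id, source_id):
    """Return the four request URLs for a posting (hypothetical helper)."""
    query = urlencode({"include": "description"})
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?{query}",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


def fetch_json(url, api_key=None, timeout=10):
    """GET a URL and decode the JSON body (hypothetical helper)."""
    headers = {"Accept": "application/json"}
    if api_key:
        # ASSUMPTION: bearer-token auth. The real header name and scheme
        # are not stated on this page.
        headers["Authorization"] = f"Bearer {api_key}"
    with urlopen(Request(url, headers=headers), timeout=timeout) as resp:
        return json.load(resp)


urls = build_urls(JOB_ID, ORG_ID, SOURCE_ID)
```

Calling `fetch_json(urls["job"], api_key="...")` would then request the job record with its description; the shape of the response is not shown on this page, so the sketch returns the decoded JSON without interpreting it.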
