Research Program Manager - Model Evals and Safety

Reflectionai · San Francisco · On Site · Active · Ashby

Job facts

Field	Value
Company	Reflectionai
Title	Research Program Manager - Model Evals and Safety
Normalized title	-
Department / team	Engineering / Engineering
Location	San Francisco, CA, United States
Work model	On Site
Employment type	Full Time
Salary	-
Status	active
ATS provider	Ashby
Posted / first seen	— / 2026-05-29
Changed / last seen	2026-05-29 / 2026-06-06

Linked records

Company	Reflectionai
Source	dde42094-6e1e-4abd-8700-6037f9147ed6
ATS provider	Ashby

Description

Our Mission

Reflection's mission is to build open superintelligence and make it accessible to all. We're developing open weight models for individuals, agents, enterprises, and even nation states. Our team of AI researchers and company builders comes from DeepMind, OpenAI, Google Brain, Meta, Character.AI, Anthropic and beyond.

About the Role

Research Program Managers at Reflection are high-leverage leaders and operators who embed directly with research and infrastructure teams to accelerate the pace of frontier model development. They are not project trackers. They are force multipliers who bring clarity to ambiguity, drive decisions when the path forward is unclear, and ensure that the work happening across multiple teams connects into a coherent whole.

This is a foundational role. Reflection is building model evals and safety from the ground up, and this RPM will be at the center of that effort. You won't be stepping into an established function with existing processes and tooling. You will be the person who figures out what this function needs to look like, stands it up, and makes it real. That means defining the evaluation frameworks, building the operational infrastructure for model safety, establishing the processes that connect evals to the model development lifecycle, and laying the groundwork for how Reflection interfaces with the broader safety ecosystem. This is 0-to-1 work in its purest form.

You bring a first-responder mentality. When things go sideways, you don't wait to be asked. You jump in, assess the situation, cut through noise, align the people who need to be aligned, and drive resolution.

What You'll Do

Build the foundational infrastructure for model evals and safety at Reflection. Define the evaluation frameworks, tooling requirements, and operational processes that will underpin how we assess model capabilities, risks, and readiness for release.

Stand up model safety operations as a function, including establishing the workflows, review cadences, and decision frameworks that connect safety evaluation to the model development and release lifecycle.

Partner with research and engineering leads across pre-training, mid-training, and post-training to embed safety and evaluation checkpoints into the development process in a way that is rigorous without being a bottleneck.

Drive the scoping and prioritization of eval science and eval infrastructure investments, working with technical leads to determine what to build in-house, what to adopt, and where to invest research effort.

Establish Reflection's engagement with the external safety ecosystem, including third-party assessments, academic partnerships, and industry safety frameworks. Represent the company's safety posture to external stakeholders with credibility and clarity.

Create visibility and reporting structures that give leadership a clear, honest picture of model safety status, evaluation coverage, and open risks, so they can make informed decisions at the pace the business requires.

Champion a culture of blameless post-mortems and continuous learning, turning every safety-relevant finding into a concrete improvement to our systems and processes.

About You

7+ years of experience in technical program management, research operations, or ML engineering, with demonstrated experience standing up new functions, teams, or programs from scratch.

Familiar with the landscape of model evaluation and AI safety, including evaluation methodologies, red-teaming, alignment research, and the evolving regulatory and industry safety ecosystem. You don't need to be a safety researcher, but you need to understand the space well enough to make sound judgments about what matters and what to prioritize.

Deep enough technically to engage with researchers and engineers on topics like model behavior, evaluation design, data pipelines, and safety-critical system architecture. You follow the technical thread and you know when something doesn't add up.

Proven ability to build structure where none exists. You've taken ambiguous mandates and turned them into functioning programs with clear ownership, measurable outcomes, and durable processes.

Strong stakeholder management skills spanning deeply technical ICs, research leadership, and external partners. You build trust through competence and follow-through.

Excited to build from zero to one. We are a small, fast-moving team and this role will help define how model safety and evaluation work at Reflection.

Motivated by enabling researchers and engineers to build the world's most capable open-weight AI systems, responsibly.

What We Offer

We believe that to build superintelligence that is truly open, you need to start at the foundation. Joining Reflection means building from the ground up as part of a small talent-dense team. You will help define our future as a company, and help define the frontier of open foundational models.

We want you to do the most impactful work of your career with the confidence that you and the people you care about most are supported.

Top-tier compensation: Salary and equity structured to recognize and retain the best talent globally.

Health & wellness: Comprehensive medical, dental, vision, life, and disability insurance.

Life & family: Fully paid parental leave for all new parents, including adoptive and surrogate journeys. Financial support for family planning.

Benefits & balance: Paid time off when you need it, relocation support, and more perks that optimize your time.

Opportunities to connect with teammates: Lunch and dinner are provided daily. We have regular off-sites and team celebrations.

Full job record

Job ID	0fe0df40665e17cc743b60cc38f3d956606995c1
Org ID	83b4dbeb-3efd-46c3-bff0-8d3e2c88f32e
Source ID	dde42094-6e1e-4abd-8700-6037f9147ed6
Board ID	dde42094-6e1e-4abd-8700-6037f9147ed6
Provider	ashby
Provider Job Key	659593ec-6cd2-4ae4-884b-9b31427662c6
Title	Research Program Manager - Model Evals and Safety
Normalized Title	—
Status	active
Active	yes
Location Text	San Francisco
Department	Engineering
Team	Engineering
Employment Type	full_time
Workplace Type	on_site
Remote Policy	—
Country	United States
Region	CA
City	San Francisco
Salary Raw	—
Salary Min	—
Salary Max	—
Salary Currency	—
Salary Period	—
Source URL	https://jobs.ashbyhq.com/reflectionai/659593ec-6cd2-4ae4-884b-9b31427662c6
Apply URL	https://jobs.ashbyhq.com/reflectionai/659593ec-6cd2-4ae4-884b-9b31427662c6/application
First Seen At	2026-05-29 07:09:54Z
Last Seen At	2026-06-06 09:36:38Z
Last Checked At	2026-06-06 09:36:38Z
Last Changed At	2026-05-29 07:09:54Z
Inactive At	—
Source Posted At	—
Source Updated At	—
Raw Payload Uri	s3://job-postings-prod-raw-590183727216/raw/provider=ashby/board=reflectionai/date=2026-06-06/2026-06-06T09-36-12-388Z-e87e15cfc907f71a6e75e1369d4f9b614f0e9bd012cb8bc4c91c0d043d46bf37.json
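
The Raw Payload Uri above appears to use key=value path segments (provider, board, date) ahead of the file name. A minimal sketch of pulling those parts out of the URI, assuming that layout holds for other records; the function name `parse_raw_payload_uri` and the returned dictionary keys are hypothetical and not part of the record:

```python
from urllib.parse import urlparse


def parse_raw_payload_uri(uri: str) -> dict:
    """Split an s3:// URI into bucket, key, key=value partitions, and file name.

    Assumption: directory segments of the form name=value are partitions,
    as they appear to be in the record above.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// URI, got {uri!r}")
    key = parsed.path.lstrip("/")
    segments = key.split("/")
    partitions = {}
    # Every segment except the last (the file name) may be a partition.
    for segment in segments[:-1]:
        if "=" in segment:
            name, value = segment.split("=", 1)
            partitions[name] = value
    return {
        "bucket": parsed.netloc,
        "key": key,
        "partitions": partitions,
        "filename": segments[-1],
    }


uri = (
    "s3://job-postings-prod-raw-590183727216/raw/provider=ashby/"
    "board=reflectionai/date=2026-06-06/"
    "2026-06-06T09-36-12-388Z-"
    "e87e15cfc907f71a6e75e1369d4f9b614f0e9bd012cb8bc4c91c0d043d46bf37.json"
)
info = parse_raw_payload_uri(uri)
print(info["partitions"])
```

For this record the partitions come out as provider ashby, board reflectionai, and date 2026-06-06, matching the Provider field and the date portion of Last Seen At.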

Event Fields

{
  "content_hash": "9a3efbbd2d9def8f3533e4a041223d9d79281dc9f4d71e13830605d080adb27e",
  "source_hash": "b7fdb35667b1976c0322e0843f9186124d99a468ad095b63f06f7408d12b3152",
  "last_changed_at": "2026-05-29T07:09:54.591Z",
  "active_status": "active"
}

Parsed Structured

{
  "language": "en",
  "location": {
    "raw": "San Francisco",
    "city": "San Francisco",
    "region": "CA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.75
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-06T09:36:38.516Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "San Francisco",
      "city": "San Francisco",
      "region": "CA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.75
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": null,
  "salary_period": null,
  "workplace_type": "on_site",
  "salary_currency": null
}

Extensions

{}

Native Structured

{
  "id": "659593ec-6cd2-4ae4-884b-9b31427662c6",
  "team": "Engineering",
  "title": "Research Program Manager - Model Evals and Safety",
  "jobUrl": "https://jobs.ashbyhq.com/reflectionai/659593ec-6cd2-4ae4-884b-9b31427662c6",
  "address": null,
  "applyUrl": "https://jobs.ashbyhq.com/reflectionai/659593ec-6cd2-4ae4-884b-9b31427662c6/application",
  "isListed": true,
  "isRemote": false,
  "location": "San Francisco",
  "updatedAt": null,
  "apiVersion": "ashby-non-user-graphql-v1",
  "department": "Engineering",
  "publishedAt": null,
  "workplaceType": "OnSite",
  "employmentType": "FullTime",
  "secondaryLocations": []
}
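
The Native Structured payload uses the provider's spellings (workplaceType "OnSite", employmentType "FullTime"), while the Full job record shows on_site and full_time. A minimal sketch of one way that mapping could be done, assuming a plain CamelCase to snake_case conversion; the helper names and the output keys are hypothetical, and the actual normalization logic is not described in this record:

```python
import re


def camel_to_snake(value: str) -> str:
    """Convert e.g. 'OnSite' to 'on_site' by splitting before inner capitals."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", value).lower()


def normalize_native(native: dict) -> dict:
    """Map a native Ashby-style payload onto snake_case record fields.

    The output keys are illustrative guesses based on the field labels
    in the Full job record, not a documented schema.
    """
    workplace = native.get("workplaceType")
    employment = native.get("employmentType")
    return {
        "provider_job_key": native["id"],
        "title": native["title"],
        "department": native.get("department"),
        "team": native.get("team"),
        "location_text": native.get("location"),
        "workplace_type": camel_to_snake(workplace) if workplace else None,
        "employment_type": camel_to_snake(employment) if employment else None,
        "source_url": native.get("jobUrl"),
        "apply_url": native.get("applyUrl"),
    }


native = {
    "id": "659593ec-6cd2-4ae4-884b-9b31427662c6",
    "team": "Engineering",
    "title": "Research Program Manager - Model Evals and Safety",
    "jobUrl": "https://jobs.ashbyhq.com/reflectionai/659593ec-6cd2-4ae4-884b-9b31427662c6",
    "applyUrl": "https://jobs.ashbyhq.com/reflectionai/659593ec-6cd2-4ae4-884b-9b31427662c6/application",
    "location": "San Francisco",
    "department": "Engineering",
    "workplaceType": "OnSite",
    "employmentType": "FullTime",
}
normalized = normalize_native(native)
print(normalized["workplace_type"], normalized["employment_type"])
```

A lookup table of known provider values would be a stricter alternative if unexpected spellings should be rejected rather than converted.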

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/0fe0df40665e17cc743b60cc38f3d956606995c1?include=description

GET https://api.bluedoor.sh/job-postings/v1/orgs/83b4dbeb-3efd-46c3-bff0-8d3e2c88f32e

GET https://api.bluedoor.sh/job-postings/v1/sources/dde42094-6e1e-4abd-8700-6037f9147ed6

GET https://api.bluedoor.sh/job-postings/v1/jobs/0fe0df40665e17cc743b60cc38f3d956606995c1/events
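
The four requests above follow a regular pattern over the Job ID, Org ID, and Source ID from the Full job record. A minimal sketch that builds those URLs from the IDs; the function names are hypothetical, and since this record does not say how an API key is passed, the sketch stops at URL construction and sends no request:

```python
from urllib.parse import urlencode

BASE_URL = "https://api.bluedoor.sh/job-postings/v1"


def job_url(job_id, include=None):
    """URL for a single job; `include` mirrors the ?include= parameter above."""
    url = f"{BASE_URL}/jobs/{job_id}"
    if include:
        url += "?" + urlencode({"include": include})
    return url


def job_events_url(job_id):
    """URL for the lifecycle events of a job."""
    return f"{BASE_URL}/jobs/{job_id}/events"


def org_url(org_id):
    """URL for the organization record."""
    return f"{BASE_URL}/orgs/{org_id}"


def source_url(source_id):
    """URL for the source (board) record."""
    return f"{BASE_URL}/sources/{source_id}"


JOB_ID = "0fe0df40665e17cc743b60cc38f3d956606995c1"
ORG_ID = "83b4dbeb-3efd-46c3-bff0-8d3e2c88f32e"
SOURCE_ID = "dde42094-6e1e-4abd-8700-6037f9147ed6"

urls = [
    job_url(JOB_ID, include="description"),
    org_url(ORG_ID),
    source_url(SOURCE_ID),
    job_events_url(JOB_ID),
]
for url in urls:
    print("GET", url)
```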