Member of Technical Staff — RL Research
Nuance Labs · Seattle, Washington · Active · $300,000–$400,000 / year · Greenhouse
Job facts
| Field | Value |
|---|---|
| Company | Nuance Labs |
| Title | Member of Technical Staff — RL Research |
| Normalized title | - |
| Department / team | Research |
| Location | Seattle, WA, United States |
| Work model | - |
| Employment type | - |
| Salary | $300,000–$400,000 / year |
| Status | active |
| ATS provider | Greenhouse |
| Posted / first seen | 2026-06-05 / 2026-06-06 |
| Changed / last seen | 2026-06-06 / 2026-06-06 |
Linked records
| Field | Value |
|---|---|
| Company | Nuance Labs |
| Source | 4d06c175-4ee5-4cda-ad2e-cc1de78b9519 |
| ATS provider | Greenhouse |
Description
About Nuance Labs
Nuance Labs is building photorealistic, real-time AI avatars with emotional intelligence: a full-duplex audiovisual system that can listen, speak, react, interrupt, and respond like a real person.
We're a Series A company ($60M raised) backed by Lightspeed, Accel, South Park Commons, NVentures, and Define Ventures, with PhDs from MIT, UW, Oxford, CMU, and Johns Hopkins, and industry experience from Apple, Meta, Amazon AGI, and Discord. The team is small, the work is real, and the problems are unsolved.
How Nuance Differentiates
Most conversational AI avatars today are hacks: a face slapped on a speech-to-speech pipeline, stuck in the uncanny valley and emotionless, mechanical, one-turn-at-a-time. Current systems take 2–5 seconds to respond; natural conversation requires sub-500ms. That's a 4x to 10x reduction in latency, and it demands rethinking the entire stack.
That rethinking starts with full-duplex: an AI that listens and speaks simultaneously, perceives emotion in real time, and responds with a face that actually reflects it. It's an extremely hard problem, and we're developing foundation models designed for it from the ground up.
About the Role
We’re looking for a deeply technical Member of Technical Staff to own RL and post-training for large-scale omni models.
This role is broader than a traditional RL algorithm role. You will be expected to understand modern post-training methods and build the infrastructure needed to run them at scale. The work spans RL method development, rollout generation, reward modeling, policy optimization, evaluation, data feedback loops, serving, observability, and distributed execution.
You will build Nuance’s RL/post-training stack from 0→1 and scale it from 1→10. That means turning rapidly evolving research ideas into reliable training systems: defining the abstractions, choosing or modifying frameworks, wiring together rollout workers and trainers, building reward/evaluation loops, debugging failure modes, and making the system fast enough for researchers to iterate.
For Nuance, post-training is not limited to text. Our models are omni from the ground up: audio, video, language, and real-time full-duplex interaction. We need RL and post-training methods that improve interactive behavior, timing, interruption, emotional response, audiovisual coherence, and real-time conversational quality.
This is a high-ownership role with direct impact on how Nuance models improve after pretraining.
What You’ll Own
Build Nuance’s RL/post-training stack from 0→1: rollout generation, policy optimization, reward/reference model serving, data feedback loops, evaluation, checkpointing, observability, and debugging.
Develop and scale post-training methods such as PPO, GRPO, DPO, rejection sampling, RLHF/RLAIF, online RL, and model-based data improvement.
Design the systems abstractions that connect research ideas to production-scale RL runs: trainers, rollout workers, reward models, evaluators, data queues, experience buffers, and checkpoint promotion.
Build evaluation and feedback loops for omni behavior: turn-taking, interruption, timing, emotional response, audiovisual coherence, instruction following, and real-time interaction quality.
Optimize the end-to-end post-training loop across rollout throughput, serving latency, GPU utilization, policy update efficiency, queueing, checkpoint overhead, and research iteration speed.
Evolve the platform as algorithms, model architectures, reward definitions, data sources, and evaluation methods change.
What We’re Looking For
Hands-on experience with RL, RLHF, RLAIF, post-training, alignment, or large-scale fine-tuning for modern foundation models.
Strong understanding of RL/post-training methods: policy optimization, reward modeling, preference optimization, rejection sampling, KL control, evaluation, and data feedback loops.
Ability to reason about model behavior and training dynamics: reward hacking, unstable rewards, distribution shift, stale policies, mode collapse, over-optimization, noisy preferences, and evaluation mismatch.
Practical experience building or operating RL/post-training pipelines with frameworks such as verl, ms-swift, OpenRLHF, or equivalent internal systems, including integration with rollout serving systems such as vLLM.
Experience with large-scale training or inference systems, including rollout generation, model serving, batching, queueing, GPU utilization, checkpointing, and debugging.
Understanding of omni post-training for real-time audio-video-language interaction: temporal alignment, interruption, emotional response, and multimodal evaluation.
Strong software engineering fundamentals, curiosity, and adaptability to new RL algorithms, model architectures, serving systems, evaluation methods, and research ideas.
Bonus Points
Prior 0→1 experience building post-training systems, RL pipelines, agent training systems, evaluation platforms, or large-scale model improvement loops.
Experience with PPO, GRPO, DPO, online RL, RLHF/RLAIF, reward modeling, preference data, synthetic data generation, or model-based data improvement.
Experience with omni or multimodal post-training for audio-video-language models, especially long-context or real-time interactive systems.
Experience scaling mixed training/inference workloads across large GPU clusters.
Experience with adjacent areas such as distributed pretraining, data infrastructure, inference serving, simulation, human/AI feedback collection, or evaluation infrastructure.
Publications or substantial open-source contributions in RL, post-training, alignment, evaluation, ML systems, or model behavior.
Compensation
$300,000 – $400,000 base salary, plus meaningful equity. We think long-term ownership matters and structure equity accordingly.
Logistics
Location: In-person in Seattle, 5 days a week — we believe in the compounding value of working shoulder-to-shoulder
Health: HSA plan with ~$2,000 in company contributions — about 2x what most big tech companies offer
PTO: 15 days + public holidays, and we close for a full week over the holidays
Lunch, beverages, and snacks: On us, every workday — the kind of thing that makes you actually look forward to the workday
Commuter benefits
401(k): In the works
Nuance Labs is an equal opportunity employer. We believe diverse teams build better AI.
Full job record
| Field | Value |
|---|---|
| Job ID | 9cf80643fab9792fa8dfca9f7a60e62d96c42fa0 |
| Org ID | b5cad4e8-d3e2-423b-934c-3898f78ddee7 |
| Source ID | 4d06c175-4ee5-4cda-ad2e-cc1de78b9519 |
| Board ID | 4d06c175-4ee5-4cda-ad2e-cc1de78b9519 |
| Provider | greenhouse |
| Provider Job Key | 4277561009 |
| Title | Member of Technical Staff — RL Research |
| Normalized Title | — |
| Status | active |
| Active | yes |
| Location Text | Seattle, Washington |
| Department | Research |
| Team | — |
| Employment Type | — |
| Workplace Type | — |
| Remote Policy | — |
| Country | United States |
| Region | WA |
| City | Seattle |
| Salary Raw | Compensation $300,000 – $400,000 base salary, plus meaningful equity |
| Salary Min | 300,000 |
| Salary Max | 400,000 |
| Salary Currency | USD |
| Salary Period | year |
| Source URL | https://job-boards.greenhouse.io/nuancelabs/jobs/4277561009 |
| Apply URL | https://job-boards.greenhouse.io/nuancelabs/jobs/4277561009 |
| First Seen At | 2026-06-06 07:33:06Z |
| Last Seen At | 2026-06-06 20:12:12Z |
| Last Checked At | 2026-06-06 20:12:12Z |
| Last Changed At | 2026-06-06 07:33:06Z |
| Inactive At | — |
| Source Posted At | 2026-06-05 21:13:11Z |
| Source Updated At | 2026-06-05 21:13:11Z |
| Raw Payload Uri | s3://job-postings-prod-raw-590183727216/raw/provider=greenhouse/board=nuancelabs/date=2026-06-06/2026-06-06T20-12-12-157Z-ffe23d7407e1e5ada742398a50f3dfa98b1687e7d7d3baed3187df60c4819845.json |
Event Fields
{
"content_hash": "7f52ab3800af1098a8ed64fcb249587ca6c86897942e39d3306bacc6b0199940",
"source_hash": "258ad63baabad8fce673941116e7791b4f91396b746328a572e64d806a8b0e01",
"last_changed_at": "2026-06-06T07:33:06.309Z",
"active_status": "active"
}
Parsed Structured
{
"language": "en",
"location": {
"raw": "Seattle, Washington",
"city": "Seattle",
"region": "WA",
"country": "United States",
"is_remote": false,
"confidence": 0.85
},
"salary_max": 400000,
"salary_min": 300000,
"inferred_at": "2026-06-06T20:12:12.229Z",
"launch_scope": {
"reason": "english_us_canada",
"included": true,
"language": "en",
"location": {
"raw": "Seattle, Washington",
"city": "Seattle",
"region": "WA",
"country": "United States",
"is_remote": false,
"confidence": 0.85
},
"countries": [
"United States"
]
},
"remote_policy": null,
"salary_period": "year",
"workplace_type": null,
"salary_currency": "USD"
}
Extensions
{}
Native Structured
{
"title": "Member of Technical Staff — RL Research",
"offices": [
{
"id": 4030799009,
"name": "Seattle",
"location": null,
"child_ids": [],
"parent_id": null
}
],
"language": "en",
"location": {
"name": "Seattle, Washington"
},
"metadata": [],
"updated_at": "2026-06-05T17:13:11-04:00",
"departments": [
{
"id": 4031247009,
"name": "Research",
"child_ids": [],
"parent_id": null
}
],
"company_name": "Nuance Labs",
"requisition_id": 4162923009,
"first_published": "2026-06-05T17:13:11-04:00",
"application_deadline": null
}
Get this page with API
Rendered from the bluedoor Job Postings API. Reproduce it:
GET https://api.bluedoor.sh/job-postings/v1/jobs/9cf80643fab9792fa8dfca9f7a60e62d96c42fa0?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/b5cad4e8-d3e2-423b-934c-3898f78ddee7
GET https://api.bluedoor.sh/job-postings/v1/sources/4d06c175-4ee5-4cda-ad2e-cc1de78b9519
GET https://api.bluedoor.sh/job-postings/v1/jobs/9cf80643fab9792fa8dfca9f7a60e62d96c42fa0/events
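The requests listed above can be assembled from the Job ID, Org ID, and Source ID in the full job record. The following is a minimal Python sketch using only the standard library. The helper names `endpoint_urls` and `fetch_json` are hypothetical, and the page does not say whether the API requires an API key or other authentication, so `fetch_json` assumes an unauthenticated GET that returns JSON and may need extra headers in practice.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Base path taken from the GET lines on this page.
BASE = "https://api.bluedoor.sh/job-postings/v1"


def endpoint_urls(job_id: str, org_id: str, source_id: str) -> dict:
    """Build the four URLs this page lists for one posting (hypothetical helper)."""
    return {
        "job": f"{BASE}/jobs/{job_id}?" + urlencode({"include": "description"}),
        "org": f"{BASE}/orgs/{org_id}",
        "source": f"{BASE}/sources/{source_id}",
        "events": f"{BASE}/jobs/{job_id}/events",
    }


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the JSON body.

    Assumption: no authentication is needed; add headers here if the API requires them.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# IDs copied from the "Full job record" table.
urls = endpoint_urls(
    job_id="9cf80643fab9792fa8dfca9f7a60e62d96c42fa0",
    org_id="b5cad4e8-d3e2-423b-934c-3898f78ddee7",
    source_id="4d06c175-4ee5-4cda-ad2e-cc1de78b9519",
)
```

Calling `fetch_json(urls["events"])` would then request the lifecycle events for this posting, provided the endpoint is reachable without credentials.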