ML Model Serving Engineer

Sesame · San Francisco · On Site · Active · Ashby

Job facts

Field	Value
Company	Sesame
Title	ML Model Serving Engineer
Normalized title	-
Department / team	Software / Software
Location	San Francisco, CA, United States
Work model	On Site
Employment type	Full Time
Salary	-
Status	active
ATS provider	Ashby
Posted / first seen	— / 2026-05-29
Changed / last seen	2026-05-29 / 2026-06-20

Related slices

Page	What it contains
Company jobs	Active postings from Sesame.
Company breakdowns	Role, location, ATS, and work model facets for this company.
ATS provider jobs	Active postings observed through Ashby.
Provider filtered search	The same provider as a filtered job collection.
City jobs	Active postings in San Francisco.
Department jobs	Active postings in Software.
Work model jobs	Active On Site postings.
Lifecycle events	Open, update, close, and reopen events for this posting.
Original posting	Canonical source or apply URL captured from the ATS.

Linked records

Company	Sesame
Source	fadca3b3-d0b9-42cc-9362-5135e2ab8a21
ATS provider	Ashby

Description

About Sesame

Sesame believes in a future where computers are lifelike, with the ability to see, hear, and collaborate with us in ways that feel natural and human. With this vision, we're designing a new kind of computer, focused on making voice agents part of our daily lives. Our team brings together founders from Oculus and Ubiquity6, alongside proven leaders from Meta, Google, and Apple, with deep expertise spanning hardware and software. Join us in shaping a future where computers truly come alive.

Responsibilities:

- Turbocharge our serving layer, consisting of a variety of LLM, speech, and vision models.
- Partner with ML infrastructure and training engineers to build a fast, cost-effective, accurate, and reliable serving layer to power a new consumer product category.
- Modify and extend LLM serving frameworks like vLLM and SGLang to take advantage of the latest techniques in high-performance model serving.
- Work with the training team to identify opportunities to produce faster models without sacrificing quality.
- Use techniques like in-flight batching, caching, and custom kernels to speed up inference.
- Find ways to reduce model initialization times without sacrificing quality.

Required Qualifications:

- Expert in some differentiable array computing framework, preferably PyTorch.
- Expert in optimizing machine learning models for serving reliably at high throughput, with low latency.
- Significant systems programming experience, e.g. experience working on high-performance server systems. You’d be just as comfortable with the internals of vLLM as you would with a complex PyTorch codebase.
- Significant performance engineering experience, e.g. bottleneck analysis in high-scale server systems or profiling low-level systems code.
- Always up to date on the latest techniques for model serving optimization.

Preferred Qualifications:

- Familiarity with high-performance LLM serving, e.g. experience with vLLM and SGLang deployment and internals.
- Experience with a public cloud platform such as GCP, AWS, or Azure.
- Experience deploying and scaling inference workloads in the cloud using Kubernetes, Ray, etc.
- You like to ship and have a track record of leading complex multi-month projects without assistance.
- You’re excited to learn new things and work in a multitude of roles.

Sesame is committed to a workplace where everyone feels valued, respected, and empowered. We welcome all qualified applicants, embracing diversity in race, gender, identity, orientation, ability, and more. We provide reasonable accommodations for applicants with disabilities. Contact [email protected] for assistance.

Full-time Employee Benefits:

- 401(k) max employer match: 3.5% of compensation
- 100% employer-paid health, vision, and dental benefits for you and your dependents
- Unlimited PTO and sick time
- Flexible spending account with employer matching up to $1,650/year (medical FSA)
- Guardian Employee Assistance Program (EAP)
- Opportunity to share in the company's success with competitive stock options

Benefits do not apply to contingent/contract workers.

Full job record

Job ID	3b60dad745d184d81b7855341a02a0ad7201a838
Org ID	7ba82021-65aa-45a8-82cb-c064fc1f5f6a
Source ID	fadca3b3-d0b9-42cc-9362-5135e2ab8a21
Board ID	fadca3b3-d0b9-42cc-9362-5135e2ab8a21
Provider	ashby
Provider Job Key	35793528-2b5c-47b3-9422-eaced2b69f63
Title	ML Model Serving Engineer
Normalized Title	—
Status	active
Active	yes
Location Text	San Francisco
Department	Software
Team	Software
Employment Type	full_time
Workplace Type	on_site
Remote Policy	—
Country	United States
Region	CA
City	San Francisco
Salary Raw	—
Salary Min	—
Salary Max	—
Salary Currency	—
Salary Period	—
Source URL	https://jobs.ashbyhq.com/sesame/35793528-2b5c-47b3-9422-eaced2b69f63
Apply URL	https://jobs.ashbyhq.com/sesame/35793528-2b5c-47b3-9422-eaced2b69f63/application
First Seen At	2026-05-29 07:10:34Z
Last Seen At	2026-06-20 09:58:01Z
Last Checked At	2026-06-20 09:58:01Z
Last Changed At	2026-05-29 07:10:34Z
Inactive At	—
Source Posted At	—
Source Updated At	—
Raw Payload Uri	s3://job-postings-prod-raw-590183727216/raw/provider=ashby/board=sesame/date=2026-06-20/2026-06-20T09-57-44-137Z-07f0b132d53111c9f93fbe96111558c980c6f3f9df07709a72668e96183372de.json
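
The Raw Payload Uri above appears to follow a key=value partition layout (provider, board, date). As a minimal sketch, assuming that layout holds for other records too, the URI can be split into bucket, object key, and partition fields with the standard library. The function name `parse_raw_payload_uri` is hypothetical and not part of any documented API.

```python
from urllib.parse import urlparse


def parse_raw_payload_uri(uri: str) -> dict:
    """Split an s3:// URI into bucket, key, and key=value path partitions.

    ASSUMPTION: path segments containing "=" are partition fields, as they
    appear to be in the record above.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// URI, got {uri!r}")
    key = parsed.path.lstrip("/")
    partitions = {}
    for segment in key.split("/"):
        if "=" in segment:
            name, value = segment.split("=", 1)
            partitions[name] = value
    return {"bucket": parsed.netloc, "key": key, "partitions": partitions}


RAW_PAYLOAD_URI = (
    "s3://job-postings-prod-raw-590183727216/raw/provider=ashby/board=sesame/"
    "date=2026-06-20/2026-06-20T09-57-44-137Z-"
    "07f0b132d53111c9f93fbe96111558c980c6f3f9df07709a72668e96183372de.json"
)
PARSED_URI = parse_raw_payload_uri(RAW_PAYLOAD_URI)
```

Here `PARSED_URI["partitions"]` holds the provider, board, and date taken from the path; the timestamped file name has no "=" and is left in the key only.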

Event Fields

{
  "content_hash": "5efd5b0e70ded3453d84832931116df617a0366e8b83f02a05bc9163867b37b9",
  "source_hash": "d6eb0059a9d54d94f8102135e518b431770497acf9391c66e3c79d35e6cd24de",
  "last_changed_at": "2026-05-29T07:10:34.885Z",
  "active_status": "active"
}

Parsed Structured

{
  "dedupe": null,
  "language": "en",
  "location": {
    "raw": "San Francisco",
    "city": "San Francisco",
    "region": "CA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.75
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-20T09:58:01.367Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "San Francisco",
      "city": "San Francisco",
      "region": "CA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.75
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": null,
  "salary_period": null,
  "workplace_type": "on_site",
  "salary_currency": null
}

Extensions

{}

Native Structured

{
  "id": "35793528-2b5c-47b3-9422-eaced2b69f63",
  "team": "Software",
  "title": "ML Model Serving Engineer",
  "jobUrl": "https://jobs.ashbyhq.com/sesame/35793528-2b5c-47b3-9422-eaced2b69f63",
  "address": null,
  "applyUrl": "https://jobs.ashbyhq.com/sesame/35793528-2b5c-47b3-9422-eaced2b69f63/application",
  "isListed": true,
  "isRemote": false,
  "location": "San Francisco",
  "updatedAt": null,
  "apiVersion": "ashby-non-user-graphql-v1",
  "department": "Software",
  "publishedAt": null,
  "workplaceType": "OnSite",
  "employmentType": "FullTime",
  "secondaryLocations": [
    {
      "location": "New York "
    },
    {
      "location": "Bellevue"
    }
  ]
}
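
The native Ashby payload uses values such as "OnSite" and "FullTime", while the job record above shows "on_site" and "full_time". The sketch below illustrates one way such a mapping could be done; it is an assumption about the shape of the transformation, not the service's actual normalization code, and the helper names `to_snake_case` and `normalize_native` are hypothetical. It also trims whitespace from secondary locations, since "New York " carries a trailing space in the payload.

```python
import re


def to_snake_case(value: str) -> str:
    """Convert a CamelCase enum value such as "OnSite" to "on_site"."""
    # Insert an underscore at each lower-to-upper boundary, then lowercase.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", value).lower()


def normalize_native(native: dict) -> dict:
    """Pick a few fields from the native payload and normalize them."""
    return {
        "workplace_type": to_snake_case(native["workplaceType"]),
        "employment_type": to_snake_case(native["employmentType"]),
        # Strip stray whitespace from each secondary location.
        "secondary_locations": [
            entry["location"].strip()
            for entry in native.get("secondaryLocations", [])
        ],
    }


NATIVE_SAMPLE = {
    "workplaceType": "OnSite",
    "employmentType": "FullTime",
    "secondaryLocations": [{"location": "New York "}, {"location": "Bellevue"}],
}
NORMALIZED = normalize_native(NATIVE_SAMPLE)
```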

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/3b60dad745d184d81b7855341a02a0ad7201a838?include=description

GET https://api.bluedoor.sh/job-postings/v1/orgs/7ba82021-65aa-45a8-82cb-c064fc1f5f6a

GET https://api.bluedoor.sh/job-postings/v1/sources/fadca3b3-d0b9-42cc-9362-5135e2ab8a21

GET https://api.bluedoor.sh/job-postings/v1/jobs/3b60dad745d184d81b7855341a02a0ad7201a838/events
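
The four requests can be assembled from the IDs in the job record. The following is a minimal sketch that assumes the trailing "JSON" on each listed request is a response-format label and not part of the URL. The page does not say how the API key is sent, so the bearer-token `Authorization` header in `fetch_json` is an assumption; consult the API docs for the real scheme. The helper names `build_urls` and `fetch_json` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.bluedoor.sh/job-postings/v1"


def build_urls(job_id: str, org_id: str, source_id: str) -> dict[str, str]:
    """Build the four GET URLs listed on this page from the record IDs."""
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?include=description",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


def fetch_json(url: str, api_key: str) -> dict:
    """GET a URL and decode the JSON body.

    ASSUMPTION: the key is passed as a bearer token; the real header name
    and format are not stated on this page.
    """
    request = urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


URLS = build_urls(
    job_id="3b60dad745d184d81b7855341a02a0ad7201a838",
    org_id="7ba82021-65aa-45a8-82cb-c064fc1f5f6a",
    source_id="fadca3b3-d0b9-42cc-9362-5135e2ab8a21",
)
```

With a valid key, `fetch_json(URLS["job"], api_key)` would request the job record; no network call is made when the URLs are only built.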