bluedoor data · Job Postings API · bluedoor.sh

ML Model Serving Engineer

Sesame · San Francisco · On Site · Active · Ashby

Job facts

Company: Sesame
Title: ML Model Serving Engineer
Normalized title: -
Department / team: Software / Software
Location: San Francisco, CA, United States
Work model: On Site
Employment type: Full Time
Salary: -
Status: active
ATS provider: Ashby
Posted / first seen: - / 2026-05-29
Changed / last seen: 2026-05-29 / 2026-06-20

Related slices

Company jobs: Active postings from Sesame.
Company breakdowns: Role, location, ATS, and work model facets for this company.
ATS provider jobs: Active postings observed through Ashby.
Provider filtered search: The same provider as a filtered job collection.
City jobs: Active postings in San Francisco.
Department jobs: Active postings in Software.
Work model jobs: Active On Site postings.
Lifecycle events: Open, update, close, and reopen events for this posting.
Original posting: Canonical source or apply URL captured from the ATS.

Linked records

Company: Sesame
Source: fadca3b3-d0b9-42cc-9362-5135e2ab8a21
ATS provider: Ashby

Description

About Sesame

Sesame believes in a future where computers are lifelike, with the ability to see, hear, and collaborate with us in ways that feel natural and human. With this vision, we're designing a new kind of computer, focused on making voice agents part of our daily lives. Our team brings together founders from Oculus and Ubiquity6, alongside proven leaders from Meta, Google, and Apple, with deep expertise spanning hardware and software. Join us in shaping a future where computers truly come alive.

Responsibilities:
- Turbocharge our serving layer, consisting of a variety of LLM, speech, and vision models.
- Partner with ML infrastructure and training engineers to build a fast, cost-effective, accurate, and reliable serving layer to power a new consumer product category.
- Modify and extend LLM serving frameworks like vLLM and SGLang to take advantage of the latest techniques in high-performance model serving.
- Work with the training team to identify opportunities to produce faster models without sacrificing quality.
- Use techniques like in-flight batching, caching, and custom kernels to speed up inference.
- Find ways to reduce model initialization times without sacrificing quality.

Required Qualifications:
- Expert in some differentiable array computing framework, preferably PyTorch.
- Expert in optimizing machine learning models for serving reliably at high throughput, with low latency.
- Significant systems programming experience, e.g., experience working on high-performance server systems. You'd be just as comfortable with the internals of vLLM as you would with a complex PyTorch codebase.
- Significant performance engineering experience, e.g., bottleneck analysis in high-scale server systems or profiling low-level systems code.
- Always up to date on the latest techniques for model serving optimization.

Preferred Qualifications:
- Familiarity with high-performance LLM serving, e.g., experience with vLLM and SGLang deployment and internals.
- Experience with a public cloud platform such as GCP, AWS, or Azure.
- Experience deploying and scaling inference workloads in the cloud using Kubernetes, Ray, etc.
- You like to ship and have a track record of leading complex multi-month projects without assistance.
- You're excited to learn new things and work in a multitude of roles.

Sesame is committed to a workplace where everyone feels valued, respected, and empowered. We welcome all qualified applicants, embracing diversity in race, gender, identity, orientation, ability, and more. We provide reasonable accommodations for applicants with disabilities. Contact [email protected] for assistance.

Full-time Employee Benefits:
- 401(k) max employer match: 3.5% of compensation
- 100% employer-paid health, vision, and dental benefits for you and your dependents
- Unlimited PTO and sick time
- Flexible spending account with employer matching up to $1,650/year (medical FSA)
- Guardian Employee Assistance Program (EAP)
- Opportunity to share in the company's success with competitive stock options

Benefits do not apply to contingent/contract workers.

Full job record

Job ID: 3b60dad745d184d81b7855341a02a0ad7201a838
Org ID: 7ba82021-65aa-45a8-82cb-c064fc1f5f6a
Source ID: fadca3b3-d0b9-42cc-9362-5135e2ab8a21
Board ID: fadca3b3-d0b9-42cc-9362-5135e2ab8a21
Provider: ashby
Provider Job Key: 35793528-2b5c-47b3-9422-eaced2b69f63
Title: ML Model Serving Engineer
Normalized Title: -
Status: active
Active: yes
Location Text: San Francisco
Department: Software
Team: Software
Employment Type: full_time
Workplace Type: on_site
Remote Policy: -
Country: United States
Region: CA
City: San Francisco
Salary Raw: -
Salary Min: -
Salary Max: -
Salary Currency: -
Salary Period: -
Source URL: https://jobs.ashbyhq.com/sesame/35793528-2b5c-47b3-9422-eaced2b69f63
Apply URL: https://jobs.ashbyhq.com/sesame/35793528-2b5c-47b3-9422-eaced2b69f63/application
First Seen At: 2026-05-29 07:10:34Z
Last Seen At: 2026-06-20 09:58:01Z
Last Checked At: 2026-06-20 09:58:01Z
Last Changed At: 2026-05-29 07:10:34Z
Inactive At: -
Source Posted At: -
Source Updated At: -
Raw Payload Uri: s3://job-postings-prod-raw-590183727216/raw/provider=ashby/board=sesame/date=2026-06-20/2026-06-20T09-57-44-137Z-07f0b132d53111c9f93fbe96111558c980c6f3f9df07709a72668e96183372de.json
Event Fields
{
  "content_hash": "5efd5b0e70ded3453d84832931116df617a0366e8b83f02a05bc9163867b37b9",
  "source_hash": "d6eb0059a9d54d94f8102135e518b431770497acf9391c66e3c79d35e6cd24de",
  "last_changed_at": "2026-05-29T07:10:34.885Z",
  "active_status": "active"
}
Parsed Structured
{
  "dedupe": null,
  "language": "en",
  "location": {
    "raw": "San Francisco",
    "city": "San Francisco",
    "region": "CA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.75
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-20T09:58:01.367Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "San Francisco",
      "city": "San Francisco",
      "region": "CA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.75
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": null,
  "salary_period": null,
  "workplace_type": "on_site",
  "salary_currency": null
}
Extensions
{}
Native Structured
{
  "id": "35793528-2b5c-47b3-9422-eaced2b69f63",
  "team": "Software",
  "title": "ML Model Serving Engineer",
  "jobUrl": "https://jobs.ashbyhq.com/sesame/35793528-2b5c-47b3-9422-eaced2b69f63",
  "address": null,
  "applyUrl": "https://jobs.ashbyhq.com/sesame/35793528-2b5c-47b3-9422-eaced2b69f63/application",
  "isListed": true,
  "isRemote": false,
  "location": "San Francisco",
  "updatedAt": null,
  "apiVersion": "ashby-non-user-graphql-v1",
  "department": "Software",
  "publishedAt": null,
  "workplaceType": "OnSite",
  "employmentType": "FullTime",
  "secondaryLocations": [
    {
      "location": "New York "
    },
    {
      "location": "Bellevue"
    }
  ]
}
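
The Native Structured payload uses CamelCase enum values ("OnSite", "FullTime"), while the normalized record fields above use snake_case ("on_site", "full_time"). A minimal Python sketch of that mapping follows. The function names camel_to_snake and normalize_native are hypothetical, and folding secondaryLocations into a single locations list is an assumption made for illustration; the record does not show how bluedoor actually performs its normalization.

```python
import re


def camel_to_snake(value: str) -> str:
    """Convert a CamelCase enum such as "OnSite" to "on_site"."""
    # Insert an underscore at each lowercase/digit-to-uppercase boundary.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", value).lower()


def normalize_native(native: dict) -> dict:
    """Map a subset of the native payload to snake_case fields (hypothetical helper)."""
    # Assumption: primary location first, then any secondary locations.
    locations = [native["location"]] + [
        entry["location"] for entry in native.get("secondaryLocations", [])
    ]
    return {
        "title": native["title"],
        "department": native["department"],
        "team": native["team"],
        "workplace_type": camel_to_snake(native["workplaceType"]),
        "employment_type": camel_to_snake(native["employmentType"]),
        "is_remote": native["isRemote"],
        # The raw payload carries stray whitespace ("New York "), so trim it.
        "locations": [loc.strip() for loc in locations],
    }


# Subset of the Native Structured payload shown above.
native = {
    "team": "Software",
    "title": "ML Model Serving Engineer",
    "isRemote": False,
    "location": "San Francisco",
    "department": "Software",
    "workplaceType": "OnSite",
    "employmentType": "FullTime",
    "secondaryLocations": [{"location": "New York "}, {"location": "Bellevue"}],
}

normalized = normalize_native(native)
print(normalized["workplace_type"], normalized["employment_type"])
print(normalized["locations"])
```

Note that the normalized record above lists only San Francisco as its location, so a real pipeline may treat secondary locations differently from this sketch.
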
Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it with these requests, each of which returns JSON:

GET https://api.bluedoor.sh/job-postings/v1/jobs/3b60dad745d184d81b7855341a02a0ad7201a838?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/7ba82021-65aa-45a8-82cb-c064fc1f5f6a
GET https://api.bluedoor.sh/job-postings/v1/sources/fadca3b3-d0b9-42cc-9362-5135e2ab8a21
GET https://api.bluedoor.sh/job-postings/v1/jobs/3b60dad745d184d81b7855341a02a0ad7201a838/events