bluedoor data · Job Postings API · bluedoor.sh

Member of Technical Staff (Data Acquisition)

Sanas.AI Inc · Palo Alto, CA, United States · On Site · Deleted · Rippling ATS

Job facts

Field: Value
Company: Sanas.AI Inc
Title: Member of Technical Staff (Data Acquisition)
Normalized title: -
Department / team: Science
Location: Palo Alto, CA, United States
Work model: On Site
Employment type: Full Time
Salary: -
Status: deleted
ATS provider: Rippling ATS
Posted / first seen: 2026-04-06 / 2026-05-29
Changed / last seen: 2026-06-06 / 2026-06-03

Related slices

Page: What it contains
Company jobs: Active postings from Sanas.AI Inc.
Company breakdowns: Role, location, ATS, and work model facets for this company.
ATS provider jobs: Active postings observed through Rippling ATS.
Provider filtered search: The same provider as a filtered job collection.
City jobs: Active postings in Palo Alto.
Department jobs: Active postings in Science.
Work model jobs: Active On Site postings.
Lifecycle events: Open, update, close, and reopen events for this posting.
Original posting: Canonical source or apply URL captured from the ATS.

Linked records

Company: Sanas.AI Inc
Source: 1fc1335f-581e-4138-ae2e-6e3d6c790876
ATS provider: Rippling ATS

Description

Company

Sanas is pioneering the future of human communication. Founded by a team of Stanford researchers and entrepreneurs with deep industry experience, Sanas has developed the world's first real-time speech AI platform capable of accent translation, noise cancellation, speech enhancement, cross-language communication, and more.

Sanas makes conversations clearer, more inclusive, and more effective, removing barriers that prevent people from being understood, regardless of accent, background noise, or native language.

Sanas is currently one of the fastest growing startups in Silicon Valley, growing from $16M to $50M ARR in 2025. The company's core business is profitable and is on track to end 2026 with >$120M ARR. Our team combines deep expertise in model innovation and systems engineering with a design-minded product engineering culture to build and ship cutting-edge AI models and experiences, entirely in-house.

Sanas is a 180-strong team, established in 2020. In this short span, we've successfully secured over $100 million in funding. Our innovation has been supported by the industry's leading investors, including Insight Partners, Google Ventures, Quadrille Capital, General Catalyst, Quiet Capital, and other influential investors. Our reputation is further solidified by collaborations with numerous Fortune 100 companies. With Sanas, you're not just adopting a product; you're investing in the future of communication.

If you're looking to have a significant role in roadmapping and driving technical directions, if you're looking to deploy challenging and big ideas without much overhead or slowness, if you're looking to leave your mark on an ambitious, generational mission to change how the world thinks about speech + AI, then Sanas is a well-suited place for you.

Role

About the Role

Your mission is to build and operate the ingestion systems that turn the open web and large-scale audio sources into reliable, well-structured corpora for training Sanas's frontier speech models. You'll own the machinery that acquires, extracts, filters, versions, and delivers audio data to our training pipelines, and you'll work directly with our research scientists to close the loop between what we collect and how it moves model quality.

Job Description

Data acquisition & ingestion

- Own and lead engineering projects across the full data acquisition stack: web crawling, audio ingestion, source discovery, and dataset delivery to training pipelines.
- Build and operate large-scale distributed crawling infrastructure capable of continuously discovering and ingesting audio at scale across languages, accents, domains, and recording environments.
- Develop specialized crawlers for high-priority audio sources with source-specific extraction and normalization logic.
- Run experiments to evaluate crawling strategies, extraction methods, and ingestion tradeoffs; analyze results to identify gaps, redundancy, and coverage improvements across speaker demographics and language pairs.
- Build ingestion pipelines that scale reliably across large data campaigns, with automated audio quality filtering (SNR estimation, clipping detection, codec artifact identification) as a first-class pipeline stage.

Systems & infrastructure

- Design and deploy highly scalable distributed systems capable of handling petabytes of audio data, from raw acquisition through quality filtering, deduplication, segmentation, and versioned dataset generation.
- Architect and implement indexing and search capabilities over large audio corpora, enabling fast lookup by language, speaker, acoustic condition, duration, and quality tier.
- Build and maintain backend services for data storage, including key-value databases, metadata synchronization, and manifest management across dataset versions.
- Deploy and operate acquisition infrastructure in a Kubernetes / Infrastructure-as-Code environment; perform routine system health checks and respond to production issues quickly.
- Collaborate closely with data processing, architecture, and ML platform teams to ensure smooth data flow from acquisition through to training-ready outputs.

Compliance & data governance

- Work closely with legal to handle compliance, data privacy, and licensing matters across all acquisition sources, maintaining a clear audit trail of provenance, permitted use, and commercial training rights for every dataset.
- Enforce speaker consent documentation, GDPR requirements, robots.txt and ToS adherence, and audio retention policies across all ingestion pipelines.
- Manage relationships with third-party data vendors: writing precise acquisition briefs, evaluating quality on delivery, and ensuring sourced data meets Sanas's licensing and quality standards.

Qualifications

- 4+ years of experience in data engineering, ML data infrastructure, or backend systems engineering, with direct experience building large-scale data ingestion or crawling systems.
- Strong Python and systems engineering skills: you build robust, maintainable infrastructure, not just one-off scripts.
- Hands-on experience with distributed systems design: you've built systems that handle failure gracefully, scale horizontally, and recover cleanly.
- Experience with web crawling infrastructure at scale, including handling rate limiting, deduplication, and content extraction.
- Proficiency with cloud platforms (AWS or GCP), object storage (S3/GCS), and container orchestration (Kubernetes).
- Comfort working with audio processing tooling (ffmpeg, librosa, torchaudio, sox) and experience handling large volumes of audio files.
- Strong data quality instincts: you instrument pipelines, surface issues proactively, and treat data correctness with the same rigor as software correctness.

Bonus

- Experience building speech or audio datasets for ASR, TTS, speech enhancement, or speaker verification model training.
- Familiarity with major open speech corpora (Common Voice, LibriSpeech, VoxPopuli, AISHELL) and their sourcing and quality characteristics.
- Experience with data versioning tools.
- Background in multilingual or low-resource language data collection.
- Experience with annotation and labeling platforms.
- Familiarity with speaker diarization, language identification, or automated audio quality estimation models used for data filtering at scale.

Full job record

Job ID: 149e01a361cf423bf3a92184109f3f00d34f69a4
Org ID: 83ad35d8-903f-4812-a8a1-7e0502248692
Source ID: 1fc1335f-581e-4138-ae2e-6e3d6c790876
Board ID: 1fc1335f-581e-4138-ae2e-6e3d6c790876
Provider: rippling
Provider Job Key: ad6ba237-eb74-4755-bf0d-027edbc3222c
Title: Member of Technical Staff (Data Acquisition)
Normalized Title:
Status: deleted
Active: no
Location Text: Palo Alto, CA, United States
Department: Science
Team:
Employment Type: full_time
Workplace Type: on_site
Remote Policy:
Country: United States
Region: CA
City: Palo Alto
Salary Raw:
Salary Min:
Salary Max:
Salary Currency:
Salary Period:
Source URL: https://ats.rippling.com/sanas/jobs/ad6ba237-eb74-4755-bf0d-027edbc3222c
Apply URL: https://ats.rippling.com/sanas/jobs/ad6ba237-eb74-4755-bf0d-027edbc3222c
First Seen At: 2026-05-29 07:10:25Z
Last Seen At: 2026-06-03 12:13:29Z
Last Checked At: 2026-06-06 08:42:22Z
Last Changed At: 2026-06-06 08:42:22Z
Inactive At: 2026-06-06 08:42:22Z
Source Posted At: 2026-04-06 19:23:06Z
Source Updated At:
Raw Payload Uri: s3://bluework-jobs-prod-raw-590183727216/raw/provider=rippling/board=sanas/date=2026-06-03/2026-06-03T12-13-28-930Z-08e539e86adb9488f9d27b6e63e5dbd4b4861b3ceef6a0aae802617719a2970f.json
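
The Raw Payload Uri above carries `key=value` path segments (`provider=`, `board=`, `date=`), which looks like a Hive-style partition layout. The sketch below splits such a URI into bucket, key, and partition values. It assumes the `key=value` segments are meant to be read that way; the function name `parse_raw_payload_uri` is hypothetical and not part of any bluedoor tooling.

```python
from urllib.parse import urlparse


def parse_raw_payload_uri(uri: str) -> dict:
    """Split an s3:// URI into bucket, key, object name, and key=value partitions."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// URI, got scheme {parsed.scheme!r}")
    key = parsed.path.lstrip("/")
    segments = key.split("/")
    # Treat every path segment containing '=' as a partition (assumption).
    partitions = dict(seg.split("=", 1) for seg in segments if "=" in seg)
    return {
        "bucket": parsed.netloc,
        "key": key,
        "object_name": segments[-1],
        **partitions,
    }


uri = (
    "s3://bluework-jobs-prod-raw-590183727216/raw/provider=rippling/"
    "board=sanas/date=2026-06-03/"
    "2026-06-03T12-13-28-930Z-"
    "08e539e86adb9488f9d27b6e63e5dbd4b4861b3ceef6a0aae802617719a2970f.json"
)
info = parse_raw_payload_uri(uri)
print(info["provider"], info["board"], info["date"])  # rippling sanas 2026-06-03
```

The partition values line up with other fields of this record: `provider` matches Provider, `board` matches the board slug in the native payload, and `date` matches the date of Last Seen At.
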
Event Fields
{
  "content_hash": "80a471111351fd9af31a0803249f7e3e928429642edb97d0fbb4e7bebe8f2cfc",
  "source_hash": "637a50c16d1eeba8d2ea3b5d36a89bdaa62f51dc998dfa6cc3a95c2df40194e3",
  "last_changed_at": "2026-06-06T08:42:22.197Z",
  "active_status": "deleted"
}
Parsed Structured
{
  "language": "en-us",
  "location": {
    "raw": "Palo Alto, CA, United States",
    "city": "Palo Alto",
    "region": "CA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.98,
    "workplace_type": "on_site"
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-03T12:13:29.525Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en-us",
    "location": {
      "raw": "Palo Alto, CA, United States",
      "city": "Palo Alto",
      "region": "CA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.98,
      "workplace_type": "on_site"
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": null,
  "salary_period": null,
  "workplace_type": "on_site",
  "salary_currency": null
}
Extensions
{}
Native Structured
{
  "list_job": {
    "id": "ad6ba237-eb74-4755-bf0d-027edbc3222c",
    "url": "https://ats.rippling.com/sanas/jobs/ad6ba237-eb74-4755-bf0d-027edbc3222c",
    "name": "Member of Technical Staff (Data Acquisition)",
    "language": "en-US",
    "locations": [
      {
        "city": "Palo Alto",
        "name": "Palo Alto, CA",
        "state": "California",
        "country": "United States",
        "stateCode": "CA",
        "countryCode": "US",
        "workplaceType": "ON_SITE"
      }
    ],
    "department": {
      "name": "Science"
    }
  },
  "detail_job": {
    "url": "https://ats.rippling.com/sanas/jobs/ad6ba237-eb74-4755-bf0d-027edbc3222c",
    "name": "Member of Technical Staff (Data Acquisition)",
    "uuid": "ad6ba237-eb74-4755-bf0d-027edbc3222c",
    "board": {
      "logo": {
        "url": "https://secured-assets.ripplingcdn.com/us1/ats/6862bfbc77c6a4c5f95ae521/ats/eac70887e52a4dcfa07336c69c248ce0?Expires=1780575209&Signature=WcAJjKq8w9zZoa6reRZCkbauobhsCX1CDhcZLP-KceptzvkvhKO0O9jC7CUkm9j-TzL74ikXQP5wcQED5udVaL4B-66xxBKyU-wEiPl-rxcIyuUUriKinneFkLXFcsRHXzZ0eu7hrAcMWK~M9Yy4Dqx4zjbGgtS98NmN1gpsH~mV0AUsnBekzozQez1PQZ8-iIZb0l~UoiRut7CvLi8b2Zts6EdoIVoArYS0uxeQo~RaU5d02R0dIFm8QeL3jiIfIFY5SrdEFdVcCNIAqAoG-wO5QnIVzxk7YBjJOEEPP70pOQwyG4IasUImBBR6eDKfn7reRXbgTJJylzGgwbyc4Q__&Key-Pair-Id=K2SM3GXN9F9XGM",
        "name": "Sanas-Logo-Full-RGB-Black (1).png",
        "type": "image/png"
      },
      "slug": "sanas",
      "title": "Sanas",
      "banner": {
        "url": null,
        "name": "",
        "type": ""
      },
      "boardURL": "https://ats.rippling.com/sanas/jobs",
      "fontType": null,
      "subtitle": null,
      "boardType": "RIPPLING",
      "linkColor": null,
      "buttonColor": null,
      "legalNotice": null,
      "buttonTextColor": null,
      "noOpeningsMessage": null,
      "groupJobsByLocation": false,
      "showBoardLogoOnJobPost": true,
      "showCompanyInfoUnderJobPost": false
    },
    "createdOn": "2026-04-06T12:23:06.525000-07:00",
    "department": {
      "name": "Science",
      "base_department": "Science",
      "department_tree": [
        "Science"
      ]
    },
    "companyName": "Sanas.AI Inc",
    "description": {
      "role": "<meta><h2 style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;line-height:1.6;font-size:15pt;font-weight:600;letter-spacing:0.5px;margin-top:18px;margin-bottom:4px;padding-left:0px;\"><b><strong style=\"font-size:15pt;white-space:pre-wrap;\">About Sanas</strong></b></h2><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><span style=\"white-space:pre-wrap;\">Sanas is pioneering the future of human communication. Founded by a team of Stanford researchers and entrepreneurs with deep industry experience, Sanas has developed the world's first real-time speech AI platform capable of accent translation, noise cancellation, speech enhancement, cross-language communication, and more.</span></p><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><span style=\"white-space:pre-wrap;\">Sanas makes conversations clearer, more inclusive, and more effective, removing barriers that prevent people from being understood, regardless of accent, background noise, or native language.</span></p><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><span style=\"white-space:pre-wrap;\">Sanas is currently one of the fastest growing startups in Silicon Valley, growing from $16M to $50M ARR in 2025. The company's core business is profitable and is on track to end 2026 with &gt;$120M ARR. 
Our team combines deep expertise in model innovation and systems engineering with a design-minded product engineering culture to build and ship cutting-edge AI models and experiences — entirely in-house.</span></p><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><span style=\"white-space:pre-wrap;\">Sanas is a 180-strong team, established in 2020. In this short span, we've successfully secured over $100 million in funding. Our innovation has been supported by the industry's leading investors, including Insight Partners, Google Ventures, Quadrille Capital, General Catalyst, Quiet Capital, and other influential investors. Our reputation is further solidified by collaborations with numerous Fortune 100 companies. With Sanas, you're not just adopting a product; you're investing in the future of communication.</span></p><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><span style=\"white-space:pre-wrap;\">If you’re looking to have a significant role in roadmapping and driving technical directions, if you’re looking to deploy challenging and big ideas without much overhead or slowness, if you're looking to leave your mark on an ambitious, generational mission to change how the worlds thinks about speech + AI, then Sanas is a well-suited place for you.</span></p><h2 style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;line-height:1.6;font-size:15pt;font-weight:600;letter-spacing:0.5px;margin-top:18px;margin-bottom:4px;padding-left:0px;\"><b><strong style=\"font-size:15pt;white-space:pre-wrap;\">About the Role</strong></b></h2><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><span style=\"white-space:pre-wrap;\">Your 
mission is to build and operate the ingestion systems that turn the open web and large-scale audio sources into reliable, well-structured corpora for training Sanas's frontier speech models. You'll own the machinery that acquires, extracts, filters, versions, and delivers audio data to our training pipelines — and you'll work directly with our research scientists to close the loop between what we collect and how it moves model quality.</span></p><h2 style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;line-height:1.6;font-size:15pt;font-weight:600;letter-spacing:0.5px;margin-top:18px;margin-bottom:4px;padding-left:0px;\"><b><strong style=\"font-size:15pt;white-space:pre-wrap;\">Job Description</strong></b></h2><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><b><strong style=\"white-space:pre-wrap;\">Data acquisition &amp; ingestion</strong></b></p><ul data-pattern=\"discCircleSquare\" data-depth=\"1\" style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;margin:8px 0px;line-height:1.6;padding:0px 0px 0px 32px;list-style-type:disc;\"><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Own and lead engineering projects across the full data acquisition stack — web crawling, audio ingestion, source discovery, and dataset delivery to training pipelines.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Build and operate large-scale distributed crawling infrastructure capable of continuously discovering and ingesting audio at scale across languages, accents, domains, and recording environments.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Develop specialized crawlers for 
high-priority audio sources with source-specific extraction and normalization logic.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Run experiments to evaluate crawling strategies, extraction methods, and ingestion tradeoffs; analyze results to identify gaps, redundancy, and coverage improvements across speaker demographics and language pairs.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Build ingestion pipelines that scale reliably across large data campaigns, with automated audio quality filtering — SNR estimation, clipping detection, codec artifact identification — as a first-class pipeline stage.</span></li></ul><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><b><strong style=\"white-space:pre-wrap;\">Systems &amp; infrastructure</strong></b></p><ul data-pattern=\"discCircleSquare\" data-depth=\"1\" style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;margin:8px 0px;line-height:1.6;padding:0px 0px 0px 32px;list-style-type:disc;\"><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Design and deploy highly scalable distributed systems capable of handling petabytes of audio data — from raw acquisition through quality filtering, deduplication, segmentation, and versioned dataset generation.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Architect and implement indexing and search capabilities over large audio corpora — enabling fast lookup by language, speaker, acoustic condition, duration, and quality tier.</span></li><li style=\"font-size:11pt;margin:3px 
0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Build and maintain backend services for data storage, including key-value databases, metadata synchronization, and manifest management across dataset versions.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Deploy and operate acquisition infrastructure in a Kubernetes / Infrastructure-as-Code environment; perform routine system health checks and respond to production issues quickly.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Collaborate closely with data processing, architecture, and ML platform teams to ensure smooth data flow from acquisition through to training-ready outputs.</span></li></ul><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><b><strong style=\"white-space:pre-wrap;\">Compliance &amp; data governance</strong></b></p><ul data-pattern=\"discCircleSquare\" data-depth=\"1\" style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;margin:8px 0px;line-height:1.6;padding:0px 0px 0px 32px;list-style-type:disc;\"><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Work closely with legal to handle compliance, data privacy, and licensing matters across all acquisition sources — maintaining a clear audit trail of provenance, permitted use, and commercial training rights for every dataset.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Enforce speaker consent documentation, GDPR requirements, robots.txt and ToS adherence, and audio retention policies across all ingestion pipelines.</span></li><li 
style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Manage relationships with third-party data vendors — writing precise acquisition briefs, evaluating quality on delivery, and ensuring sourced data meets Sanas's licensing and quality standards.</span></li></ul><h2 style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;line-height:1.6;font-size:15pt;font-weight:600;letter-spacing:0.5px;margin-top:18px;margin-bottom:4px;padding-left:0px;\"><b><strong style=\"font-size:15pt;white-space:pre-wrap;\">Qualifications</strong></b></h2><ul data-pattern=\"discCircleSquare\" data-depth=\"1\" style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;margin:8px 0px;line-height:1.6;padding:0px 0px 0px 32px;list-style-type:disc;\"><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">4+ years of experience in data engineering, ML data infrastructure, or backend systems engineering — with direct experience building large-scale data ingestion or crawling systems.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Strong Python and systems engineering skills — you build robust, maintainable infrastructure, not just one-off scripts.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Hands-on experience with distributed systems design: you've built systems that handle failure gracefully, scale horizontally, and recover cleanly.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Experience with web crawling infrastructure at scale including handling rate limiting, deduplication, and content extraction.</span></li><li style=\"font-size:11pt;margin:3px 
0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Proficiency with cloud platforms (AWS or GCP), object storage (S3/GCS), and container orchestration (Kubernetes).</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Comfort working with audio processing tooling — ffmpeg, librosa, torchaudio, sox — and experience handling large volumes of audio files.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Strong data quality instincts: you instrument pipelines, surface issues proactively, and treat data correctness with the same rigor as software correctness.</span></li></ul><h2 style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;line-height:1.6;font-size:15pt;font-weight:600;letter-spacing:0.5px;margin-top:18px;margin-bottom:4px;padding-left:0px;\"><b><strong style=\"font-size:15pt;white-space:pre-wrap;\">Bonus</strong></b></h2><ul data-pattern=\"discCircleSquare\" data-depth=\"1\" style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;margin:8px 0px;line-height:1.6;padding:0px 0px 0px 32px;list-style-type:disc;\"><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Experience building speech or audio datasets for ASR, TTS, speech enhancement, or speaker verification model training.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Familiarity with major open speech corpora — Common Voice, LibriSpeech, VoxPopuli, AISHELL — and their sourcing and quality characteristics.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Experience with data versioning tools.</span></li><li 
style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Background in multilingual or low-resource language data collection.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Experience with annotation and labeling platforms.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Familiarity with speaker diarization, language identification, or automated audio quality estimation models used for data filtering at scale.</span></li></ul>",
      "company": "<meta><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;text-align:start;\"><span style=\"white-space:pre-wrap;\">Sanas is pioneering the future of human communication. Founded by a team of Stanford researchers and entrepreneurs with deep industry experience, Sanas has developed the world's first real-time speech AI platform capable of accent translation, noise cancellation, speech enhancement, cross-language communication, and more.</span><br><br><span style=\"white-space:pre-wrap;\">Sanas makes conversations clearer, more inclusive, and more effective, removing barriers that prevent people from being understood, regardless of accent, background noise, or native language.</span><br><br><span style=\"white-space:pre-wrap;\">Sanas is currently one of the fastest growing startups in Silicon Valley, growing from $16M to $50M ARR in 2025. The company's core business is profitable and is on track to end 2026 with &gt;$120M ARR. Our team combines deep expertise in model innovation and systems engineering with a design-minded product engineering culture to build and ship cutting-edge AI models and experiences — entirely in-house.</span><br><br><span style=\"white-space:pre-wrap;\">Sanas is a 180-strong team, established in 2020. In this short span, we've successfully secured over $100 million in funding. Our innovation has been supported by the industry's leading investors, including Insight Partners, Google Ventures, Quadrille Capital, General Catalyst, Quiet Capital, and other influential investors. Our reputation is further solidified by collaborations with numerous Fortune 100 companies. 
With Sanas, you're not just adopting a product; you're investing in the future of communication.</span><br><br><span style=\"white-space:pre-wrap;\">If you’re looking to have a significant role in roadmapping and driving technical directions, if you’re looking to deploy challenging and big ideas without much overhead or slowness, if you're looking to leave your mark on an ambitious, generational mission to change how the worlds thinks about speech + AI, then Sanas is a well-suited place for you.</span></p>"
    },
    "workLocations": [
      "Palo Alto, CA"
    ],
    "employmentType": {
      "id": "Salaried, full-time",
      "label": "SALARIED_FT"
    },
    "payRangeDetails": [],
    "activeJobApplication": {
      "basicQuestions": [
        {
          "oid": "first_name",
          "title": "First name",
          "required": true,
          "fieldType": "SHORT_ANSWER"
        },
        {
          "oid": "last_name",
          "title": "Last name",
          "required": true,
          "fieldType": "SHORT_ANSWER"
        },
        {
          "oid": "email",
          "title": "Email",
          "required": true,
          "fieldType": "SHORT_ANSWER"
        },
        {
          "oid": "pronouns",
          "title": "Pronouns",
          "required": false,
          "fieldType": "PRONOUN"
        },
        {
          "oid": "current_company",
          "title": "Current company",
          "required": false,
          "fieldType": "SHORT_ANSWER"
        },
        {
          "oid": "phone_number",
          "title": "Phone number",
          "required": true,
          "fieldType": "PHONE_NUMBER"
        },
        {
          "oid": "location",
          "title": "Location (city only)",
          "required": true,
          "fieldType": "SHORT_ANSWER"
        },
        {
          "oid": "linkedin_link",
          "title": "LinkedIn link",
          "required": false,
          "fieldType": "SHORT_ANSWER"
        },
        {
          "oid": "resume",
          "title": "Resume",
          "required": true,
          "fieldType": "FILE"
        },
        {
          "oid": "cover_letter",
          "title": "Cover letter",
          "required": false,
          "fieldType": "FILE"
        }
      ],
      "customQuestions": {
        "fields": [
          {
            "oid": "first_name",
            "title": "First name",
            "required": true,
            "fieldData": {},
            "fieldType": "SHORT_ANSWER"
          },
          {
            "oid": "last_name",
            "title": "Last name",
            "required": true,
            "fieldData": {},
            "fieldType": "SHORT_ANSWER"
          },
          {
            "oid": "email",
            "title": "Email",
            "required": true,
            "fieldData": {},
            "fieldType": "SHORT_ANSWER"
          },
          {
            "oid": "pronouns",
            "title": "Pronouns",
            "required": false,
            "fieldData": {},
            "fieldType": "PRONOUN"
          },
          {
            "oid": "current_company",
            "title": "Current company",
            "required": false,
            "fieldData": {},
            "fieldType": "SHORT_ANSWER"
          },
          {
            "oid": "phone_number",
            "title": "Phone number",
            "required": true,
            "fieldData": {},
            "fieldType": "PHONE_NUMBER"
          },
          {
            "oid": "location",
            "title": "Location (city only)",
            "required": true,
            "fieldData": {},
            "fieldType": "SHORT_ANSWER"
          },
          {
            "oid": "linkedin_link",
            "title": "LinkedIn link",
            "required": false,
            "fieldData": {},
            "fieldType": "SHORT_ANSWER"
          },
          {
            "oid": "resume",
            "title": "Resume",
            "required": true,
            "fieldData": {},
            "fieldType": "FILE"
          },
          {
            "oid": "cover_letter",
            "title": "Cover letter",
            "required": false,
            "fieldData": {},
            "fieldType": "FILE"
          }
        ]
      },
      "additionalQuestions": null
    },
    "hasAIEvaluationsEnabled": false,
    "eeocQuestionnaireEnabled": true,
    "applicationConfirmationTemplate": "68c1a1ae94b69622be8d48ff",
    "eeocQuestionnaireEnabledForJobPost": true
  },
  "detail_meta": {
    "url": "https://ats.rippling.com/api/v2/board/sanas/jobs/ad6ba237-eb74-4755-bf0d-027edbc3222c",
    "http_status": 200,
    "content_type": "application/json",
    "response_bytes": 20039
  },
  "detail_errors": []
}
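
The record shows the posting time in two forms: `createdOn` in the native payload carries a `-07:00` offset, while Source Posted At is in UTC. The sketch below checks that both describe the same instant and computes how long the posting was observed between First Seen At and Last Seen At. Dropping sub-second precision before comparing is an assumption about how the flat fields were produced, and `parse_record_timestamp` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def parse_record_timestamp(text: str) -> datetime:
    """Parse the 'YYYY-MM-DD HH:MM:SSZ' form used in the flat record fields."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%SZ").replace(tzinfo=timezone.utc)


# createdOn from the native Rippling payload (local offset -07:00).
created_on = datetime.fromisoformat("2026-04-06T12:23:06.525000-07:00")
# Convert to UTC and drop sub-second precision (assumed normalization).
created_on_utc = created_on.astimezone(timezone.utc).replace(microsecond=0)

source_posted_at = parse_record_timestamp("2026-04-06 19:23:06Z")
first_seen_at = parse_record_timestamp("2026-05-29 07:10:25Z")
last_seen_at = parse_record_timestamp("2026-06-03 12:13:29Z")

# Time between the first and last crawl that saw the posting live.
observed_window = last_seen_at - first_seen_at

print(created_on_utc == source_posted_at)  # True
print(observed_window)                     # 5 days, 5:03:04
```
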

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/149e01a361cf423bf3a92184109f3f00d34f69a4?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/83ad35d8-903f-4812-a8a1-7e0502248692
GET https://api.bluedoor.sh/job-postings/v1/sources/1fc1335f-581e-4138-ae2e-6e3d6c790876
GET https://api.bluedoor.sh/job-postings/v1/jobs/149e01a361cf423bf3a92184109f3f00d34f69a4/events
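
The four requests above can be assembled from the Job ID, Org ID, and Source ID in the full job record. The following is a minimal sketch using only the Python standard library. The helper names `record_urls` and `fetch_json` are hypothetical, the `Accept` header is an assumption, and the page does not say whether the API requires authentication, so `fetch_json` may need extra headers in practice.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "https://api.bluedoor.sh/job-postings/v1"


def record_urls(job_id: str, org_id: str, source_id: str) -> dict:
    """Build the four endpoint URLs listed on this page for one job record."""
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?include=description",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the JSON body (no authentication is sent)."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


urls = record_urls(
    job_id="149e01a361cf423bf3a92184109f3f00d34f69a4",
    org_id="83ad35d8-903f-4812-a8a1-7e0502248692",
    source_id="1fc1335f-581e-4138-ae2e-6e3d6c790876",
)
for name, url in urls.items():
    print(name, url)
```

Calling `fetch_json(urls["events"])` would then retrieve the lifecycle events for this posting, provided the endpoint is reachable without credentials.
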