Member of Technical Staff (Data Acquisition)

Sanas.AI Inc · Palo Alto, CA, United States · On Site · Deleted · Rippling ATS

Job facts

Field	Value
Company	Sanas.AI Inc
Title	Member of Technical Staff (Data Acquisition)
Normalized title	-
Department / team	Science
Location	Palo Alto, CA, United States
Work model	On Site
Employment type	Full Time
Salary	-
Status	deleted
ATS provider	Rippling ATS
Posted / first seen	2026-04-06 / 2026-05-29
Changed / last seen	2026-06-06 / 2026-06-03

Related slices

Page	What it contains
Company jobs	Active postings from Sanas.AI Inc.
Company breakdowns	Role, location, ATS, and work model facets for this company.
ATS provider jobs	Active postings observed through Rippling ATS.
Provider filtered search	The same provider as a filtered job collection.
City jobs	Active postings in Palo Alto.
Department jobs	Active postings in Science.
Work model jobs	Active On Site postings.
Lifecycle events	Open, update, close, and reopen events for this posting.
Original posting	Canonical source or apply URL captured from the ATS.

Linked records

Company	Sanas.AI Inc
Source	1fc1335f-581e-4138-ae2e-6e3d6c790876
ATS provider	Rippling ATS

Description

About Sanas

Sanas is pioneering the future of human communication. Founded by a team of Stanford researchers and entrepreneurs with deep industry experience, Sanas has developed the world's first real-time speech AI platform capable of accent translation, noise cancellation, speech enhancement, cross-language communication, and more.

Sanas makes conversations clearer, more inclusive, and more effective, removing barriers that prevent people from being understood, regardless of accent, background noise, or native language.

Sanas is currently one of the fastest growing startups in Silicon Valley, growing from $16M to $50M ARR in 2025. The company's core business is profitable and is on track to end 2026 with >$120M ARR. Our team combines deep expertise in model innovation and systems engineering with a design-minded product engineering culture to build and ship cutting-edge AI models and experiences entirely in-house.

Sanas is a 180-strong team, established in 2020. In this short span, we've successfully secured over $100 million in funding. Our innovation has been supported by the industry's leading investors, including Insight Partners, Google Ventures, Quadrille Capital, General Catalyst, Quiet Capital, and other influential investors. Our reputation is further solidified by collaborations with numerous Fortune 100 companies. With Sanas, you're not just adopting a product; you're investing in the future of communication.

If you're looking to have a significant role in roadmapping and driving technical directions, if you're looking to deploy challenging and big ideas without much overhead or slowness, if you're looking to leave your mark on an ambitious, generational mission to change how the world thinks about speech + AI, then Sanas is a well-suited place for you.

About the Role

Your mission is to build and operate the ingestion systems that turn the open web and large-scale audio sources into reliable, well-structured corpora for training Sanas's frontier speech models. You'll own the machinery that acquires, extracts, filters, versions, and delivers audio data to our training pipelines, and you'll work directly with our research scientists to close the loop between what we collect and how it moves model quality.

Job Description

Data acquisition & ingestion

- Own and lead engineering projects across the full data acquisition stack: web crawling, audio ingestion, source discovery, and dataset delivery to training pipelines.
- Build and operate large-scale distributed crawling infrastructure capable of continuously discovering and ingesting audio at scale across languages, accents, domains, and recording environments.
- Develop specialized crawlers for high-priority audio sources with source-specific extraction and normalization logic.
- Run experiments to evaluate crawling strategies, extraction methods, and ingestion tradeoffs; analyze results to identify gaps, redundancy, and coverage improvements across speaker demographics and language pairs.
- Build ingestion pipelines that scale reliably across large data campaigns, with automated audio quality filtering (SNR estimation, clipping detection, codec artifact identification) as a first-class pipeline stage.

Systems & infrastructure

- Design and deploy highly scalable distributed systems capable of handling petabytes of audio data, from raw acquisition through quality filtering, deduplication, segmentation, and versioned dataset generation.
- Architect and implement indexing and search capabilities over large audio corpora, enabling fast lookup by language, speaker, acoustic condition, duration, and quality tier.
- Build and maintain backend services for data storage, including key-value databases, metadata synchronization, and manifest management across dataset versions.
- Deploy and operate acquisition infrastructure in a Kubernetes / Infrastructure-as-Code environment; perform routine system health checks and respond to production issues quickly.
- Collaborate closely with data processing, architecture, and ML platform teams to ensure smooth data flow from acquisition through to training-ready outputs.

Compliance & data governance

- Work closely with legal to handle compliance, data privacy, and licensing matters across all acquisition sources, maintaining a clear audit trail of provenance, permitted use, and commercial training rights for every dataset.
- Enforce speaker consent documentation, GDPR requirements, robots.txt and ToS adherence, and audio retention policies across all ingestion pipelines.
- Manage relationships with third-party data vendors: writing precise acquisition briefs, evaluating quality on delivery, and ensuring sourced data meets Sanas's licensing and quality standards.

Qualifications

- 4+ years of experience in data engineering, ML data infrastructure, or backend systems engineering, with direct experience building large-scale data ingestion or crawling systems.
- Strong Python and systems engineering skills: you build robust, maintainable infrastructure, not just one-off scripts.
- Hands-on experience with distributed systems design: you've built systems that handle failure gracefully, scale horizontally, and recover cleanly.
- Experience with web crawling infrastructure at scale, including handling rate limiting, deduplication, and content extraction.
- Proficiency with cloud platforms (AWS or GCP), object storage (S3/GCS), and container orchestration (Kubernetes).
- Comfort working with audio processing tooling (ffmpeg, librosa, torchaudio, sox) and experience handling large volumes of audio files.
- Strong data quality instincts: you instrument pipelines, surface issues proactively, and treat data correctness with the same rigor as software correctness.

Bonus

- Experience building speech or audio datasets for ASR, TTS, speech enhancement, or speaker verification model training.
- Familiarity with major open speech corpora (Common Voice, LibriSpeech, VoxPopuli, AISHELL) and their sourcing and quality characteristics.
- Experience with data versioning tools.
- Background in multilingual or low-resource language data collection.
- Experience with annotation and labeling platforms.
- Familiarity with speaker diarization, language identification, or automated audio quality estimation models used for data filtering at scale.

Full job record

Job ID	149e01a361cf423bf3a92184109f3f00d34f69a4
Org ID	83ad35d8-903f-4812-a8a1-7e0502248692
Source ID	1fc1335f-581e-4138-ae2e-6e3d6c790876
Board ID	1fc1335f-581e-4138-ae2e-6e3d6c790876
Provider	rippling
Provider Job Key	ad6ba237-eb74-4755-bf0d-027edbc3222c
Title	Member of Technical Staff (Data Acquisition)
Normalized Title	—
Status	deleted
Active	no
Location Text	Palo Alto, CA, United States
Department	Science
Team	—
Employment Type	full_time
Workplace Type	on_site
Remote Policy	—
Country	United States
Region	CA
City	Palo Alto
Salary Raw	—
Salary Min	—
Salary Max	—
Salary Currency	—
Salary Period	—
Source URL	https://ats.rippling.com/sanas/jobs/ad6ba237-eb74-4755-bf0d-027edbc3222c
Apply URL	https://ats.rippling.com/sanas/jobs/ad6ba237-eb74-4755-bf0d-027edbc3222c
First Seen At	2026-05-29 07:10:25Z
Last Seen At	2026-06-03 12:13:29Z
Last Checked At	2026-06-06 08:42:22Z
Last Changed At	2026-06-06 08:42:22Z
Inactive At	2026-06-06 08:42:22Z
Source Posted At	2026-04-06 19:23:06Z
Source Updated At	—
Raw Payload Uri	s3://bluework-jobs-prod-raw-590183727216/raw/provider=rippling/board=sanas/date=2026-06-03/2026-06-03T12-13-28-930Z-08e539e86adb9488f9d27b6e63e5dbd4b4861b3ceef6a0aae802617719a2970f.json
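
The Raw Payload Uri above appears to use `key=value` path segments (`provider=`, `board=`, `date=`). As a minimal sketch, assuming that layout holds for other records too, the URI could be split into its bucket, object key, and partition values as follows. The function name `parse_raw_payload_uri` is hypothetical and not part of any bluedoor tooling.

```python
from urllib.parse import urlparse


def parse_raw_payload_uri(uri: str) -> dict:
    """Split an s3:// URI into bucket, key, filename, and key=value partitions.

    Assumption: partition segments look like "name=value", as in the
    Raw Payload Uri shown in this record.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// URI, got scheme {parsed.scheme!r}")

    key = parsed.path.lstrip("/")
    segments = key.split("/")

    # Keep only the segments that carry a "name=value" pair.
    partitions = dict(seg.split("=", 1) for seg in segments if "=" in seg)

    return {
        "bucket": parsed.netloc,
        "key": key,
        "filename": segments[-1],
        "partitions": partitions,
    }


if __name__ == "__main__":
    uri = (
        "s3://bluework-jobs-prod-raw-590183727216/raw/provider=rippling/"
        "board=sanas/date=2026-06-03/"
        "2026-06-03T12-13-28-930Z-"
        "08e539e86adb9488f9d27b6e63e5dbd4b4861b3ceef6a0aae802617719a2970f.json"
    )
    info = parse_raw_payload_uri(uri)
    print(info["bucket"])
    print(info["partitions"])
```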

Event Fields

{
  "content_hash": "80a471111351fd9af31a0803249f7e3e928429642edb97d0fbb4e7bebe8f2cfc",
  "source_hash": "637a50c16d1eeba8d2ea3b5d36a89bdaa62f51dc998dfa6cc3a95c2df40194e3",
  "last_changed_at": "2026-06-06T08:42:22.197Z",
  "active_status": "deleted"
}

Parsed Structured

{
  "language": "en-us",
  "location": {
    "raw": "Palo Alto, CA, United States",
    "city": "Palo Alto",
    "region": "CA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.98,
    "workplace_type": "on_site"
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-03T12:13:29.525Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en-us",
    "location": {
      "raw": "Palo Alto, CA, United States",
      "city": "Palo Alto",
      "region": "CA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.98,
      "workplace_type": "on_site"
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": null,
  "salary_period": null,
  "workplace_type": "on_site",
  "salary_currency": null
}

Extensions

{}

Native Structured

{
  "list_job": {
    "id": "ad6ba237-eb74-4755-bf0d-027edbc3222c",
    "url": "https://ats.rippling.com/sanas/jobs/ad6ba237-eb74-4755-bf0d-027edbc3222c",
    "name": "Member of Technical Staff (Data Acquisition)",
    "language": "en-US",
    "locations": [
      {
        "city": "Palo Alto",
        "name": "Palo Alto, CA",
        "state": "California",
        "country": "United States",
        "stateCode": "CA",
        "countryCode": "US",
        "workplaceType": "ON_SITE"
      }
    ],
    "department": {
      "name": "Science"
    }
  },
  "detail_job": {
    "url": "https://ats.rippling.com/sanas/jobs/ad6ba237-eb74-4755-bf0d-027edbc3222c",
    "name": "Member of Technical Staff (Data Acquisition)",
    "uuid": "ad6ba237-eb74-4755-bf0d-027edbc3222c",
    "board": {
      "logo": {
        "url": "https://secured-assets.ripplingcdn.com/us1/ats/6862bfbc77c6a4c5f95ae521/ats/eac70887e52a4dcfa07336c69c248ce0?Expires=1780575209&Signature=WcAJjKq8w9zZoa6reRZCkbauobhsCX1CDhcZLP-KceptzvkvhKO0O9jC7CUkm9j-TzL74ikXQP5wcQED5udVaL4B-66xxBKyU-wEiPl-rxcIyuUUriKinneFkLXFcsRHXzZ0eu7hrAcMWK~M9Yy4Dqx4zjbGgtS98NmN1gpsH~mV0AUsnBekzozQez1PQZ8-iIZb0l~UoiRut7CvLi8b2Zts6EdoIVoArYS0uxeQo~RaU5d02R0dIFm8QeL3jiIfIFY5SrdEFdVcCNIAqAoG-wO5QnIVzxk7YBjJOEEPP70pOQwyG4IasUImBBR6eDKfn7reRXbgTJJylzGgwbyc4Q__&Key-Pair-Id=K2SM3GXN9F9XGM",
        "name": "Sanas-Logo-Full-RGB-Black (1).png",
        "type": "image/png"
      },
      "slug": "sanas",
      "title": "Sanas",
      "banner": {
        "url": null,
        "name": "",
        "type": ""
      },
      "boardURL": "https://ats.rippling.com/sanas/jobs",
      "fontType": null,
      "subtitle": null,
      "boardType": "RIPPLING",
      "linkColor": null,
      "buttonColor": null,
      "legalNotice": null,
      "buttonTextColor": null,
      "noOpeningsMessage": null,
      "groupJobsByLocation": false,
      "showBoardLogoOnJobPost": true,
      "showCompanyInfoUnderJobPost": false
    },
    "createdOn": "2026-04-06T12:23:06.525000-07:00",
    "department": {
      "name": "Science",
      "base_department": "Science",
      "department_tree": [
        "Science"
      ]
    },
    "companyName": "Sanas.AI Inc",
    "description": {
      "role": "<meta><h2 style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;line-height:1.6;font-size:15pt;font-weight:600;letter-spacing:0.5px;margin-top:18px;margin-bottom:4px;padding-left:0px;\"><b><strong style=\"font-size:15pt;white-space:pre-wrap;\">About Sanas</strong></b></h2><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><span style=\"white-space:pre-wrap;\">Sanas is pioneering the future of human communication. Founded by a team of Stanford researchers and entrepreneurs with deep industry experience, Sanas has developed the world's first real-time speech AI platform capable of accent translation, noise cancellation, speech enhancement, cross-language communication, and more.</span></p><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><span style=\"white-space:pre-wrap;\">Sanas makes conversations clearer, more inclusive, and more effective, removing barriers that prevent people from being understood, regardless of accent, background noise, or native language.</span></p><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><span style=\"white-space:pre-wrap;\">Sanas is currently one of the fastest growing startups in Silicon Valley, growing from $16M to $50M ARR in 2025. The company's core business is profitable and is on track to end 2026 with &gt;$120M ARR. 
Our team combines deep expertise in model innovation and systems engineering with a design-minded product engineering culture to build and ship cutting-edge AI models and experiences — entirely in-house.</span></p><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><span style=\"white-space:pre-wrap;\">Sanas is a 180-strong team, established in 2020. In this short span, we've successfully secured over $100 million in funding. Our innovation has been supported by the industry's leading investors, including Insight Partners, Google Ventures, Quadrille Capital, General Catalyst, Quiet Capital, and other influential investors. Our reputation is further solidified by collaborations with numerous Fortune 100 companies. With Sanas, you're not just adopting a product; you're investing in the future of communication.</span></p><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><span style=\"white-space:pre-wrap;\">If you’re looking to have a significant role in roadmapping and driving technical directions, if you’re looking to deploy challenging and big ideas without much overhead or slowness, if you're looking to leave your mark on an ambitious, generational mission to change how the worlds thinks about speech + AI, then Sanas is a well-suited place for you.</span></p><h2 style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;line-height:1.6;font-size:15pt;font-weight:600;letter-spacing:0.5px;margin-top:18px;margin-bottom:4px;padding-left:0px;\"><b><strong style=\"font-size:15pt;white-space:pre-wrap;\">About the Role</strong></b></h2><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><span style=\"white-space:pre-wrap;\">Your 
mission is to build and operate the ingestion systems that turn the open web and large-scale audio sources into reliable, well-structured corpora for training Sanas's frontier speech models. You'll own the machinery that acquires, extracts, filters, versions, and delivers audio data to our training pipelines — and you'll work directly with our research scientists to close the loop between what we collect and how it moves model quality.</span></p><h2 style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;line-height:1.6;font-size:15pt;font-weight:600;letter-spacing:0.5px;margin-top:18px;margin-bottom:4px;padding-left:0px;\"><b><strong style=\"font-size:15pt;white-space:pre-wrap;\">Job Description</strong></b></h2><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><b><strong style=\"white-space:pre-wrap;\">Data acquisition &amp; ingestion</strong></b></p><ul data-pattern=\"discCircleSquare\" data-depth=\"1\" style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;margin:8px 0px;line-height:1.6;padding:0px 0px 0px 32px;list-style-type:disc;\"><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Own and lead engineering projects across the full data acquisition stack — web crawling, audio ingestion, source discovery, and dataset delivery to training pipelines.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Build and operate large-scale distributed crawling infrastructure capable of continuously discovering and ingesting audio at scale across languages, accents, domains, and recording environments.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Develop specialized crawlers for 
high-priority audio sources with source-specific extraction and normalization logic.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Run experiments to evaluate crawling strategies, extraction methods, and ingestion tradeoffs; analyze results to identify gaps, redundancy, and coverage improvements across speaker demographics and language pairs.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Build ingestion pipelines that scale reliably across large data campaigns, with automated audio quality filtering — SNR estimation, clipping detection, codec artifact identification — as a first-class pipeline stage.</span></li></ul><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><b><strong style=\"white-space:pre-wrap;\">Systems &amp; infrastructure</strong></b></p><ul data-pattern=\"discCircleSquare\" data-depth=\"1\" style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;margin:8px 0px;line-height:1.6;padding:0px 0px 0px 32px;list-style-type:disc;\"><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Design and deploy highly scalable distributed systems capable of handling petabytes of audio data — from raw acquisition through quality filtering, deduplication, segmentation, and versioned dataset generation.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Architect and implement indexing and search capabilities over large audio corpora — enabling fast lookup by language, speaker, acoustic condition, duration, and quality tier.</span></li><li style=\"font-size:11pt;margin:3px 
0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Build and maintain backend services for data storage, including key-value databases, metadata synchronization, and manifest management across dataset versions.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Deploy and operate acquisition infrastructure in a Kubernetes / Infrastructure-as-Code environment; perform routine system health checks and respond to production issues quickly.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Collaborate closely with data processing, architecture, and ML platform teams to ensure smooth data flow from acquisition through to training-ready outputs.</span></li></ul><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;\"><b><strong style=\"white-space:pre-wrap;\">Compliance &amp; data governance</strong></b></p><ul data-pattern=\"discCircleSquare\" data-depth=\"1\" style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;margin:8px 0px;line-height:1.6;padding:0px 0px 0px 32px;list-style-type:disc;\"><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Work closely with legal to handle compliance, data privacy, and licensing matters across all acquisition sources — maintaining a clear audit trail of provenance, permitted use, and commercial training rights for every dataset.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Enforce speaker consent documentation, GDPR requirements, robots.txt and ToS adherence, and audio retention policies across all ingestion pipelines.</span></li><li 
style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Manage relationships with third-party data vendors — writing precise acquisition briefs, evaluating quality on delivery, and ensuring sourced data meets Sanas's licensing and quality standards.</span></li></ul><h2 style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;line-height:1.6;font-size:15pt;font-weight:600;letter-spacing:0.5px;margin-top:18px;margin-bottom:4px;padding-left:0px;\"><b><strong style=\"font-size:15pt;white-space:pre-wrap;\">Qualifications</strong></b></h2><ul data-pattern=\"discCircleSquare\" data-depth=\"1\" style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;margin:8px 0px;line-height:1.6;padding:0px 0px 0px 32px;list-style-type:disc;\"><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">4+ years of experience in data engineering, ML data infrastructure, or backend systems engineering — with direct experience building large-scale data ingestion or crawling systems.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Strong Python and systems engineering skills — you build robust, maintainable infrastructure, not just one-off scripts.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Hands-on experience with distributed systems design: you've built systems that handle failure gracefully, scale horizontally, and recover cleanly.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Experience with web crawling infrastructure at scale including handling rate limiting, deduplication, and content extraction.</span></li><li style=\"font-size:11pt;margin:3px 
0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Proficiency with cloud platforms (AWS or GCP), object storage (S3/GCS), and container orchestration (Kubernetes).</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Comfort working with audio processing tooling — ffmpeg, librosa, torchaudio, sox — and experience handling large volumes of audio files.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Strong data quality instincts: you instrument pipelines, surface issues proactively, and treat data correctness with the same rigor as software correctness.</span></li></ul><h2 style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;line-height:1.6;font-size:15pt;font-weight:600;letter-spacing:0.5px;margin-top:18px;margin-bottom:4px;padding-left:0px;\"><b><strong style=\"font-size:15pt;white-space:pre-wrap;\">Bonus</strong></b></h2><ul data-pattern=\"discCircleSquare\" data-depth=\"1\" style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;margin:8px 0px;line-height:1.6;padding:0px 0px 0px 32px;list-style-type:disc;\"><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Experience building speech or audio datasets for ASR, TTS, speech enhancement, or speaker verification model training.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Familiarity with major open speech corpora — Common Voice, LibriSpeech, VoxPopuli, AISHELL — and their sourcing and quality characteristics.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Experience with data versioning tools.</span></li><li 
style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Background in multilingual or low-resource language data collection.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Experience with annotation and labeling platforms.</span></li><li style=\"font-size:11pt;margin:3px 0px;letter-spacing:0.25px;line-height:1.6;\"><span style=\"white-space:pre-wrap;\">Familiarity with speaker diarization, language identification, or automated audio quality estimation models used for data filtering at scale.</span></li></ul>",
      "company": "<meta><p style=\"font-family:&quot;Basel Grotesk&quot;,Arial,sans-serif;font-size:11pt;font-weight:400;line-height:1.6;letter-spacing:0.25px;margin:4px 0px;padding:0px;text-align:start;\"><span style=\"white-space:pre-wrap;\">Sanas is pioneering the future of human communication. Founded by a team of Stanford researchers and entrepreneurs with deep industry experience, Sanas has developed the world's first real-time speech AI platform capable of accent translation, noise cancellation, speech enhancement, cross-language communication, and more.</span><br><br><span style=\"white-space:pre-wrap;\">Sanas makes conversations clearer, more inclusive, and more effective, removing barriers that prevent people from being understood, regardless of accent, background noise, or native language.</span><br><br><span style=\"white-space:pre-wrap;\">Sanas is currently one of the fastest growing startups in Silicon Valley, growing from $16M to $50M ARR in 2025. The company's core business is profitable and is on track to end 2026 with &gt;$120M ARR. Our team combines deep expertise in model innovation and systems engineering with a design-minded product engineering culture to build and ship cutting-edge AI models and experiences — entirely in-house.</span><br><br><span style=\"white-space:pre-wrap;\">Sanas is a 180-strong team, established in 2020. In this short span, we've successfully secured over $100 million in funding. Our innovation has been supported by the industry's leading investors, including Insight Partners, Google Ventures, Quadrille Capital, General Catalyst, Quiet Capital, and other influential investors. Our reputation is further solidified by collaborations with numerous Fortune 100 companies. 
With Sanas, you're not just adopting a product; you're investing in the future of communication.</span><br><br><span style=\"white-space:pre-wrap;\">If you’re looking to have a significant role in roadmapping and driving technical directions, if you’re looking to deploy challenging and big ideas without much overhead or slowness, if you're looking to leave your mark on an ambitious, generational mission to change how the worlds thinks about speech + AI, then Sanas is a well-suited place for you.</span></p>"
    },
    "workLocations": [
      "Palo Alto, CA"
    ],
    "employmentType": {
      "id": "Salaried, full-time",
      "label": "SALARIED_FT"
    },
    "payRangeDetails": [],
    "activeJobApplication": {
      "basicQuestions": [
        {
          "oid": "first_name",
          "title": "First name",
          "required": true,
          "fieldType": "SHORT_ANSWER"
        },
        {
          "oid": "last_name",
          "title": "Last name",
          "required": true,
          "fieldType": "SHORT_ANSWER"
        },
        {
          "oid": "email",
          "title": "Email",
          "required": true,
          "fieldType": "SHORT_ANSWER"
        },
        {
          "oid": "pronouns",
          "title": "Pronouns",
          "required": false,
          "fieldType": "PRONOUN"
        },
        {
          "oid": "current_company",
          "title": "Current company",
          "required": false,
          "fieldType": "SHORT_ANSWER"
        },
        {
          "oid": "phone_number",
          "title": "Phone number",
          "required": true,
          "fieldType": "PHONE_NUMBER"
        },
        {
          "oid": "location",
          "title": "Location (city only)",
          "required": true,
          "fieldType": "SHORT_ANSWER"
        },
        {
          "oid": "linkedin_link",
          "title": "LinkedIn link",
          "required": false,
          "fieldType": "SHORT_ANSWER"
        },
        {
          "oid": "resume",
          "title": "Resume",
          "required": true,
          "fieldType": "FILE"
        },
        {
          "oid": "cover_letter",
          "title": "Cover letter",
          "required": false,
          "fieldType": "FILE"
        }
      ],
      "customQuestions": {
        "fields": [
          {
            "oid": "first_name",
            "title": "First name",
            "required": true,
            "fieldData": {},
            "fieldType": "SHORT_ANSWER"
          },
          {
            "oid": "last_name",
            "title": "Last name",
            "required": true,
            "fieldData": {},
            "fieldType": "SHORT_ANSWER"
          },
          {
            "oid": "email",
            "title": "Email",
            "required": true,
            "fieldData": {},
            "fieldType": "SHORT_ANSWER"
          },
          {
            "oid": "pronouns",
            "title": "Pronouns",
            "required": false,
            "fieldData": {},
            "fieldType": "PRONOUN"
          },
          {
            "oid": "current_company",
            "title": "Current company",
            "required": false,
            "fieldData": {},
            "fieldType": "SHORT_ANSWER"
          },
          {
            "oid": "phone_number",
            "title": "Phone number",
            "required": true,
            "fieldData": {},
            "fieldType": "PHONE_NUMBER"
          },
          {
            "oid": "location",
            "title": "Location (city only)",
            "required": true,
            "fieldData": {},
            "fieldType": "SHORT_ANSWER"
          },
          {
            "oid": "linkedin_link",
            "title": "LinkedIn link",
            "required": false,
            "fieldData": {},
            "fieldType": "SHORT_ANSWER"
          },
          {
            "oid": "resume",
            "title": "Resume",
            "required": true,
            "fieldData": {},
            "fieldType": "FILE"
          },
          {
            "oid": "cover_letter",
            "title": "Cover letter",
            "required": false,
            "fieldData": {},
            "fieldType": "FILE"
          }
        ]
      },
      "additionalQuestions": null
    },
    "hasAIEvaluationsEnabled": false,
    "eeocQuestionnaireEnabled": true,
    "applicationConfirmationTemplate": "68c1a1ae94b69622be8d48ff",
    "eeocQuestionnaireEnabledForJobPost": true
  },
  "detail_meta": {
    "url": "https://ats.rippling.com/api/v2/board/sanas/jobs/ad6ba237-eb74-4755-bf0d-027edbc3222c",
    "http_status": 200,
    "content_type": "application/json",
    "response_bytes": 20039
  },
  "detail_errors": []
}

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/149e01a361cf423bf3a92184109f3f00d34f69a4?include=description

GET https://api.bluedoor.sh/job-postings/v1/orgs/83ad35d8-903f-4812-a8a1-7e0502248692

GET https://api.bluedoor.sh/job-postings/v1/sources/1fc1335f-581e-4138-ae2e-6e3d6c790876

GET https://api.bluedoor.sh/job-postings/v1/jobs/149e01a361cf423bf3a92184109f3f00d34f69a4/events
