Software Engineer, Distributed Data Systems
Exa · San Francisco, California · On Site · Active · Ashby
Job facts
| Field | Value |
|---|---|
| Company | Exa |
| Title | Software Engineer, Distributed Data Systems |
| Normalized title | - |
| Department / team | Engineering / Engineering |
| Location | San Francisco, CA, United States |
| Work model | On Site |
| Employment type | Full Time |
| Salary | - |
| Status | active |
| ATS provider | Ashby |
| Posted / first seen | — / 2026-05-29 |
| Changed / last seen | 2026-05-29 / 2026-06-06 |
Linked records
| Field | Value |
|---|---|
| Company | Exa |
| Source | a41a85b8-7172-4ac4-ab16-297b0b01a649 |
| ATS provider | Ashby |
Description
We raised a $250M Series C to build the search engine for AIs. Led by a16z, with existing investors Benchmark, Lightspeed, and YC doubling down, the round brings Exa's valuation to $2.2 billion.
Exa is building a search engine from scratch to serve every AI agent. We build massive-scale infrastructure to crawl the web, train state-of-the-art embedding models to process it, and design high-performance vector databases in Rust to search over it. If you like compute, we also own a $5M H200 GPU cluster (soon to grow 5x) and regularly spin up batch jobs with tens of thousands of machines.
As a Data Engineer, you'll architect and build the data infrastructure that powers everything we do—from crawling billions of pages to training our embedding models to serving real-time search. You'll have enormous autonomy in designing systems that scale to hundreds of petabytes. If you've ever wanted to build data pipelines at a scale that most companies only dream about, this is your chance.
Who You Are
Deep understanding of lakehouse architectures (Delta Lake, Iceberg, Hudi) and when to use them
Experience building and operating large-scale distributed data processing pipelines
Hands-on experience with streaming data systems (Kafka, Flink, or similar)
Familiarity with Ray, Spark, or ClickHouse at production scale
An obsessive focus on reliability and building systems that don't page you at 3am
Bonus
Experience with Lance or other vector-native storage formats
Background in GPU-accelerated data processing (RAPIDS, cuDF)
What You Could Do
Design a lakehouse architecture that handles 100+ PB of web crawl data
Build streaming pipelines that process billions of documents per day for real-time indexing
Architect the data layer for our embedding training infrastructure on Ray
Scale our ClickHouse deployment to handle analytical queries across petabytes of search logs
This is an in-person opportunity in San Francisco. We're happy to sponsor international candidates (e.g., STEM OPT, OPT, H1B, O1, E3). In addition to premium healthcare benefits (medical, dental, vision), we also offer fertility benefits and a monthly wellness stipend to all of our employees.
Full job record
| Field | Value |
|---|---|
| Job ID | a7f3993205d769af3a3f73cf5bbfdca407dfdd09 |
| Org ID | 978b0118-ecbb-43e5-8196-7caeacb44a42 |
| Source ID | a41a85b8-7172-4ac4-ab16-297b0b01a649 |
| Board ID | a41a85b8-7172-4ac4-ab16-297b0b01a649 |
| Provider | ashby |
| Provider Job Key | 5f10cc28-bcad-4ce4-8fa1-996a59c59e65 |
| Title | Software Engineer, Distributed Data Systems |
| Normalized Title | — |
| Status | active |
| Active | yes |
| Location Text | San Francisco, California |
| Department | Engineering |
| Team | Engineering |
| Employment Type | full_time |
| Workplace Type | on_site |
| Remote Policy | — |
| Country | United States |
| Region | CA |
| City | San Francisco |
| Salary Raw | — |
| Salary Min | — |
| Salary Max | — |
| Salary Currency | — |
| Salary Period | — |
| Source URL | https://jobs.ashbyhq.com/exa/5f10cc28-bcad-4ce4-8fa1-996a59c59e65 |
| Apply URL | https://jobs.ashbyhq.com/exa/5f10cc28-bcad-4ce4-8fa1-996a59c59e65/application |
| First Seen At | 2026-05-29 06:34:26Z |
| Last Seen At | 2026-06-06 09:26:16Z |
| Last Checked At | 2026-06-06 09:26:16Z |
| Last Changed At | 2026-05-29 06:34:26Z |
| Inactive At | — |
| Source Posted At | — |
| Source Updated At | — |
| Raw Payload Uri | s3://job-postings-prod-raw-590183727216/raw/provider=ashby/board=exa/date=2026-06-06/2026-06-06T09-25-53-505Z-709b71b576465835af1f2118ca5b64263512606c55097ae510550297cfea65a5.json |
Event Fields
{
  "content_hash": "62a3326cc56ebc9e030a8ddde8b4a53fea3cdf19aa74010488addc84e2beefa4",
  "source_hash": "94c82b9a6897fc92608635499aa42a9a8d130c3be1cdcaeff5b29041e850e4b0",
  "last_changed_at": "2026-05-29T06:34:26.068Z",
  "active_status": "active"
}
Parsed Structured
{
  "language": "en",
  "location": {
    "raw": "San Francisco, California",
    "city": "San Francisco",
    "region": "CA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.85
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-06T09:26:16.781Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "San Francisco, California",
      "city": "San Francisco",
      "region": "CA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.85
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": null,
  "salary_period": null,
  "workplace_type": "on_site",
  "salary_currency": null
}
Extensions
{}
Native Structured
{
  "id": "5f10cc28-bcad-4ce4-8fa1-996a59c59e65",
  "team": "Engineering",
  "title": "Software Engineer, Distributed Data Systems",
  "jobUrl": "https://jobs.ashbyhq.com/exa/5f10cc28-bcad-4ce4-8fa1-996a59c59e65",
  "address": null,
  "applyUrl": "https://jobs.ashbyhq.com/exa/5f10cc28-bcad-4ce4-8fa1-996a59c59e65/application",
  "isListed": true,
  "isRemote": false,
  "location": "San Francisco, California",
  "updatedAt": null,
  "apiVersion": "ashby-non-user-graphql-v1",
  "department": "Engineering",
  "publishedAt": null,
  "workplaceType": "OnSite",
  "employmentType": "FullTime",
  "secondaryLocations": []
}
Get this page with API
Rendered from the bluedoor Job Postings API. Reproduce it:
GET https://api.bluedoor.sh/job-postings/v1/jobs/a7f3993205d769af3a3f73cf5bbfdca407dfdd09?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/978b0118-ecbb-43e5-8196-7caeacb44a42
GET https://api.bluedoor.sh/job-postings/v1/sources/a41a85b8-7172-4ac4-ab16-297b0b01a649
GET https://api.bluedoor.sh/job-postings/v1/jobs/a7f3993205d769af3a3f73cf5bbfdca407dfdd09/events
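The four GET requests above could be scripted roughly as follows. This is a minimal sketch using only the Python standard library: the function names `record_urls` and `fetch_json` are illustrative, and it assumes the endpoints return JSON and need no authentication, which this page does not confirm. If the API requires a key, the request headers would need to change.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Base path taken from the URLs listed on this page.
BASE_URL = "https://api.bluedoor.sh/job-postings/v1"


def record_urls(job_id: str, org_id: str, source_id: str) -> dict:
    """Build the four URLs shown above for one job record (hypothetical helper)."""
    job_url = f"{BASE_URL}/jobs/{job_id}"
    return {
        "job": job_url + "?" + urlencode({"include": "description"}),
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{job_url}/events",
    }


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the body as JSON.

    Assumption: no auth header is needed; add one here if the API requires it.
    """
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `record_urls` with the Job ID, Org ID, and Source ID from the full job record yields the same four URLs as listed; passing any of them to `fetch_json` performs the request.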