Engineer, Supercomputing & Distributed Systems
Krea · San Francisco · On Site · Active · Ashby
Job facts
| Field | Value |
|---|---|
| Company | Krea |
| Title | Engineer, Supercomputing & Distributed Systems |
| Normalized title | - |
| Department / team | Engineering / Engineering |
| Location | San Francisco, CA, United States |
| Work model | On Site |
| Employment type | Full Time |
| Salary | - |
| Status | active |
| ATS provider | Ashby |
| Posted / first seen | — / 2026-05-29 |
| Changed / last seen | 2026-05-29 / 2026-06-06 |
Linked records
| Field | Value |
|---|---|
| Company | Krea |
| Source | 6a2d95c1-71fc-45c5-a54a-c6ab1c783f67 |
| ATS provider | Ashby |
Description
About Krea
At Krea, we are building next-generation AI creative tools.
We are dedicated to making AI intuitive and controllable for creatives. Our mission is to build tools that empower human creativity, not replace it.
We believe AI is a new medium that allows us to express ourselves through various formats—text, images, video, sound, and even 3D. We're building better, smarter, and more controllable tools to harness this medium.
Supercomputing / AI Infra at Krea
We build and operate the infrastructure for Krea's research and inference. This covers distributed training, 1000+ GPU Kubernetes clusters, and petabyte-scale data pipelines. We build a lot of this from scratch: custom distributed datastores, job orchestration systems, and streaming pipelines that replace tools like Kafka and Ray for modern AI workloads at scale.
Example projects:
Distributed data systems
- Design multi-stage pipelines that turn petabytes of raw data into clean, annotated datasets
- Run classification models on billions of images
- Deploy and combine LLMs to caption massive multimedia data
GPU infrastructure
- Manage distributed training and inference on 1000+ GPU Kubernetes clusters
- Solve orchestration and scaling for large-scale GPU job processing
- Scale workloads and research between clusters in multiple datacenters
Distributed training
- Profile and optimize dataloaders streaming thousands of images per second
- Profile and debug InfiniBand networking on huge training runs
- Build fault tolerance systems for large-scale pretraining
- Collaborate with researchers on evolving RL infrastructure
Applied ML pipelines
- Find clean scenes in millions of videos using distributed shot-boundary detection
- Customize and train models to filter billions of images for questions like "is this a screenshot?"
Build the systems that bridge raw cluster capacity and research output
Who we're looking for:
Systems people. If you've read a blog post about InfiniBand debugging or building a custom distributed database and thought "I want to do that" — this is that team.
You'll spend your time working heavily with Python, Kubernetes, Torch, and data tools like DuckDB, Arrow, etc. It's OK if you don't have K8s or ML experience — the main thing we hire for is an intuition for distributed systems, and a great mental model of how systems interact and function under different conditions.
Strong candidates may have experience with…
- Python, PyArrow, DuckDB, SQL, massive relational databases, PyTorch, Pandas, NumPy…
- Kubernetes
- Designing and implementing large-scale ETL systems
- Fundamental knowledge of containerization, operating systems, file systems, and networking
- Distributed systems design
- Distributed training systems (NCCL, InfiniBand, RDMA)
- Streaming and event processing systems (Kafka, Pulsar, or similar)
- PyTorch internals, custom dataloaders, and training infrastructure
Full job record
| Field | Value |
|---|---|
| Job ID | 30fd11c7277fa4fad7ae32fe543fe110f72e2379 |
| Org ID | 24c54dc4-b6c5-41db-90a3-7bcef02a3333 |
| Source ID | 6a2d95c1-71fc-45c5-a54a-c6ab1c783f67 |
| Board ID | 6a2d95c1-71fc-45c5-a54a-c6ab1c783f67 |
| Provider | ashby |
| Provider Job Key | ebe94024-eef6-4306-a019-10072ad0f4c9 |
| Title | Engineer, Supercomputing & Distributed Systems |
| Normalized Title | — |
| Status | active |
| Active | yes |
| Location Text | San Francisco |
| Department | Engineering |
| Team | Engineering |
| Employment Type | full_time |
| Workplace Type | on_site |
| Remote Policy | — |
| Country | United States |
| Region | CA |
| City | San Francisco |
| Salary Raw | — |
| Salary Min | — |
| Salary Max | — |
| Salary Currency | — |
| Salary Period | — |
| Source URL | https://jobs.ashbyhq.com/krea/ebe94024-eef6-4306-a019-10072ad0f4c9 |
| Apply URL | https://jobs.ashbyhq.com/krea/ebe94024-eef6-4306-a019-10072ad0f4c9/application |
| First Seen At | 2026-05-29 05:49:34Z |
| Last Seen At | 2026-06-06 20:37:53Z |
| Last Checked At | 2026-06-06 20:37:53Z |
| Last Changed At | 2026-05-29 05:49:34Z |
| Inactive At | — |
| Source Posted At | — |
| Source Updated At | — |
| Raw Payload Uri | s3://job-postings-prod-raw-590183727216/raw/provider=ashby/board=krea/date=2026-06-06/2026-06-06T20-37-51-606Z-4861fb8026c5f4375b95755ad5a1e992b57fd73f713b072d6ae1d14928789581.json |
Event Fields
{
"content_hash": "334881b5852d8d4fbb79d9c5140996a950202eb8d842bc62f192ca5d286d4375",
"source_hash": "abddc1085801643a9274dd65eba5218e7b65ebbbb76ac777156aed704da92965",
"last_changed_at": "2026-05-29T05:49:34.085Z",
"active_status": "active"
}
Parsed Structured
{
"language": "en",
"location": {
"raw": "San Francisco",
"city": "San Francisco",
"region": "CA",
"country": "United States",
"is_remote": false,
"confidence": 0.75
},
"salary_max": null,
"salary_min": null,
"inferred_at": "2026-06-06T20:37:53.135Z",
"launch_scope": {
"reason": "english_us_canada",
"included": true,
"language": "en",
"location": {
"raw": "San Francisco",
"city": "San Francisco",
"region": "CA",
"country": "United States",
"is_remote": false,
"confidence": 0.75
},
"countries": [
"United States"
]
},
"remote_policy": null,
"salary_period": null,
"workplace_type": "on_site",
"salary_currency": null
}
Extensions
{}
Native Structured
{
"id": "ebe94024-eef6-4306-a019-10072ad0f4c9",
"team": "Engineering",
"title": "Engineer, Supercomputing & Distributed Systems",
"jobUrl": "https://jobs.ashbyhq.com/krea/ebe94024-eef6-4306-a019-10072ad0f4c9",
"address": null,
"applyUrl": "https://jobs.ashbyhq.com/krea/ebe94024-eef6-4306-a019-10072ad0f4c9/application",
"isListed": true,
"isRemote": false,
"location": "San Francisco",
"updatedAt": null,
"apiVersion": "ashby-non-user-graphql-v1",
"department": "Engineering",
"publishedAt": null,
"workplaceType": "OnSite",
"employmentType": "FullTime",
"secondaryLocations": []
}
Get this page with API
Rendered from the bluedoor Job Postings API. Reproduce it:
GET https://api.bluedoor.sh/job-postings/v1/jobs/30fd11c7277fa4fad7ae32fe543fe110f72e2379?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/24c54dc4-b6c5-41db-90a3-7bcef02a3333
GET https://api.bluedoor.sh/job-postings/v1/sources/6a2d95c1-71fc-45c5-a54a-c6ab1c783f67
GET https://api.bluedoor.sh/job-postings/v1/jobs/30fd11c7277fa4fad7ae32fe543fe110f72e2379/events
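As a rough illustration, the four GET requests listed above could be issued from Python roughly as follows. This is a minimal sketch, not an official client: the helper names `page_urls` and `fetch_json` are hypothetical, and it assumes the endpoints return JSON and need no authentication, which this page does not confirm. The IDs are copied from the "Full job record" table.

```python
import json
import urllib.request

BASE_URL = "https://api.bluedoor.sh/job-postings/v1"

# IDs copied from the "Full job record" table on this page.
JOB_ID = "30fd11c7277fa4fad7ae32fe543fe110f72e2379"
ORG_ID = "24c54dc4-b6c5-41db-90a3-7bcef02a3333"
SOURCE_ID = "6a2d95c1-71fc-45c5-a54a-c6ab1c783f67"


def page_urls(job_id, org_id, source_id):
    """Return the four GET URLs shown on this page, keyed by a short label."""
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?include=description",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


def fetch_json(url, timeout=10.0):
    """GET a URL and decode the response body as JSON.

    Assumption (not confirmed by this page): the endpoint answers without
    authentication and returns a JSON body.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URLs; call fetch_json(url) on each to hit the network.
    for label, url in page_urls(JOB_ID, ORG_ID, SOURCE_ID).items():
        print(label, url)
```

Keeping URL construction separate from the network call makes it easy to swap in a different job, org, or source ID, or to add an auth header if the API turns out to require one.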