LLM Pre-training & Distributed Engineer (AI Infrastructure)
Hyphen Connect · Boston, USA · Active · Greenhouse
Job facts
| Field | Value |
|---|---|
| Company | Hyphen Connect |
| Title | LLM Pre-training & Distributed Engineer (AI Infrastructure) |
| Normalized title | - |
| Department / team | Engineering |
| Location | United States |
| Work model | - |
| Employment type | - |
| Salary | - |
| Status | active |
| ATS provider | Greenhouse |
| Posted / first seen | 2026-04-24 / 2026-05-29 |
| Changed / last seen | 2026-05-29 / 2026-06-06 |
Linked records
| Field | Value |
|---|---|
| Company | Hyphen Connect |
| Source | 8c23f81b-aec0-450e-b33a-ce033b97ca6f |
| ATS provider | Greenhouse |
Description
We are seeking a highly skilled LLM Pre-training & Distributed Systems Engineer. This role is essential for orchestrating large-scale machine learning training runs and optimizing distributed infrastructure. The ideal candidate will have a deep understanding of GPU clusters and extensive experience in system engineering to ensure efficient and reliable training processes.
Responsibilities:
- Orchestrate distributed training runs across 1,000+ GPUs using PyTorch, DeepSpeed, or Megatron-LM.
- Optimize networking (InfiniBand/RDMA) and memory management to prevent out-of-memory errors.
- Automate checkpointing and failure recovery during month-long training runs.

Required Skills:
- Deep expertise in 3D parallelism (Data, Tensor, Pipeline).
- Experience managing SLURM or Kubernetes-based GPU clusters.
- Strong systems engineering background (C++, CUDA, Python).
Full job record
| Field | Value |
|---|---|
| Job ID | ed883a5903331c0634fc1743527eaf313cc043d8 |
| Org ID | dc3160c9-fdba-42a9-af23-1bf5a6168ee9 |
| Source ID | 8c23f81b-aec0-450e-b33a-ce033b97ca6f |
| Board ID | 8c23f81b-aec0-450e-b33a-ce033b97ca6f |
| Provider | greenhouse |
| Provider Job Key | 5119989007 |
| Title | LLM Pre-training & Distributed Engineer (AI Infrastructure) |
| Normalized Title | — |
| Status | active |
| Active | yes |
| Location Text | Boston, USA |
| Department | Engineering |
| Team | — |
| Employment Type | — |
| Workplace Type | — |
| Remote Policy | — |
| Country | United States |
| Region | — |
| City | — |
| Salary Raw | — |
| Salary Min | — |
| Salary Max | — |
| Salary Currency | — |
| Salary Period | — |
| Source URL | https://job-boards.greenhouse.io/hyphenconnect/jobs/5119989007 |
| Apply URL | https://job-boards.greenhouse.io/hyphenconnect/jobs/5119989007 |
| First Seen At | 2026-05-29 22:42:08Z |
| Last Seen At | 2026-06-06 07:33:56Z |
| Last Checked At | 2026-06-06 07:33:56Z |
| Last Changed At | 2026-05-29 22:42:08Z |
| Inactive At | — |
| Source Posted At | 2026-04-24 14:08:21Z |
| Source Updated At | 2026-04-24 14:08:21Z |
| Raw Payload Uri | s3://job-postings-prod-raw-590183727216/raw/provider=greenhouse/board=hyphenconnect/date=2026-06-06/2026-06-06T07-33-55-771Z-f9564d516ac0e0c5cc4262a402d228a84dc202fa42babe8802338f1a9f0261b7.json |
Event Fields
{
"content_hash": "deb4e4dabd427380bf9fd9a7b4a145822f217f9b34dbfa8c06d209215c274a5e",
"source_hash": "75170253d24fd262a616c43a990e379ceb0e451bcb7c592e24fe47cccdb4f12e",
"last_changed_at": "2026-05-29T22:42:08.378Z",
"active_status": "active"
}
Parsed Structured
{
"language": "en",
"location": {
"raw": "Boston, USA",
"city": null,
"region": null,
"country": "United States",
"is_remote": false,
"confidence": 0.95
},
"salary_max": null,
"salary_min": null,
"inferred_at": "2026-06-06T07:33:56.284Z",
"launch_scope": {
"reason": "english_us_canada",
"included": true,
"language": "en",
"location": {
"raw": "Boston, USA",
"city": null,
"region": null,
"country": "United States",
"is_remote": false,
"confidence": 0.95
},
"countries": [
"United States"
]
},
"remote_policy": null,
"salary_period": null,
"workplace_type": null,
"salary_currency": null
}
Extensions
{}
Native Structured
{
"title": "LLM Pre-training & Distributed Engineer (AI Infrastructure)",
"offices": [
{
"id": 4038286007,
"name": " United States",
"location": " United States",
"child_ids": [],
"parent_id": null
}
],
"language": "en",
"location": {
"name": "Boston, USA"
},
"metadata": [],
"updated_at": "2026-04-24T10:08:21-04:00",
"departments": [
{
"id": 4021619007,
"name": "Engineering",
"child_ids": [],
"parent_id": null
}
],
"company_name": "Hyphen Connect Limited",
"requisition_id": 4623868007,
"first_published": "2026-04-24T10:08:21-04:00",
"application_deadline": null
}
Get this page with API
Rendered from the bluedoor Job Postings API. Reproduce it with the following requests, each labeled JSON on the page:
GET https://api.bluedoor.sh/job-postings/v1/jobs/ed883a5903331c0634fc1743527eaf313cc043d8?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/dc3160c9-fdba-42a9-af23-1bf5a6168ee9
GET https://api.bluedoor.sh/job-postings/v1/sources/8c23f81b-aec0-450e-b33a-ce033b97ca6f
GET https://api.bluedoor.sh/job-postings/v1/jobs/ed883a5903331c0634fc1743527eaf313cc043d8/events
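The four requests above can be assembled from the job, org, and source IDs in the record. The following is a minimal sketch using only the Python standard library. The helper names `build_urls` and `fetch_json` are hypothetical, and the sketch assumes the endpoints return JSON and need no authentication, which this page does not confirm.

```python
import json
import urllib.request

# Base path taken from the GET lines on this page.
BASE_URL = "https://api.bluedoor.sh/job-postings/v1"

# IDs copied from the "Full job record" table.
JOB_ID = "ed883a5903331c0634fc1743527eaf313cc043d8"
ORG_ID = "dc3160c9-fdba-42a9-af23-1bf5a6168ee9"
SOURCE_ID = "8c23f81b-aec0-450e-b33a-ce033b97ca6f"


def build_urls(job_id: str, org_id: str, source_id: str) -> dict:
    """Return the four request URLs listed on this page (hypothetical helper)."""
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?include=description",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the body as JSON.

    Assumption: no API key or auth header is required. If the API does
    require one, add it to the headers dict below.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URLs performs no network access.
URLS = build_urls(JOB_ID, ORG_ID, SOURCE_ID)
```

Calling `fetch_json(URLS["job"])` would issue the first request. Nothing is fetched until that call is made, so the URL construction can be checked offline.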