Synthetic Data Engineer (AI Data/Training)
Hyphen Connect · Boston, USA · Active · Greenhouse
Job facts
| Field | Value |
|---|---|
| Company | Hyphen Connect |
| Title | Synthetic Data Engineer (AI Data/Training) |
| Normalized title | - |
| Department / team | Engineering |
| Location | United States |
| Work model | - |
| Employment type | - |
| Salary | - |
| Status | active |
| ATS provider | Greenhouse |
| Posted / first seen | 2026-04-24 / 2026-05-29 |
| Changed / last seen | 2026-05-29 / 2026-06-06 |
Linked records
| Field | Value |
|---|---|
| Company | Hyphen Connect |
| Source | 8c23f81b-aec0-450e-b33a-ce033b97ca6f |
| ATS provider | Greenhouse |
Description
We are seeking a talented and innovative Synthetic Data Engineer. In this role, you will design and implement domain-specific synthetic data generation pipelines, ensuring high-quality data management for training loops. Your expertise will drive the success of data processing and model training within the organization.
Responsibilities:
Design domain-specific synthetic data generation (SDG) pipelines via self-instruct and constitutional prompting.
Implement automated quality scoring and de-duplication systems.
Manage data pipelines that feed directly into SFT and DPO training loops.
Qualifications:
Proven experience building large-scale data pipelines (Airflow, Spark, Ray).
Deep knowledge of prompt engineering for data generation.
Familiarity with dataset distillation and bias mitigation.
Full job record
| Field | Value |
|---|---|
| Job ID | fa90629a303e58ba907ce25dbbef272edde57286 |
| Org ID | dc3160c9-fdba-42a9-af23-1bf5a6168ee9 |
| Source ID | 8c23f81b-aec0-450e-b33a-ce033b97ca6f |
| Board ID | 8c23f81b-aec0-450e-b33a-ce033b97ca6f |
| Provider | greenhouse |
| Provider Job Key | 5120020007 |
| Title | Synthetic Data Engineer (AI Data/Training) |
| Normalized Title | — |
| Status | active |
| Active | yes |
| Location Text | Boston, USA |
| Department | Engineering |
| Team | — |
| Employment Type | — |
| Workplace Type | — |
| Remote Policy | — |
| Country | United States |
| Region | — |
| City | — |
| Salary Raw | — |
| Salary Min | — |
| Salary Max | — |
| Salary Currency | — |
| Salary Period | — |
| Source URL | https://job-boards.greenhouse.io/hyphenconnect/jobs/5120020007 |
| Apply URL | https://job-boards.greenhouse.io/hyphenconnect/jobs/5120020007 |
| First Seen At | 2026-05-29 22:42:08Z |
| Last Seen At | 2026-06-06 07:33:56Z |
| Last Checked At | 2026-06-06 07:33:56Z |
| Last Changed At | 2026-05-29 22:42:08Z |
| Inactive At | — |
| Source Posted At | 2026-04-24 14:19:14Z |
| Source Updated At | 2026-04-24 14:19:14Z |
| Raw Payload Uri | s3://job-postings-prod-raw-590183727216/raw/provider=greenhouse/board=hyphenconnect/date=2026-06-06/2026-06-06T07-33-55-771Z-f9564d516ac0e0c5cc4262a402d228a84dc202fa42babe8802338f1a9f0261b7.json |
Event Fields
{
"content_hash": "41615fc35b0c63b6ee326608eb352974aa5844654b92bc751c45d106799a9c58",
"source_hash": "78dfa0bf89a386de372a2323b757acef49cdece559bff657d2781556b0bf4004",
"last_changed_at": "2026-05-29T22:42:08.378Z",
"active_status": "active"
}
Parsed Structured
{
"language": "en",
"location": {
"raw": "Boston, USA",
"city": null,
"region": null,
"country": "United States",
"is_remote": false,
"confidence": 0.95
},
"salary_max": null,
"salary_min": null,
"inferred_at": "2026-06-06T07:33:56.357Z",
"launch_scope": {
"reason": "english_us_canada",
"included": true,
"language": "en",
"location": {
"raw": "Boston, USA",
"city": null,
"region": null,
"country": "United States",
"is_remote": false,
"confidence": 0.95
},
"countries": [
"United States"
]
},
"remote_policy": null,
"salary_period": null,
"workplace_type": null,
"salary_currency": null
}
Extensions
{}
Native Structured
{
"title": "Synthetic Data Engineer (AI Data/Training)",
"offices": [
{
"id": 4038286007,
"name": " United States",
"location": " United States",
"child_ids": [],
"parent_id": null
}
],
"language": "en",
"location": {
"name": "Boston, USA"
},
"metadata": [],
"updated_at": "2026-04-24T10:19:14-04:00",
"departments": [
{
"id": 4021619007,
"name": "Engineering",
"child_ids": [],
"parent_id": null
}
],
"company_name": "Hyphen Connect Limited",
"requisition_id": 4623874007,
"first_published": "2026-04-24T10:19:14-04:00",
"application_deadline": null
}
Get this page with API
Rendered from the bluedoor Job Postings API. Reproduce it:
GET https://api.bluedoor.sh/job-postings/v1/jobs/fa90629a303e58ba907ce25dbbef272edde57286?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/dc3160c9-fdba-42a9-af23-1bf5a6168ee9
GET https://api.bluedoor.sh/job-postings/v1/sources/8c23f81b-aec0-450e-b33a-ce033b97ca6f
GET https://api.bluedoor.sh/job-postings/v1/jobs/fa90629a303e58ba907ce25dbbef272edde57286/events
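The four requests above can be scripted. The following is a minimal sketch, not an official client: it assumes the endpoints answer a plain GET with a JSON body, and the page does not say whether an API key or other authentication header is required. The names `page_urls` and `fetch_json` and the dictionary labels are hypothetical and chosen only for illustration.

```python
import json
import urllib.request

# Base path taken from the URLs listed on this page.
BASE_URL = "https://api.bluedoor.sh/job-postings/v1"

# Identifiers copied from the job record above.
JOB_ID = "fa90629a303e58ba907ce25dbbef272edde57286"
ORG_ID = "dc3160c9-fdba-42a9-af23-1bf5a6168ee9"
SOURCE_ID = "8c23f81b-aec0-450e-b33a-ce033b97ca6f"


def page_urls(job_id, org_id, source_id):
    """Build the four URLs shown on this page, keyed by a hypothetical label."""
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?include=description",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


def fetch_json(url, timeout=10.0):
    """GET a URL and decode the body as JSON.

    Assumption: no authentication is needed. If the API requires a key,
    add the appropriate header here; this page does not document one.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Print the URLs only; call fetch_json(url) on each to retrieve the data.
    for label, url in page_urls(JOB_ID, ORG_ID, SOURCE_ID).items():
        print(label, url)
```

Keeping URL construction separate from the network call makes it possible to check the request targets without touching the API.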