GPU Performance Engineer
Genmo · San Francisco HQ · Active · Ashby
Job facts
| Field | Value |
|---|---|
| Company | Genmo |
| Title | GPU Performance Engineer |
| Normalized title | - |
| Department / team | Engineering / Engineering |
| Location | San Francisco, CA, United States |
| Work model | - |
| Employment type | Full Time |
| Salary | - |
| Status | active |
| ATS provider | Ashby |
| Posted / first seen | — / 2026-05-29 |
| Changed / last seen | 2026-05-29 / 2026-06-06 |
Linked records
| Field | Value |
|---|---|
| Company | Genmo |
| Source | 95f01483-808c-427c-b63f-d8f0de3f65ee |
| ATS provider | Ashby |
Description
We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI. Join us in shaping the future of AI and pushing the boundaries of what's possible in video generation.
We're seeking a GPU Performance Engineer to squeeze every last FLOP from our H100 infrastructure and optimize our model serving stack to its absolute limits.
The Role
You'll be our performance optimization expert, using advanced profiling tools to identify bottlenecks and implementing solutions that achieve 5-10x speedups. From writing custom CUDA kernels to eliminating cold start latency, you'll ensure our infrastructure delivers world-class performance. This role is perfect for someone who gets excited about microsecond optimizations and pushing hardware to its theoretical limits.
Key Responsibilities
- Profile and optimize GPU workloads using Nsight Systems, nvprof, and custom instrumentation
- Write high-performance CUDA and Triton kernels for critical model operations
- Optimize cold start latency from seconds to milliseconds for our serving infrastructure
- Tune memory access patterns, kernel fusion, and GPU utilization
- Collaborate with ML engineers to optimize model implementations
- Debug performance issues across the full stack from application to hardware
- Implement custom memory pooling and allocation strategies
- Share optimization techniques and build performance culture across teams
Qualifications
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field
- 5+ years systems programming experience with 3+ years focused on GPU optimization
- Expert proficiency with GPU profiling tools (Nsight Systems, nvprof)
- Strong CUDA programming skills with production kernel development
- Deep understanding of GPU architecture (memory hierarchy, SMs, warps)
- Track record of achieving significant performance improvements (5-10x)
- Experience with Python and C++ in production environments
We Value
- Experience with Triton kernel development
- Knowledge of CUTLASS or similar high-performance libraries
- Background in ML-specific optimizations (attention, transformers)
- RDMA/InfiniBand optimization experience
- Contributions to GPU libraries or frameworks
- Low-level debugging skills (PTX/SASS reading)
Genmo is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law. Genmo, Inc. is an E-Verify company and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.
Full job record
| Field | Value |
|---|---|
| Job ID | f85b5306fc5e5921b1e75362f7fff5b9cd354c2e |
| Org ID | 0b5883cc-6283-4b7e-96e3-13f845f36c2f |
| Source ID | 95f01483-808c-427c-b63f-d8f0de3f65ee |
| Board ID | 95f01483-808c-427c-b63f-d8f0de3f65ee |
| Provider | ashby |
| Provider Job Key | 14aaaa71-7c0c-4352-9c6c-bf3f12f3aeb5 |
| Title | GPU Performance Engineer |
| Normalized Title | — |
| Status | active |
| Active | yes |
| Location Text | San Francisco HQ |
| Department | Engineering |
| Team | Engineering |
| Employment Type | full_time |
| Workplace Type | — |
| Remote Policy | — |
| Country | United States |
| Region | CA |
| City | San Francisco |
| Salary Raw | — |
| Salary Min | — |
| Salary Max | — |
| Salary Currency | — |
| Salary Period | — |
| Source URL | https://jobs.ashbyhq.com/genmo/14aaaa71-7c0c-4352-9c6c-bf3f12f3aeb5 |
| Apply URL | https://jobs.ashbyhq.com/genmo/14aaaa71-7c0c-4352-9c6c-bf3f12f3aeb5/application |
| First Seen At | 2026-05-29 06:26:05Z |
| Last Seen At | 2026-06-06 09:18:57Z |
| Last Checked At | 2026-06-06 09:18:57Z |
| Last Changed At | 2026-05-29 06:26:05Z |
| Inactive At | — |
| Source Posted At | — |
| Source Updated At | — |
| Raw Payload Uri | s3://job-postings-prod-raw-590183727216/raw/provider=ashby/board=genmo/date=2026-06-06/2026-06-06T09-18-53-629Z-f1fffa425dcce07d478ecab1829269be0535db0f4fe095e2bf75f6de5c6bc50c.json |
Event Fields
{
  "content_hash": "6946a3254113a79d8c149ed74caa2c48be65bd2d9e3acf9814297169e69f00e7",
  "source_hash": "41b66c75d75d947fe2e2c5d8657c4c2b945d95849b627bccf6241d0fec055f50",
  "last_changed_at": "2026-05-29T06:26:05.003Z",
  "active_status": "active"
}
Parsed Structured
{
  "language": "en",
  "location": {
    "raw": "San Francisco HQ",
    "city": "San Francisco",
    "region": "CA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.75
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-06T09:18:57.188Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "San Francisco HQ",
      "city": "San Francisco",
      "region": "CA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.75
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": null,
  "salary_period": null,
  "workplace_type": null,
  "salary_currency": null
}
Extensions
{}
Native Structured
{
  "id": "14aaaa71-7c0c-4352-9c6c-bf3f12f3aeb5",
  "team": "Engineering",
  "title": "GPU Performance Engineer",
  "jobUrl": "https://jobs.ashbyhq.com/genmo/14aaaa71-7c0c-4352-9c6c-bf3f12f3aeb5",
  "address": null,
  "applyUrl": "https://jobs.ashbyhq.com/genmo/14aaaa71-7c0c-4352-9c6c-bf3f12f3aeb5/application",
  "isListed": true,
  "isRemote": false,
  "location": "San Francisco HQ",
  "updatedAt": null,
  "apiVersion": "ashby-non-user-graphql-v1",
  "department": "Engineering",
  "publishedAt": null,
  "workplaceType": null,
  "employmentType": "FullTime",
  "secondaryLocations": []
}
Get this page with API
Rendered from the bluedoor Job Postings API. Reproduce it:
GET https://api.bluedoor.sh/job-postings/v1/jobs/f85b5306fc5e5921b1e75362f7fff5b9cd354c2e?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/0b5883cc-6283-4b7e-96e3-13f845f36c2f
GET https://api.bluedoor.sh/job-postings/v1/sources/95f01483-808c-427c-b63f-d8f0de3f65ee
GET https://api.bluedoor.sh/job-postings/v1/jobs/f85b5306fc5e5921b1e75362f7fff5b9cd354c2e/events
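A minimal Python sketch of how the four requests above could be assembled from the Job ID, Org ID, and Source ID in the full job record. The helper names `build_urls` and `fetch_json` are hypothetical, and the sketch assumes the endpoints return JSON without an authentication header, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Base path taken from the request URLs listed on this page.
BASE_URL = "https://api.bluedoor.sh/job-postings/v1"


def build_urls(job_id, org_id, source_id):
    """Return the four request URLs for one posting, keyed by a short label."""
    query = urlencode({"include": "description"})
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?{query}",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


def fetch_json(url, timeout=10.0):
    """GET a URL and decode the body as JSON.

    Assumption: no API key or auth header is required; add one here if the
    service turns out to need it.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# IDs copied from the full job record on this page.
urls = build_urls(
    job_id="f85b5306fc5e5921b1e75362f7fff5b9cd354c2e",
    org_id="0b5883cc-6283-4b7e-96e3-13f845f36c2f",
    source_id="95f01483-808c-427c-b63f-d8f0de3f65ee",
)
```

Calling `fetch_json(urls["job"])` would then issue the first request; the shape of the response is not documented on this page, so inspect it before relying on specific keys.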