Senior Machine Learning Engineer (Inference Platform)

Wizard · Remote - USA · Remote · Active · Greenhouse

Job facts

Field	Value
Company	Wizard
Title	Senior Machine Learning Engineer (Inference Platform)
Normalized title	-
Department / team	AI & Machine Learning
Location	United States
Work model	Remote / Remote
Employment type	-
Salary	-
Status	active
ATS provider	Greenhouse
Posted / first seen	2026-03-25 / 2026-05-29
Changed / last seen	2026-06-04 / 2026-06-06

Related slices

Page	What it contains
Company jobs	Active postings from Wizard.
Company breakdowns	Role, location, ATS, and work model facets for this company.
ATS provider jobs	Active postings observed through Greenhouse.
Provider filtered search	The same provider as a filtered job collection.
Department jobs	Active postings in AI & Machine Learning.
Work model jobs	Active Remote postings.
Lifecycle events	Open, update, close, and reopen events for this posting.
Original posting	Canonical source or apply URL captured from the ATS.

Linked records

Company	Wizard
Source	f75e55fe-59b1-47f1-b6d0-e10af602c0bf
ATS provider	Greenhouse

Description

About Wizard AI

At Wizard AI, we’re building the top-performing AI Shopping Agent that delivers the best products from across the web with unmatched accuracy, quality, and trust. Our ML models power the core of our platform, and we’re looking for a Senior Machine Learning Engineer to own how they run in production: reliably, efficiently, and at scale.

The Role

As a Senior ML Engineer on our Inference Platform, you’ll own the end-to-end lifecycle of production ML serving systems, from model packaging and deployment to monitoring, optimization, and scaling.

This is not a traditional MLOps role focused solely on pipelines and tooling. You’ll be responsible for the inference infrastructure powering a live conversational shopping agent, operating multiple specialized serving engines under real-world production load.

You’ll own critical decisions around serving architecture, performance, reliability, and scalability, working closely with ML Engineers, Data teams, Product, and DevOps to ensure models move seamlessly from experimentation into high-performance production systems.

What You'll Do

- Own and evolve our multi-engine inference platform, supporting a variety of model types and serving requirements.
- Build and improve production ML pipelines, taking models from experimentation to reliable, high-throughput serving.
- Define and implement model versioning, rollout, rollback, and lifecycle management strategies that ensure reproducibility and operational reliability.
- Define and enforce serving-layer SLAs, including latency, availability, GPU utilization, Time-to-First-Token (TTFT), and Inter-Token Latency (ITL).
- Build observability, monitoring, alerting, and operational tooling for production inference systems.
- Apply software engineering best practices, including testing, CI/CD integration, and reproducibility across ML workflows.
- Optimize inference performance through efficient resource utilization, hardware-aware serving strategies, and cost-conscious infrastructure design.
- Ensure ML serving systems are secure, scalable, and operationally resilient.
- Partner with ML, Data, Product, and DevOps teams to turn ideas into production systems, driving the technical decisions on serving and scale.

What We're Looking For

- Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related field, or equivalent practical experience.
- 5–8+ years of experience in Software Engineering, ML Engineering, Platform Engineering, or Infrastructure Engineering, with direct ownership of production ML serving systems.
- Hands-on experience running an LLM serving engine (vLLM, TGI, TensorRT-LLM, or SGLang) in production under real load, not just managed or hosted endpoints.
- Strong Python skills and software engineering fundamentals, combined with deep systems and infrastructure knowledge.
- Experience with cloud platforms such as AWS, GCP, or Azure, and familiarity with ML lifecycle tooling, experimentation platforms, and model registries.
- Strong grasp of inference performance (continuous batching, KV-cache and GPU-memory behavior, quantization, and CPU-versus-GPU bottlenecks), with the instinct to profile before tuning.
- Experience serving heterogeneous workloads, including LLMs, embedding models, and extraction models, each with distinct latency, throughput, and scaling requirements.
- Demonstrated ability to balance latency, throughput, reliability, and infrastructure cost while operating production-scale ML systems.
- Experience in high-growth startup environments and comfort operating in fast-moving, evolving technical landscapes.

What Success Looks Like

Reliable, Scalable Inference Systems

Production serving infrastructure operates with clear SLAs, strong observability, and minimal downtime. Latency, availability, throughput, and GPU utilization are actively measured and optimized as platform demands grow.

End-to-End Ownership

You own the complete serving lifecycle, from deployment and release management through monitoring, optimization, and scaling, enabling ML engineers to ship quickly while maintaining reliability and reproducibility.

Technical Leadership and Impact

You shape the future of Wizard's inference platform, driving key architectural decisions that improve performance, reduce infrastructure costs, and support the next generation of AI-powered shopping experiences.
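
The description names Time-to-First-Token (TTFT) and Inter-Token Latency (ITL) as serving-layer SLA metrics without defining them. A minimal sketch of how these are commonly computed from per-token arrival times follows. The function name `ttft_and_itl` and the sample timestamps are hypothetical and do not come from the posting or from Wizard's systems.

```python
from statistics import mean


def ttft_and_itl(request_sent_at, token_times):
    """Return (TTFT, mean ITL) in seconds for one streamed response.

    request_sent_at: time the request was issued.
    token_times: arrival time of each generated token, in order.
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")
    # TTFT: delay between sending the request and receiving the first token.
    ttft = token_times[0] - request_sent_at
    # ITL: average gap between consecutive tokens after the first.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    itl = mean(gaps) if gaps else 0.0
    return ttft, itl


# Illustrative timestamps only (seconds since the request was sent).
ttft, itl = ttft_and_itl(0.0, [0.5, 0.75, 1.0, 1.25])
print(ttft, itl)
```

In practice an SLA would usually be stated over a percentile of these values across many requests rather than over a single response.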

Full job record

Job ID	9d194732d3454c9b83768d3df0d0365c596a8b73
Org ID	a2329884-8c27-4643-9928-6b675096f8ae
Source ID	f75e55fe-59b1-47f1-b6d0-e10af602c0bf
Board ID	f75e55fe-59b1-47f1-b6d0-e10af602c0bf
Provider	greenhouse
Provider Job Key	5837279004
Title	Senior Machine Learning Engineer (Inference Platform)
Normalized Title	—
Status	active
Active	yes
Location Text	Remote - USA
Department	AI & Machine Learning
Team	—
Employment Type	—
Workplace Type	remote
Remote Policy	remote
Country	United States
Region	—
City	—
Salary Raw	—
Salary Min	—
Salary Max	—
Salary Currency	—
Salary Period	—
Source URL	https://job-boards.greenhouse.io/wizardcommerce/jobs/5837279004
Apply URL	https://job-boards.greenhouse.io/wizardcommerce/jobs/5837279004
First Seen At	2026-05-29 22:42:34Z
Last Seen At	2026-06-06 07:36:01Z
Last Checked At	2026-06-06 07:36:01Z
Last Changed At	2026-06-04 11:17:46Z
Inactive At	—
Source Posted At	2026-03-25 19:20:51Z
Source Updated At	2026-06-03 14:43:17Z
Raw Payload Uri	s3://job-postings-prod-raw-590183727216/raw/provider=greenhouse/board=wizardcommerce/date=2026-06-06/2026-06-06T07-36-01-410Z-431ef4b4ea051976cd84214b62b933f17b60e410535e181c1811206ec8654ac4.json
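
The Raw Payload Uri above appears to use `key=value` path segments (provider, board, date). As a rough sketch, assuming that layout holds, the segments can be pulled apart with the standard library. The helper name `parse_payload_uri` is hypothetical.

```python
from urllib.parse import urlparse


def parse_payload_uri(uri):
    """Split an s3:// URI into bucket, key=value path segments, and filename."""
    parsed = urlparse(uri)
    parts = parsed.path.lstrip("/").split("/")
    # Keep only segments shaped like key=value (an assumption about the layout).
    partitions = dict(p.split("=", 1) for p in parts if "=" in p)
    return {"bucket": parsed.netloc, "partitions": partitions, "filename": parts[-1]}


# URI copied from the Raw Payload Uri field of this record.
uri = (
    "s3://job-postings-prod-raw-590183727216/raw/provider=greenhouse/"
    "board=wizardcommerce/date=2026-06-06/"
    "2026-06-06T07-36-01-410Z-"
    "431ef4b4ea051976cd84214b62b933f17b60e410535e181c1811206ec8654ac4.json"
)
info = parse_payload_uri(uri)
print(info["bucket"], info["partitions"])
```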

Event Fields

{
  "content_hash": "29aa0d7f56324ecbfbf6dd3249840faca33c8b062a38bc2d608524977dec9960",
  "source_hash": "b3201fd01f0d9f130383c409a000417a79f67fc9ec43fac5e4cf1e8829327d67",
  "last_changed_at": "2026-06-04T11:17:46.867Z",
  "active_status": "active"
}

Parsed Structured

{
  "language": "en",
  "location": {
    "raw": "Remote - USA",
    "city": null,
    "region": null,
    "country": "United States",
    "is_remote": true,
    "confidence": 0.95
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-06T07:36:01.502Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "Remote - USA",
      "city": null,
      "region": null,
      "country": "United States",
      "is_remote": true,
      "confidence": 0.95
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": "remote",
  "salary_period": null,
  "workplace_type": "remote",
  "salary_currency": null
}

Extensions

{}

Native Structured

{
  "title": "Senior Machine Learning Engineer (Inference Platform)",
  "offices": [
    {
      "id": 4026552004,
      "name": "Remote",
      "location": "Remote",
      "child_ids": [],
      "parent_id": null
    }
  ],
  "language": "en",
  "location": {
    "name": "Remote - USA"
  },
  "metadata": [],
  "updated_at": "2026-06-03T10:43:17-04:00",
  "departments": [
    {
      "id": 4043640004,
      "name": "AI & Machine Learning",
      "child_ids": [],
      "parent_id": 4043638004
    }
  ],
  "company_name": "Wizard",
  "requisition_id": 5068710004,
  "first_published": "2026-03-25T15:20:51-04:00",
  "application_deadline": null
}
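
The native Greenhouse fields carry a -04:00 offset, while the Source Posted At and Source Updated At fields in the full job record are in UTC. A small sketch of that conversion, using only the standard library, shows how the two sets of values line up. The helper name `to_utc_z` is hypothetical.

```python
from datetime import datetime, timezone


def to_utc_z(iso_with_offset):
    """Convert an ISO 8601 timestamp with a UTC offset to 'YYYY-MM-DD HH:MM:SSZ'."""
    dt = datetime.fromisoformat(iso_with_offset)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%SZ")


print(to_utc_z("2026-06-03T10:43:17-04:00"))  # updated_at      -> 2026-06-03 14:43:17Z
print(to_utc_z("2026-03-25T15:20:51-04:00"))  # first_published -> 2026-03-25 19:20:51Z
```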

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/9d194732d3454c9b83768d3df0d0365c596a8b73?include=description

GET https://api.bluedoor.sh/job-postings/v1/orgs/a2329884-8c27-4643-9928-6b675096f8ae

GET https://api.bluedoor.sh/job-postings/v1/sources/f75e55fe-59b1-47f1-b6d0-e10af602c0bf

GET https://api.bluedoor.sh/job-postings/v1/jobs/9d194732d3454c9b83768d3df0d0365c596a8b73/events
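
The four requests can be assembled from the Job ID, Org ID, and Source ID in the full job record. The sketch below only builds the URLs and treats any trailing "JSON" on the lines above as a response-format label rather than part of the path. The helper name `endpoint_urls` is hypothetical, and authentication is left out because this page does not show how an API key is passed.

```python
from urllib.parse import urlencode

# Base path taken from the GET lines on this page.
BASE = "https://api.bluedoor.sh/job-postings/v1"


def endpoint_urls(job_id, org_id, source_id):
    """Build the job, org, source, and events URLs for one posting."""
    return {
        "job": f"{BASE}/jobs/{job_id}?" + urlencode({"include": "description"}),
        "org": f"{BASE}/orgs/{org_id}",
        "source": f"{BASE}/sources/{source_id}",
        "events": f"{BASE}/jobs/{job_id}/events",
    }


urls = endpoint_urls(
    job_id="9d194732d3454c9b83768d3df0d0365c596a8b73",
    org_id="a2329884-8c27-4643-9928-6b675096f8ae",
    source_id="f75e55fe-59b1-47f1-b6d0-e10af602c0bf",
)
for name, url in urls.items():
    print(name, url)
```

Sending the requests would additionally require whatever credential scheme the API documentation specifies.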
