Staff ML Systems Engineer, Distributed Systems

Field Ai · Seattle, WA · On Site · Active · $170,000–$200,000 / year · Lever

Job facts

Field	Value
Company	Field Ai
Title	Staff ML Systems Engineer, Distributed Systems
Normalized title	-
Department / team	Engineering / Systems Engineering
Location	Seattle, WA, United States
Work model	On Site
Employment type	Full Time
Salary	$170,000–$200,000 / year
Status	active
ATS provider	Lever
Posted / first seen	2026-05-29 / 2026-05-30
Changed / last seen	2026-05-30 / 2026-06-06

Linked records

Company	Field Ai
Source	00217bf9-4e5d-4daa-a3cd-2e1693bf435f
ATS provider	Lever

Description

FieldAI’s Irvine team is where embodied AI meets real robots, real sensors, and real field deployments. Based in the heart of Southern California’s robotics ecosystem, we build risk-aware, reliable, field-ready AI systems that solve the hardest problems in robotics and unlock the full potential of embodied intelligence. If you want your work to ship, get tested on hardware, and improve through real deployments, Irvine is the place. We go beyond typical data-driven approaches or pure transformer-only architectures, combining rigorous engineering with learning systems proven in globally deployed solutions that deliver results today and get better every time our robots run in the field.

We are seeking a Senior / Staff ML Systems Engineer to architect and build the distributed infrastructure that powers large-scale machine learning workflows across the organization. This role sits at the intersection of machine learning, distributed systems, and platform engineering. You will be responsible for designing scalable systems that support data processing, model training, evaluation, and post-processing pipelines while enabling ML teams to efficiently develop, operate, and scale production-grade workflows. You will play a critical role in defining the architectural patterns, tooling, and infrastructure that underpin our machine learning platform.

Our salary range is generous and we consider each individual’s background and experience when determining final compensation. Base pay may vary based on role scope, job-related knowledge, skills, experience, and the Irvine, California market.

Why Join FieldAI in Irvine?

In Irvine, you will work where the robots are. Our local team builds and tests systems on real hardware with real sensors, then ships them to operate in unstructured, previously unknown environments around the world. We are solving one of robotics’ hardest challenges: reliable deployment outside the lab.

Our Field Foundational Models™ raise the bar for perception, planning, localization, and manipulation, with an emphasis on explainability and safety for real-world use. You will collaborate with a world-class team that thrives on creativity, resilience, and bold thinking. We bring deep experience from organizations such as DeepMind, NASA JPL, Boston Dynamics, NVIDIA, Amazon, Tesla Autopilot, Cruise, Zoox, Toyota Research Institute, and SpaceX, along with a track record of field deployments and strong performance in DARPA challenge segments.

Be Part of the Next Robotics Revolution

We are looking for builders who want their work to leave the whiteboard and show up on robots. If you enjoy tackling tough, uncharted questions and working across disciplines, you will find your people here. Our teams span AI, software, robotics engineering, product, field deployment, and technical communication, all focused on shipping systems that perform in the real world. Our headquarters is in Irvine, and we partner closely with teams there as well as colleagues across the US and around the world. Join us in Southern California and help define what dependable, field-ready autonomy looks like.

We value diverse perspectives and are committed to fostering an inclusive workplace. We evaluate candidates and employees based on merit, qualifications, and performance, and we do not discriminate on the basis of race, color, gender, national origin, ethnicity, veteran status, disability status, age, sexual orientation, gender identity, marital status, or any other legally protected status.

What You'll Get To Do

Design and build scalable distributed machine learning pipelines across data processing, model training, evaluation, and post-processing workflows.
Architect distributed execution systems, including parallelization strategies, workload scheduling, resource allocation, and fault tolerance mechanisms.
Develop reusable abstractions, frameworks, and libraries that simplify distributed pipeline development.
Optimize performance across distributed CPU and GPU environments, improving throughput, utilization, and reliability.
Design systems that effectively manage data partitioning, memory utilization, serialization overhead, and compute efficiency.
Partner closely with ML engineers, data engineers, and infrastructure teams to productionize research workflows and enable large-scale model development.
Establish best practices and engineering standards for distributed machine learning infrastructure.
Evaluate and guide decisions around distributed computing frameworks, infrastructure technologies, and system design trade-offs.
Improve observability, debugging, monitoring, and operational tooling for distributed systems at scale.

What You Have

5+ years of experience building distributed systems, backend infrastructure, machine learning platforms, or large-scale data processing systems.
Strong Python programming skills, including experience with concurrency, performance optimization, and systems development.
Experience with distributed computing frameworks such as Ray, Spark, Dask, Flink, or similar technologies.
Experience designing and scaling data pipelines or machine learning workflows.
Strong system design skills with demonstrated expertise in scalability, reliability, and performance optimization.
Experience diagnosing and resolving bottlenecks in distributed environments.
Ability to work cross-functionally and drive technical decisions across multiple teams.

The Extras That Set You Apart

Experience building infrastructure for machine learning training and inference systems.
Familiarity with modern ML frameworks such as PyTorch or TensorFlow.
Experience with multi-node or multi-GPU training architectures, including DDP, FSDP, DeepSpeed, or similar technologies.
Experience operating Kubernetes-based infrastructure and large-scale cloud systems.
Deep understanding of distributed systems concepts including data locality, serialization costs, scheduling, and resource management.
Experience with distributed debugging, observability, and workflow orchestration platforms.
Proven ability to establish technical direction and influence architecture across organizations.

Full job record

Job ID	b97c4a7bf4ad8e1365b612085a432bd455eef502
Org ID	8c469bee-2525-4c11-a8b5-16134aeb740d
Source ID	00217bf9-4e5d-4daa-a3cd-2e1693bf435f
Board ID	00217bf9-4e5d-4daa-a3cd-2e1693bf435f
Provider	lever
Provider Job Key	b25116fc-e172-4016-80ad-3b2ce10da08b
Title	Staff ML Systems Engineer, Distributed Systems
Normalized Title	—
Status	active
Active	yes
Location Text	Seattle, WA
Department	Engineering
Team	Systems Engineering
Employment Type	Full time
Workplace Type	on_site
Remote Policy	—
Country	United States
Region	WA
City	Seattle
Salary Raw	USD 170000-200000 per-year-salary
Salary Min	170,000
Salary Max	200,000
Salary Currency	USD
Salary Period	year
Source URL	https://jobs.lever.co/field-ai/b25116fc-e172-4016-80ad-3b2ce10da08b
Apply URL	https://jobs.lever.co/field-ai/b25116fc-e172-4016-80ad-3b2ce10da08b/apply
First Seen At	2026-05-30 07:34:55Z
Last Seen At	2026-06-06 18:40:26Z
Last Checked At	2026-06-06 18:40:26Z
Last Changed At	2026-05-30 07:34:55Z
Inactive At	—
Source Posted At	2026-05-29 18:45:15Z
Source Updated At	—
Raw Payload Uri	s3://job-postings-prod-raw-590183727216/raw/provider=lever/board=field-ai/date=2026-06-06/2026-06-06T18-40-25-026Z-a3498220a9243b523423282565785e22f4f036ec00b36ba5a365a2818919abd4.json
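
The Salary Raw string in this record appears to carry the same information as the Salary Min, Salary Max, Salary Currency, and Salary Period fields. A minimal sketch of that mapping follows; the layout "<CURRENCY> <MIN>-<MAX> per-<PERIOD>-salary" is inferred from this single record, and the function name `parse_salary_raw` is hypothetical, not part of any documented API.

```python
import re

# Assumed layout, inferred from one record only:
#   "<CURRENCY> <MIN>-<MAX> per-<PERIOD>-salary"
_SALARY_RAW = re.compile(r"([A-Z]{3}) (\d+)-(\d+) per-(\w+)-salary")


def parse_salary_raw(raw: str) -> dict:
    """Split a Salary Raw string into the four derived salary fields."""
    match = _SALARY_RAW.fullmatch(raw)
    if match is None:
        # Other layouts may exist; this sketch does not try to guess them.
        raise ValueError(f"unrecognized salary string: {raw!r}")
    currency, low, high, period = match.groups()
    return {
        "salary_min": int(low),
        "salary_max": int(high),
        "salary_currency": currency,
        "salary_period": period,
    }


print(parse_salary_raw("USD 170000-200000 per-year-salary"))
```

Strings that do not match the assumed layout raise an error rather than yielding partial values.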

Event Fields

{
  "content_hash": "8f1eefcd9496c41b3fef02e4fa4f3bc1da1fae18e21579d5d952333ff6e8d98c",
  "source_hash": "4552f5bf6c5ffede5107f86a8b9269fc8a2e24704b8032406df5815fa1b1642a",
  "last_changed_at": "2026-05-30T07:34:55.874Z",
  "active_status": "active"
}

Parsed Structured

{
  "language": "en",
  "location": {
    "raw": "Seattle, WA",
    "city": "Seattle",
    "region": "WA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.9
  },
  "salary_max": 200000,
  "salary_min": 170000,
  "inferred_at": "2026-06-06T18:40:26.132Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "Seattle, WA",
      "city": "Seattle",
      "region": "WA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.9
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": null,
  "salary_period": "year",
  "workplace_type": "on_site",
  "salary_currency": "USD"
}

Extensions

{}

Native Structured

{
  "lists": [
    {
      "text": "What You'll Get To Do",
      "content": "\n<li>Design and build scalable distributed machine learning pipelines across data processing, model training, evaluation, and post-processing workflows.</li>\n<li>Architect distributed execution systems, including parallelization strategies, workload scheduling, resource allocation, and fault tolerance mechanisms.</li>\n<li>Develop reusable abstractions, frameworks, and libraries that simplify distributed pipeline development.</li>\n<li>Optimize performance across distributed CPU and GPU environments, improving throughput, utilization, and reliability.</li>\n<li>Design systems that effectively manage data partitioning, memory utilization, serialization overhead, and compute efficiency.</li>\n<li>Partner closely with ML engineers, data engineers, and infrastructure teams to productionize research workflows and enable large-scale model development.</li>\n<li>Establish best practices and engineering standards for distributed machine learning infrastructure.</li>\n<li>Evaluate and guide decisions around distributed computing frameworks, infrastructure technologies, and system design trade-offs.</li>\n<li>Improve observability, debugging, monitoring, and operational tooling for distributed systems at scale.</li>\n"
    },
    {
      "text": "What You Have",
      "content": "\n<li>5+ years of experience building distributed systems, backend infrastructure, machine learning platforms, or large-scale data processing systems.</li>\n<li>Strong Python programming skills, including experience with concurrency, performance optimization, and systems development.</li>\n<li>Experience with distributed computing frameworks such as Ray, Spark, Dask, Flink, or similar technologies.</li>\n<li>Experience designing and scaling data pipelines or machine learning workflows.</li>\n<li>Strong system design skills with demonstrated expertise in scalability, reliability, and performance optimization.</li>\n<li>Experience diagnosing and resolving bottlenecks in distributed environments.</li>\n<li>Ability to work cross-functionally and drive technical decisions across multiple teams.</li>\n"
    },
    {
      "text": "The Extras That Set You Apart",
      "content": "\n<li>Experience building infrastructure for machine learning training and inference systems.</li>\n<li>Familiarity with modern ML frameworks such as PyTorch or TensorFlow.</li>\n<li>Experience with multi-node or multi-GPU training architectures, including DDP, FSDP, DeepSpeed, or similar technologies.</li>\n<li>Experience operating Kubernetes-based infrastructure and large-scale cloud systems.</li>\n<li>Deep understanding of distributed systems concepts including data locality, serialization costs, scheduling, and resource management.</li>\n<li>Experience with distributed debugging, observability, and workflow orchestration platforms.</li>\n<li>Proven ability to establish technical direction and influence architecture across organizations.</li>\n"
    }
  ],
  "country": "US",
  "createdAt": 1780080315619,
  "updatedAt": null,
  "categories": {
    "team": "Systems Engineering",
    "location": "Seattle, WA",
    "commitment": "Full time",
    "department": "Engineering",
    "allLocations": [
      "Seattle, WA",
      "Irvine, CA"
    ]
  },
  "salaryRange": {
    "max": 200000,
    "min": 170000,
    "currency": "USD",
    "interval": "per-year-salary"
  },
  "workplaceType": "onsite"
}
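
The createdAt value in the native payload looks like a Unix timestamp in milliseconds. Under that assumption, the short sketch below converts it to UTC, which lines up with the Source Posted At field in the full job record. The helper name `epoch_ms_to_utc` is illustrative only.

```python
from datetime import datetime, timezone


def epoch_ms_to_utc(ms: int) -> str:
    """Format a millisecond Unix timestamp as 'YYYY-MM-DD HH:MM:SSZ' in UTC."""
    # Integer division drops the millisecond part and avoids float rounding.
    moment = datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%SZ")


# createdAt from the Native Structured payload
print(epoch_ms_to_utc(1780080315619))  # 2026-05-29 18:45:15Z
```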

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/b97c4a7bf4ad8e1365b612085a432bd455eef502?include=description

GET https://api.bluedoor.sh/job-postings/v1/orgs/8c469bee-2525-4c11-a8b5-16134aeb740d

GET https://api.bluedoor.sh/job-postings/v1/sources/00217bf9-4e5d-4daa-a3cd-2e1693bf435f

GET https://api.bluedoor.sh/job-postings/v1/jobs/b97c4a7bf4ad8e1365b612085a432bd455eef502/events
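
The four requests follow one pattern built from the Job ID, Org ID, and Source ID in the full job record. The sketch below only assembles the URLs. It sends nothing, because this page does not say how requests are authenticated. The function name `record_urls` is hypothetical.

```python
BASE = "https://api.bluedoor.sh/job-postings/v1"


def record_urls(job_id: str, org_id: str, source_id: str) -> list[str]:
    """Return the four GET URLs listed for a job record, in page order."""
    return [
        f"{BASE}/jobs/{job_id}?include=description",  # job with description
        f"{BASE}/orgs/{org_id}",                      # owning organization
        f"{BASE}/sources/{source_id}",                # ATS source / board
        f"{BASE}/jobs/{job_id}/events",               # lifecycle events
    ]


# IDs copied from the Full job record on this page
urls = record_urls(
    job_id="b97c4a7bf4ad8e1365b612085a432bd455eef502",
    org_id="8c469bee-2525-4c11-a8b5-16134aeb740d",
    source_id="00217bf9-4e5d-4daa-a3cd-2e1693bf435f",
)
for url in urls:
    print("GET", url)
```

Any HTTP client can issue these requests once an API key is available; how the key is passed (header or query parameter) would need to be taken from the API documentation.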