Member of Technical Staff - Edge Inference Engineer

Liquid AI · San Francisco · Hybrid · Active · Ashby

Job facts

Field	Value
Company	Liquid AI
Title	Member of Technical Staff - Edge Inference Engineer
Normalized title	-
Department / team	Research & Engineering / Research & Engineering
Location	San Francisco, CA, United States
Work model	Hybrid / Hybrid
Employment type	Full Time
Salary	-
Status	active
ATS provider	Ashby
Posted / first seen	— / 2026-05-29
Changed / last seen	2026-05-29 / 2026-06-06

Linked records

Company	Liquid AI
Source	742a7b52-7fdb-4b2a-9162-251683c8ccc0
ATS provider	Ashby

Description

About Liquid AI

Spun out of MIT CSAIL, we build general-purpose AI systems that run efficiently across deployment targets, from data center accelerators to on-device hardware, ensuring low latency, minimal memory usage, privacy, and reliability. We partner with enterprises across consumer electronics, automotive, life sciences, and financial services. We are scaling rapidly and need exceptional people to help us get there.

The Opportunity

Our Edge Inference team compiles Liquid Foundation Models into optimized machine code that runs on resource-constrained devices: phones, laptops, Raspberry Pis, and watches. We are core contributors to llama.cpp and build the infrastructure that makes efficient on-device AI possible.

You will work directly with the technical lead on problems that require deep understanding of both ML architectures and hardware constraints. This is high-ownership work where your code ships to production and directly impacts model performance on real devices.

While San Francisco and Boston are preferred, we are open to other locations.

What We're Looking For

We need someone who:

Works autonomously: Given a target device and performance goal, you figure out how to get there without hand-holding. You diagnose bottlenecks, prototype solutions, and iterate until you hit the target.
Thinks at the hardware level: You understand cache hierarchies, memory access patterns, and instruction-level optimization. You can reason about why code is slow before reaching for a profiler.
Bridges ML and systems: You understand how neural networks work mathematically (matrix operations, attention mechanisms, quantization effects) and can translate that understanding into optimized implementations.
Ships production code: Our work goes upstream to open-source projects and deploys to customer devices. You write code that others can maintain and extend.

The Work

Implement and optimize inference kernels for CPU, NPU, and GPU architectures across diverse edge hardware
Develop quantization strategies (INT4, INT8, FP8) that maximize compression while preserving model quality under strict memory budgets
Contribute to llama.cpp and other open-source inference frameworks, including new model architectures (audio, vision)
Profile and optimize end-to-end inference pipelines to achieve sub-100ms time-to-first-token on target devices
Collaborate with ML researchers to understand model architectures and identify optimization opportunities specific to Liquid Foundation Models

Desired Experience

Must-have:

5+ years of experience in systems programming with strong C++ proficiency
Embedded software engineering experience or work on resource-constrained systems
Understanding of ML fundamentals at the linear algebra level (how matrix operations, attention, and quantization work)
Experience with hardware architecture concepts: cache hierarchies, memory bandwidth, SIMD/vectorization

Nice-to-have:

Contributions to llama.cpp, ExecuTorch, or similar inference frameworks
Experience with Rust for systems programming
Background in custom accelerator development (TPU, NPU) or work at companies like SambaNova, Cerebras, Groq, or Google/Amazon accelerator teams
Quantitative degree (mathematics, physics, or similar) combined with engineering experience

What Success Looks Like (Year One)

Ship optimizations that achieve measurable latency or memory improvements on at least one target edge device class
Successfully upstream at least one significant contribution to llama.cpp (new architecture support, kernel optimization, or quantization improvement)
Own a major workstream end-to-end, such as new model architecture support, quantization pipeline for a device constraint, or target platform enablement

What We Offer

Rare technical challenges: Work on novel model architectures that require custom optimization strategies. Your code ships to production and runs on real devices.
Compensation: Competitive base salary with equity in a unicorn-stage company
Health: We pay 100% of medical, dental, and vision premiums for employees and dependents
Financial: 401(k) matching up to 4% of base pay
Time Off: Unlimited PTO plus company-wide Refill Days throughout the year

Full job record

Job ID	b1aed3de7d18230d581a54c102c9d26b8326a1a0
Org ID	8e1f31f3-2052-48e9-ae14-b36a9ec2a6dd
Source ID	742a7b52-7fdb-4b2a-9162-251683c8ccc0
Board ID	742a7b52-7fdb-4b2a-9162-251683c8ccc0
Provider	ashby
Provider Job Key	1ed0e32c-11f4-4f93-bfab-bdfac37f0b1b
Title	Member of Technical Staff - Edge Inference Engineer
Normalized Title	—
Status	active
Active	yes
Location Text	San Francisco
Department	Research & Engineering
Team	Research & Engineering
Employment Type	full_time
Workplace Type	hybrid
Remote Policy	hybrid
Country	United States
Region	CA
City	San Francisco
Salary Raw	—
Salary Min	—
Salary Max	—
Salary Currency	—
Salary Period	—
Source URL	https://jobs.ashbyhq.com/liquid-ai/1ed0e32c-11f4-4f93-bfab-bdfac37f0b1b
Apply URL	https://jobs.ashbyhq.com/liquid-ai/1ed0e32c-11f4-4f93-bfab-bdfac37f0b1b/application
First Seen At	2026-05-29 06:16:09Z
Last Seen At	2026-06-06 09:15:31Z
Last Checked At	2026-06-06 09:15:31Z
Last Changed At	2026-05-29 06:16:09Z
Inactive At	—
Source Posted At	—
Source Updated At	—
Raw Payload Uri	s3://job-postings-prod-raw-590183727216/raw/provider=ashby/board=liquid-ai/date=2026-06-06/2026-06-06T09-15-21-849Z-b5fc798149de9351214373470cfd157c647e407a6863d96db62ef3ef57fc83e6.json

Event Fields

{
  "content_hash": "d6661f7823241b7a69fed7f9fa37d9e10cc89597a6000d14847adb9026016023",
  "source_hash": "a9a0eb631aaa76ab6759dd829059e91c3d723dedefc84039162cc919eb12314f",
  "last_changed_at": "2026-05-29T06:16:09.429Z",
  "active_status": "active"
}

Parsed Structured

{
  "language": "en",
  "location": {
    "raw": "San Francisco",
    "city": "San Francisco",
    "region": "CA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.75
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-06T09:15:31.124Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "San Francisco",
      "city": "San Francisco",
      "region": "CA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.75
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": "hybrid",
  "salary_period": null,
  "workplace_type": "hybrid",
  "salary_currency": null
}

Extensions

{}

Native Structured

{
  "id": "1ed0e32c-11f4-4f93-bfab-bdfac37f0b1b",
  "team": "Research & Engineering",
  "title": "Member of Technical Staff - Edge Inference Engineer",
  "jobUrl": "https://jobs.ashbyhq.com/liquid-ai/1ed0e32c-11f4-4f93-bfab-bdfac37f0b1b",
  "address": null,
  "applyUrl": "https://jobs.ashbyhq.com/liquid-ai/1ed0e32c-11f4-4f93-bfab-bdfac37f0b1b/application",
  "isListed": true,
  "isRemote": false,
  "location": "San Francisco",
  "updatedAt": null,
  "apiVersion": "ashby-non-user-graphql-v1",
  "department": "Research & Engineering",
  "publishedAt": null,
  "workplaceType": "Hybrid",
  "employmentType": "FullTime",
  "secondaryLocations": [
    {
      "location": "Boston"
    },
    {
      "location": "Remote"
    }
  ]
}

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/b1aed3de7d18230d581a54c102c9d26b8326a1a0?include=description

GET https://api.bluedoor.sh/job-postings/v1/orgs/8e1f31f3-2052-48e9-ae14-b36a9ec2a6dd

GET https://api.bluedoor.sh/job-postings/v1/sources/742a7b52-7fdb-4b2a-9162-251683c8ccc0

GET https://api.bluedoor.sh/job-postings/v1/jobs/b1aed3de7d18230d581a54c102c9d26b8326a1a0/events
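
The four requests above follow one pattern, so they can be generated from the IDs in the full job record. The following is a minimal sketch, not a client for the API: the function names are hypothetical, it assumes the query parameter is `include=description` and that responses are JSON, and it leaves out authentication because this page does not show how the API key is passed.

```python
from urllib.parse import urlencode

# Base path taken from the request lines on this page.
BASE_URL = "https://api.bluedoor.sh/job-postings/v1"


def job_url(job_id, include=None):
    """URL for a single job record; `include` adds optional fields."""
    url = f"{BASE_URL}/jobs/{job_id}"
    if include:
        url += "?" + urlencode({"include": include})
    return url


def org_url(org_id):
    """URL for the organization that owns the posting."""
    return f"{BASE_URL}/orgs/{org_id}"


def source_url(source_id):
    """URL for the source (board) the posting was collected from."""
    return f"{BASE_URL}/sources/{source_id}"


def job_events_url(job_id):
    """URL for the lifecycle events of a job."""
    return f"{BASE_URL}/jobs/{job_id}/events"


def page_requests(job_id, org_id, source_id):
    """Return the four GET URLs used to render a job page, in page order."""
    return [
        job_url(job_id, include="description"),
        org_url(org_id),
        source_url(source_id),
        job_events_url(job_id),
    ]


if __name__ == "__main__":
    # IDs copied from the "Full job record" section.
    for url in page_requests(
        "b1aed3de7d18230d581a54c102c9d26b8326a1a0",
        "8e1f31f3-2052-48e9-ae14-b36a9ec2a6dd",
        "742a7b52-7fdb-4b2a-9162-251683c8ccc0",
    ):
        print("GET", url)
```

The sketch only builds URLs; sending them requires an HTTP client and whatever credential scheme the API documentation specifies.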