Senior AI Engineer, Quality

Fieldguide · San Francisco, CA or Remote (USA) · Hybrid · Active · Ashby

Job facts

Field	Value
Company	Fieldguide
Title	Senior AI Engineer, Quality
Normalized title	-
Department / team	Engineering, Product, and Design / Engineering, Product, and Design, Engineering
Location	San Francisco, CA, United States
Work model	Hybrid / Hybrid
Employment type	Full Time
Salary	-
Status	active
ATS provider	Ashby
Posted / first seen	— / 2026-05-29
Changed / last seen	2026-05-29 / 2026-06-06

Linked records

Company	Fieldguide
Source	a73ca118-e5e6-4cde-81e2-9b30cfd2b72d
ATS provider	Ashby

Description

About Us

Fieldguide is establishing a new state of trust for global commerce and capital markets through automating and streamlining the work of assurance and audit practitioners specifically within cybersecurity, privacy, and financial audit. Put simply, we build software for the people who enable trust between businesses.

We're based in San Francisco, CA, but built as a remote-first company that enables you to do your best work from anywhere. We're backed by top investors including Growth Equity at Goldman Sachs Alternatives, Bessemer Venture Partners, 8VC, Floodgate, Y Combinator, DNX Ventures, Global Founders Capital, Justin Kan, Elad Gil, and more.

We value diversity, in backgrounds and in experiences. We need people from all backgrounds and walks of life to help build the future of audit and advisory. Fieldguide's team is inclusive, driven, humble and supportive. We are deliberate and self-reflective about the kind of team and culture that we are building, seeking teammates that are not only strong in their own aptitudes but care deeply about supporting each other's growth.

As an early stage start-up employee, you'll have the opportunity to build out the future of business trust. We make audit practitioners' lives easier by bringing together up to 50% of their work and giving them better work-life balance. If you share our values and enthusiasm for building a great culture and product, you will find a home at Fieldguide.

About the Role

Fieldguide is building AI agents for the most complex audit and advisory workflows. We're a San Francisco-based Vertical AI company building in a $100B+ market undergoing rapid transformation. Over 50 of the top 100 accounting and consulting firms trust us to power their most mission-critical work. We're backed by Bessemer Venture Partners, 8VC, Floodgate, Y Combinator, Elad Gil, and other top-tier investors.

As a Senior AI Engineer, Quality, you will own the evaluation infrastructure that ensures our AI agents perform reliably at enterprise scale. This role is 100% focused on making evaluations a first-class engineering capability: building the unified platform, automated pipelines, and production feedback loops that let us evaluate any new model against all critical workflows within hours. You'll work at the intersection of ML engineering, observability, and quality assurance to ensure our agents meet the rigorous standards our customers demand.

We're hiring across all levels. We'll calibrate seniority during interviews based on your background and what you're looking to own.

This role is for engineers who value in-person collaboration at our San Francisco, CA office.

What You'll Own

Measurable AI Agents
- Design and build a unified evaluation platform that serves as the single source of truth for all of our agentic systems and audit workflows
- Build observability systems that surface agent behavior, trace execution, and failure modes in production, and feedback loops that turn production failures into first-class evaluation cases
- Own the evaluation infrastructure stack, including integration with LangSmith and LangGraph
- Translate customer problems into concrete agent behaviors and workflows
- Integrate and orchestrate LLMs, tools, retrieval systems, and logic into cohesive, reliable agent experiences

Rapid Model Evaluation
- Build automated pipelines that evaluate new models against all critical workflows within hours of release
- Design evaluation harnesses for our most complex agentic systems and workflows
- Implement comparison frameworks that measure effectiveness, consistency, latency, and cost across model versions
- Design guardrails and monitoring systems that catch quality regressions before they reach customers

AI-Native Engineering Execution
- Use AI as core leverage in how you design, build, test, and iterate
- Prototype quickly to resolve uncertainty, then harden systems for enterprise-grade reliability
- Build evaluations, feedback mechanisms, and guardrails so agents improve over time
- Work with SMEs and ML Engineers to create evaluation datasets by curating production traces
- Design prompts, retrieval pipelines, and agent orchestration systems that perform reliably at scale

Ownership of Quality and Large Product Areas
- Define and document evaluation standards, best practices, and processes for the engineering organization
- Advocate for evaluation-driven development and make it easy for the team to write and run evals
- Partner with product and ML engineers to integrate evaluation requirements into agent development from day one
- Take full ownership of large product areas rather than executing on narrow tasks

Who You Are

You are an engineer who believes that evaluations are foundational to building reliable AI systems, not a nice-to-have. The following operating principles should resonate with you:
- Evaluation-first mindset: You understand that for an AI company, not being able to evaluate a new model quickly is unacceptable
- AI-native instincts: You treat LLMs, agents, and automation as fundamental building blocks and parts of the craft of engineering
- Data-driven rigor: You make decisions based on metrics and are obsessed with measuring what matters
- Production-oriented: You understand that evaluations must work on real production behavior, not just offline datasets
- Strong product judgment: You can decide what matters and why, not just how to implement it, without waiting for guidance
- Bias to building: You move fast and build working systems rather than perfect specifications

Experience

We care more about capability and trajectory than years on a resume, but most strong candidates will have:
- Multiple years of experience shipping production software in complex, real-world systems
- Experience with TypeScript, React, Python, and Postgres
- Built and deployed LLM-powered features serving production traffic
- Implemented evaluation frameworks for model outputs and agent behaviors
- Designed observability or tracing infrastructure for AI/ML systems
- Worked with vector databases, embedding models, and RAG architectures
- Experience with evaluation platforms (LangSmith, Langfuse, or similar)
- Comfort operating in ambiguity and taking responsibility for outcomes
- Deep empathy for professional-grade, mission-critical software (experience with audit and accounting workflows is not required)

What Should Excite You
- Agent reliability at enterprise scale: Building systems that professionals depend on
- Balancing automation with human oversight: Knowing when to automate and when to surface decisions to experts
- Production feedback loops: Turning real-world agent failures into systematic improvements
- Explaining AI decisions: Making all forms of AI outputs and agent reasoning transparent and trustworthy
- Evaluation for nuanced domains: Structuring data and feedback for workflows where ground truth requires expert judgment
- High-impact visibility: Your work directly enables leadership to confidently communicate AI quality to the board and customers

More about Fieldguide

Fieldguide is a values-based company. Our values are:
- Fearless: Inspire & break down seemingly impossible walls.
- Fast: Launch fast with excellence, iterate to perfection.
- Lovable: Deliver happiness & 11 star experiences.
- Owners: Execute & run the business with ownership.
- Win-win: Create mutual value & earn trust for life.
- Inclusive: Scale the best ideas with inclusive teams.

Some of our benefits include:
- Competitive compensation packages with meaningful ownership
- Flexible PTO
- 401k
- Wellness benefits, including a bundle of free therapy sessions
- Technology & Work from Home reimbursement
- Flexible work schedules

Full job record

Job ID	652333dbc5dbac68ba217d18b8a1b7b0c2b4f441
Org ID	56d4b9f8-0f99-4b81-ba78-73783d191a7c
Source ID	a73ca118-e5e6-4cde-81e2-9b30cfd2b72d
Board ID	a73ca118-e5e6-4cde-81e2-9b30cfd2b72d
Provider	ashby
Provider Job Key	f4f0aea0-826d-451f-bd17-b04772e221cc
Title	Senior AI Engineer, Quality
Normalized Title	—
Status	active
Active	yes
Location Text	San Francisco, CA or Remote (USA)
Department	Engineering, Product, and Design
Team	Engineering, Product, and Design, Engineering
Employment Type	full_time
Workplace Type	hybrid
Remote Policy	hybrid
Country	United States
Region	CA
City	San Francisco
Salary Raw	—
Salary Min	—
Salary Max	—
Salary Currency	—
Salary Period	—
Source URL	https://jobs.ashbyhq.com/fieldguide/f4f0aea0-826d-451f-bd17-b04772e221cc
Apply URL	https://jobs.ashbyhq.com/fieldguide/f4f0aea0-826d-451f-bd17-b04772e221cc/application
First Seen At	2026-05-29 06:21:06Z
Last Seen At	2026-06-06 09:25:06Z
Last Checked At	2026-06-06 09:25:06Z
Last Changed At	2026-05-29 06:21:06Z
Inactive At	—
Source Posted At	—
Source Updated At	—
Raw Payload Uri	s3://job-postings-prod-raw-590183727216/raw/provider=ashby/board=fieldguide/date=2026-06-06/2026-06-06T09-24-42-785Z-58209a16c84a062909c51d9cf62d610d2137fca7766770b94b5a73d77be1250c.json

Event Fields

{
  "content_hash": "722dceee27d2b88c9c7522b8d8c922ac1a6ff033b4dbcdeb465d25aee855fcce",
  "source_hash": "d6ca0cfd6f23a61e6d278fc9b882bc7cb94571cec34752664fbae7b8864ce267",
  "last_changed_at": "2026-05-29T06:21:06.082Z",
  "active_status": "active"
}

Parsed Structured

{
  "language": "en",
  "location": {
    "raw": "San Francisco, CA",
    "city": "San Francisco",
    "region": "CA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.9
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-06T09:25:06.761Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "San Francisco, CA",
      "city": "San Francisco",
      "region": "CA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.9
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": "hybrid",
  "salary_period": null,
  "workplace_type": "hybrid",
  "salary_currency": null
}

Extensions

{}

Native Structured

{
  "id": "f4f0aea0-826d-451f-bd17-b04772e221cc",
  "team": "Engineering, Product, and Design, Engineering",
  "title": "Senior AI Engineer, Quality",
  "jobUrl": "https://jobs.ashbyhq.com/fieldguide/f4f0aea0-826d-451f-bd17-b04772e221cc",
  "address": null,
  "applyUrl": "https://jobs.ashbyhq.com/fieldguide/f4f0aea0-826d-451f-bd17-b04772e221cc/application",
  "isListed": true,
  "isRemote": true,
  "location": "San Francisco, CA or Remote (USA)",
  "updatedAt": null,
  "apiVersion": "ashby-non-user-graphql-v1",
  "department": "Engineering, Product, and Design",
  "publishedAt": null,
  "workplaceType": "Hybrid",
  "employmentType": "FullTime",
  "secondaryLocations": []
}

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/652333dbc5dbac68ba217d18b8a1b7b0c2b4f441?include=description

GET https://api.bluedoor.sh/job-postings/v1/orgs/56d4b9f8-0f99-4b81-ba78-73783d191a7c

GET https://api.bluedoor.sh/job-postings/v1/sources/a73ca118-e5e6-4cde-81e2-9b30cfd2b72d

GET https://api.bluedoor.sh/job-postings/v1/jobs/652333dbc5dbac68ba217d18b8a1b7b0c2b4f441/events
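
The four requests above follow a regular pattern, so they can be rebuilt for any posting from its Job ID, Org ID, and Source ID. The sketch below only constructs the URLs and sends nothing. The helper name `record_urls` is hypothetical, and the job URL is assumed to take the query parameter `include=description`. Authentication is left out because this page does not say how an API key is passed.

```python
from urllib.parse import urlencode

# Base path taken from the requests listed on this page.
BASE_URL = "https://api.bluedoor.sh/job-postings/v1"


def record_urls(job_id: str, org_id: str, source_id: str) -> dict[str, str]:
    """Build the four GET URLs for one posting (hypothetical helper)."""
    # Assumption: the job endpoint accepts include=description as its query string.
    query = urlencode({"include": "description"})
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?{query}",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


# IDs copied from the "Full job record" table for this posting.
urls = record_urls(
    job_id="652333dbc5dbac68ba217d18b8a1b7b0c2b4f441",
    org_id="56d4b9f8-0f99-4b81-ba78-73783d191a7c",
    source_id="a73ca118-e5e6-4cde-81e2-9b30cfd2b72d",
)
for name, url in urls.items():
    print(f"{name}: GET {url}")
```

Keeping URL construction separate from the HTTP call makes it easy to plug the result into whichever client and authentication scheme the API documentation specifies.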