Red Teaming Expert

Mpathic2 · Active · $30 / hour · BambooHR

Job facts

Field | Value
Company | Mpathic2
Title | Red Teaming Expert
Normalized title | -
Department / team | Experts
Location | Seattle, WA, United States
Work model | -
Employment type | Contract
Salary | $30 / hour
Status | active
ATS provider | BambooHR
Posted / first seen | 2026-04-29 / 2026-05-30
Changed / last seen | 2026-05-30 / 2026-06-06

Related slices

Page | What it contains
Company jobs | Active postings from Mpathic2.
Company breakdowns | Role, location, ATS, and work model facets for this company.
ATS provider jobs | Active postings observed through BambooHR.
Provider filtered search | The same provider as a filtered job collection.
City jobs | Active postings in Seattle.
Department jobs | Active postings in Experts.
Lifecycle events | Open, update, close, and reopen events for this posting.
Original posting | Canonical source or apply URL captured from the ATS.

Linked records

Company: Mpathic2
Source: b1af6ab1-26b4-4778-a1f4-d8ae41a6f240
ATS provider: BambooHR

Description

About mpathic.ai

Keeping the human in AI. mpathic is a trusted leader in advancing quality and safety in AI systems through expert-led evaluation and human data. We partner with leading technology companies to support red teaming, trust & safety, expert annotation, and model evaluation across high-stakes domains.

About the Role

mpathic is seeking part-time, project-based Red Teaming Experts to support a red-teaming and evaluation campaign focused on AI safety and model behavior in sensitive, real-world interactions.

In this role, you will design, simulate, and evaluate conversations with AI systems to assess safety, risk, and behavioral performance. You will identify failure modes, edge cases, and policy gaps, particularly in scenarios involving distress, ambiguity, or escalation.

This role involves roleplaying and reviewing clinical scenarios with AI agents. As such, we are ideally seeking candidates who bring creative or performance-driven strengths, as these competencies enhance the realism, nuance, and emotional depth needed for AI safety testing. Examples of these can include, but are not limited to:

- Theatre degrees or studies
- Acting, theatre, improv, or voice-over experience
- Strong writing skills, especially dialogue or scenario writing
- Experience creating or inhabiting characters (e.g., performers, TTRPG roleplay, narrative designers)
- Conversational design, interaction writing, or scripted roleplay experience
- Participation in gaming, interactive storytelling, or digital communities where roleplay is common

What You’ll Be Working On

You will help identify, prevent, and characterize risks that emerge when users interact with AI systems.

Responsibilities may include:

- Designing and executing red-teaming scenarios across diverse user behaviors
- Reviewing AI-generated responses for safety, accuracy, and policy compliance
- Identifying failure modes, edge cases, and behavioral risks
- Assessing whether AI appropriately recognizes and responds to distress or escalation
- Evaluating tone, boundaries, and appropriateness in sensitive interactions
- Detecting misleading, overconfident, or unsafe responses
- Evaluating multi-turn conversations for consistency and risk handling
- Identifying gaps in responses, including missed signals or incomplete handling
- Conducting qualitative analysis to identify behavioral patterns and system weaknesses
- Documenting edge cases, failure patterns, and safety risks
- Applying or contributing to evaluation rubrics, taxonomies, and frameworks
- Supporting quality assurance (QA) to ensure consistency across evaluations
- Collaborating with internal teams on AI safety and evaluation improvements
- Participating in red teaming exercises to surface system vulnerabilities
- Maintaining strict confidentiality and quality standards

What We’re Looking For

Successful candidates are detail-oriented, analytically strong, and experienced in evaluating or stress-testing AI systems in complex or high-risk scenarios.

Professional experience in one or more of the following:

- LLM red teaming or AI safety evaluation
- Trust & safety, content moderation, or policy enforcement
- AI/ML evaluation, annotation, or QA workflows
- Conversational analysis or behavioral risk assessment
- Work involving sensitive or high-stakes user interactions

Strong understanding of:

- AI safety principles and common failure modes
- Behavioral risk, escalation patterns, and edge-case handling
- Mental health sensitivity, boundaries, and responsible AI behavior
- How users express distress, confusion, or harmful intent in conversation

Ability to identify:

- Safety violations and policy gaps
- Missed or mishandled risk signals
- Unsafe, misleading, or overconfident responses
- Inappropriate tone or boundary-setting
- Failures in escalation, de-escalation, or resolution
- Inconsistencies across multi-turn interactions

Experience with or Interest in:

- Red teaming methodologies and adversarial testing
- Evaluating conversational AI systems or chatbots
- Developing or applying evaluation frameworks and rubrics
- Understanding how AI systems perform under real user behavior

Comfort with:

- Tech tools and platforms (Slack, spreadsheets, dashboards)
- Evaluating AI-generated responses (no coding required, but must be tech-comfortable)
- Ambiguity, iteration, and feedback-driven workflows

Willingness to:

- Sign NDAs and work with sensitive or high-impact content

Nice to Have (Not Required)

- Background in mental health, behavioral science, or psychology
- Experience in QA, annotation, or qualitative analysis
- Experience with AI systems in sensitive domains (e.g., healthcare, safety)
- Familiarity with evaluation metrics or safety frameworks

Compensation

$30-60/hour, depending on experience and specific project tasks/difficulty

Full job record

Job ID: 2e73108d54307d25a65ea85208ee08ea7e7850b9
Org ID: a49c47a0-5ae6-4084-89b9-187ae791ed8b
Source ID: b1af6ab1-26b4-4778-a1f4-d8ae41a6f240
Board ID: b1af6ab1-26b4-4778-a1f4-d8ae41a6f240
Provider: bamboohr
Provider Job Key: 74
Title: Red Teaming Expert
Normalized Title:
Status: active
Active: yes
Location Text:
Department: Experts
Team:
Employment Type: contract
Workplace Type:
Remote Policy:
Country: United States
Region: WA
City: Seattle
Salary Raw: Compensation $30-60/hour, depending on experience and specific project tasks/difficulty
Salary Min: 30
Salary Max:
Salary Currency: USD
Salary Period: hour
Source URL: https://mpathic2.bamboohr.com/careers/74
Apply URL: https://mpathic2.bamboohr.com/careers/74
First Seen At: 2026-05-30 06:04:22Z
Last Seen At: 2026-06-06 09:39:29Z
Last Checked At: 2026-06-06 09:39:29Z
Last Changed At: 2026-05-30 06:04:22Z
Inactive At:
Source Posted At: 2026-04-29 00:00:00Z
Source Updated At:
Raw Payload Uri: s3://job-postings-prod-raw-590183727216/raw/provider=bamboohr/board=mpathic2/date=2026-06-06/2026-06-06T09-39-28-431Z-48285aeccaa3acf5233909cb80ca672f5c387545ccdedcf0178dd8e5287f8d09.json
Event Fields
{
  "content_hash": "0b84b4e91bd95783f352f29e3dbfd60713b0ac198cbd51af77990acc5497635b",
  "source_hash": "4bd6e15f43b088dee20262dc65c8e85b206a1c1b8c1f09a48969cdd05978bbec",
  "last_changed_at": "2026-05-30T06:04:22.595Z",
  "active_status": "active"
}
Parsed Structured
{
  "language": "en",
  "location": {
    "raw": "Seattle, Washington, United States",
    "city": "Seattle",
    "region": "WA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.8
  },
  "salary_max": null,
  "salary_min": 30,
  "inferred_at": "2026-06-06T09:39:29.915Z",
  "launch_scope": {
    "reason": "bamboohr_production_catalog",
    "included": true,
    "location": {
      "raw": "Seattle, Washington, United States",
      "city": "Seattle",
      "region": "WA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.8
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": null,
  "salary_period": "hour",
  "workplace_type": null,
  "salary_currency": "USD"
}
Extensions
{}
Native Structured
{
  "list_job": {
    "id": "74",
    "isRemote": null,
    "location": {
      "city": null,
      "state": null
    },
    "atsLocation": {
      "city": "Seattle",
      "state": "Washington",
      "country": "United States",
      "province": null
    },
    "departmentId": "18634",
    "locationType": "1",
    "jobOpeningName": "Red Teaming Expert",
    "departmentLabel": "Experts",
    "employmentStatusLabel": "Contractor"
  },
  "detail_errors": [],
  "detail_job_opening": {
    "location": {
      "city": null,
      "state": null,
      "postalCode": null,
      "addressCountry": null
    },
    "datePosted": "2026-04-29",
    "atsLocation": {
      "city": "Seattle",
      "state": "Washington",
      "country": "United States",
      "countryId": "1"
    },
    "description": "<ul></ul>\n<p><span style=\"font-family: Inter, sans-serif; font-size: 12pt; font-weight: bold\">About mpathic.ai</span></p>\n<p><span style=\"font-size: 10pt\">Keeping the human in AI. mpathic is a trusted leader in advancing quality and safety in AI systems through expert-led evaluation and human data. We partner with leading technology companies to support red teaming, trust &amp; safety, expert annotation, and model evaluation across high-stakes domains.</span></p>\n<p><br></p>\n<p><span style=\"font-family: Inter, sans-serif; font-size: 12pt; font-weight: bold\">About the Role</span></p>\n<p><span style=\"font-size: 10pt\">mpathic is seeking <span style=\"font-weight: bold\">part-time, project-based Red Teaming Experts</span> to support a red-teaming and evaluation campaign focused on AI safety and model behavior in sensitive, real-world interactions.</span></p>\n<p><br><br></p>\n<p><span style=\"font-size: 10pt\">In this role, you will design, simulate, and evaluate conversations with AI systems to assess safety, risk, and behavioral performance. You will identify failure modes, edge cases, and policy gaps—particularly in scenarios involving distress, ambiguity, or escalation.</span><br></p>\n<p><br><br></p>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt\"><span style=\"font-weight: bold\">This role involves roleplaying and reviewing clinical scenarios with AI agents. </span>As such, we are ideally seeking candidates who bring <span style=\"font-weight: bold\">creative or performance-driven strengths</span>, as these competencies enhance the realism, nuance, and emotional depth needed for AI safety testing. 
Examples of these can include, but are not limited to: </span></p>\n<ul>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Theatre degrees or studies</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Acting, theatre, improv, or voice-over experience </span></li>\n</ul>\n<ul>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Strong writing skills, especially dialogue or scenario writing </span></li>\n</ul>\n<ul>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Experience creating or inhabiting characters (e.g., performers, TTRPG roleplay, narrative designers) </span></li>\n</ul>\n<ul>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Conversational design, interaction writing, or scripted roleplay experience </span></li>\n</ul>\n<ul>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Participation in gaming, interactive storytelling, or digital communities where roleplay is common </span></li>\n</ul>\n<p><br></p>\n<p><span style=\"font-family: Inter, sans-serif; font-size: 10pt\"><span style=\"font-family: Inter, sans-serif; font-size: 12pt; font-weight: bold\">What You’ll Be Working On </span></span></p>\n<p><span style=\"font-size: 10pt\">You will help identify, prevent, and characterize risks that emerge when users interact with AI systems.</span></p>\n<p><br><br></p>\n<p><span style=\"font-size: 10pt\">Responsibilities may include:</span></p>\n<ul>\n<li><span style=\"font-size: 10pt\">Designing and executing red-teaming scenarios across diverse user behaviors</span></li>\n<li><span style=\"font-size: 10pt\">Reviewing AI-generated responses for safety, accuracy, and policy compliance</span></li>\n<li><span style=\"font-size: 10pt\">Identifying failure modes, edge cases, and behavioral risks</span></li>\n<li><span style=\"font-size: 10pt\">Assessing whether AI appropriately recognizes and responds to distress or 
escalation</span></li>\n<li><span style=\"font-size: 10pt\">Evaluating tone, boundaries, and appropriateness in sensitive interactions</span></li>\n<li><span style=\"font-size: 10pt\">Detecting misleading, overconfident, or unsafe responses</span></li>\n<li><span style=\"font-size: 10pt\">Evaluating multi-turn conversations for consistency and risk handling</span></li>\n<li><span style=\"font-size: 10pt\">Identifying gaps in responses, including missed signals or incomplete handling</span></li>\n<li><span style=\"font-size: 10pt\">Conducting qualitative analysis to identify behavioral patterns and system weaknesses</span></li>\n<li><span style=\"font-size: 10pt\">Documenting edge cases, failure patterns, and safety risks</span></li>\n<li><span style=\"font-size: 10pt\">Applying or contributing to evaluation rubrics, taxonomies, and frameworks</span></li>\n<li><span style=\"font-size: 10pt\">Supporting quality assurance (QA) to ensure consistency across evaluations</span></li>\n<li><span style=\"font-size: 10pt\">Collaborating with internal teams on AI safety and evaluation improvements</span></li>\n<li><span style=\"font-size: 10pt\">Participating in red teaming exercises to surface system vulnerabilities</span></li>\n<li><span style=\"font-size: 10pt\">Maintaining strict confidentiality and quality standards</span></li>\n</ul>\n<p><br></p>\n<p><span style=\"font-weight: bold\">What We’re Looking For</span></p>\n<p><span style=\"font-size: 10pt\">Successful candidates are detail-oriented, analytically strong, and experienced in evaluating or stress-testing AI systems in complex or high-risk scenarios.</span></p>\n<p><br><br></p>\n<p><span style=\"font-size: 10pt\"><span style=\"font-weight: bold\">Professional experience in one or more of the following:</span></span></p>\n<ul>\n<li><span style=\"font-size: 10pt\">LLM red teaming or AI safety evaluation</span></li>\n<li><span style=\"font-size: 10pt\">Trust &amp; safety, content moderation, or policy 
enforcement</span></li>\n<li><span style=\"font-size: 10pt\">AI/ML evaluation, annotation, or QA workflows</span></li>\n<li><span style=\"font-size: 10pt\">Conversational analysis or behavioral risk assessment</span></li>\n<li><span style=\"font-size: 10pt\">Work involving sensitive or high-stakes user interactions</span></li>\n</ul>\n<p><span style=\"font-size: 10pt\"><span style=\"font-weight: bold\">Strong understanding of:</span></span></p>\n<ul>\n<li><span style=\"font-size: 10pt\">AI safety principles and common failure modes</span></li>\n<li><span style=\"font-size: 10pt\">Behavioral risk, escalation patterns, and edge-case handling</span></li>\n<li><span style=\"font-size: 10pt\">Mental health sensitivity, boundaries, and responsible AI behavior</span></li>\n<li><span style=\"font-size: 10pt\">How users express distress, confusion, or harmful intent in conversation</span></li>\n</ul>\n<p><span style=\"font-size: 10pt\"><span style=\"font-weight: bold\">Ability to identify:</span></span></p>\n<ul>\n<li><span style=\"font-size: 10pt\">Safety violations and policy gaps</span></li>\n<li><span style=\"font-size: 10pt\">Missed or mishandled risk signals</span></li>\n<li><span style=\"font-size: 10pt\">Unsafe, misleading, or overconfident responses</span></li>\n<li><span style=\"font-size: 10pt\">Inappropriate tone or boundary-setting</span></li>\n<li><span style=\"font-size: 10pt\">Failures in escalation, de-escalation, or resolution</span></li>\n<li><span style=\"font-size: 10pt\">Inconsistencies across multi-turn interactions</span></li>\n</ul>\n<p><span style=\"font-size: 10pt\"><span style=\"font-weight: bold\">Experience with or Interest in:</span></span></p>\n<ul>\n<li><span style=\"font-size: 10pt\">Red teaming methodologies and adversarial testing</span></li>\n<li><span style=\"font-size: 10pt\">Evaluating conversational AI systems or chatbots</span></li>\n<li><span style=\"font-size: 10pt\">Developing or applying evaluation frameworks and 
rubrics</span></li>\n<li><span style=\"font-size: 10pt\">Understanding how AI systems perform under real user behavior</span></li>\n</ul>\n<p><span style=\"font-size: 10pt\"><span style=\"font-weight: bold\">Comfort with:</span></span></p>\n<ul>\n<li><span style=\"font-size: 10pt\">Tech tools and platforms (Slack, spreadsheets, dashboards)</span></li>\n<li><span style=\"font-size: 10pt\">Evaluating AI-generated responses (no coding required, but must be tech-comfortable)</span></li>\n<li><span style=\"font-size: 10pt\">Ambiguity, iteration, and feedback-driven workflows</span></li>\n</ul>\n<p><span style=\"font-size: 10pt\"><span style=\"font-weight: bold\">Willingness to:</span></span></p>\n<ul>\n<li><span style=\"font-size: 10pt\">Sign NDAs and work with sensitive or high-impact content</span></li>\n</ul>\n<p><span style=\"font-size: 10pt\"><span style=\"font-weight: bold\">Nice to Have (Not Required)</span></span></p>\n<ul>\n<li><span style=\"font-size: 10pt\">Background in mental health, behavioral science, or psychology</span></li>\n<li><span style=\"font-size: 10pt\">Experience in QA, annotation, or qualitative analysis</span></li>\n<li><span style=\"font-size: 10pt\">Experience with AI systems in sensitive domains (e.g., healthcare, safety)</span></li>\n<li><span style=\"font-size: 10pt\">Familiarity with evaluation metrics or safety frameworks</span></li>\n</ul>\n<p><br></p>\n<p><span style=\"font-size: 10pt; font-weight: bold\">Compensation</span></p>\n<p><span style=\"font-size: 10pt\">$30-60/hour, depending on experience and specific project tasks/difficulty</span></p>",
    "compensation": "$30–$40/hour depending on project difficulty",
    "departmentId": "18634",
    "locationType": "1",
    "seekPromoted": false,
    "jobCategoryId": null,
    "jobOpeningName": "Red Teaming Expert",
    "departmentLabel": "Experts",
    "jobOpeningStatus": "Open",
    "minimumExperience": "Mid-level",
    "jobOpeningShareUrl": "https://mpathic2.bamboohr.com/careers/74",
    "employmentStatusLabel": "Contractor"
  }
}
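The record above keeps the salary as free text (Salary Raw) alongside a parsed Salary Min of 30 and an empty Salary Max. As a rough illustration only, the sketch below shows one way a range and period could be pulled out of such a string. The helper `parse_salary` and its regular expression are hypothetical and are not the parser bluedoor uses; they assume a dollar amount, an optional upper bound after a hyphen or en dash, and a `/period` suffix.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern (an assumption, not bluedoor's parser):
# "$<min>[-<max>]/<period>", with optional spaces and an optional "$" on the max.
_SALARY_RE = re.compile(
    r"\$\s*(\d+(?:\.\d+)?)"                    # lower bound, e.g. "$30"
    r"\s*(?:[-–]\s*\$?\s*(\d+(?:\.\d+)?))?"    # optional upper bound, e.g. "-60"
    r"\s*/\s*(hour|day|week|month|year)",      # period, e.g. "/hour"
    re.IGNORECASE,
)


def parse_salary(raw: str) -> Optional[Tuple[float, Optional[float], str]]:
    """Return (min, max, period) from a free-text salary, or None if no match."""
    match = _SALARY_RE.search(raw)
    if match is None:
        return None
    low = float(match.group(1))
    high = float(match.group(2)) if match.group(2) else None
    return low, high, match.group(3).lower()


if __name__ == "__main__":
    raw = (
        "Compensation $30-60/hour, depending on experience "
        "and specific project tasks/difficulty"
    )
    print(parse_salary(raw))
```

A single-value string such as "$30 / hour" yields a minimum with no maximum under this sketch, which is the shape the record shows for Salary Min and Salary Max.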
Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/2e73108d54307d25a65ea85208ee08ea7e7850b9?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/a49c47a0-5ae6-4084-89b9-187ae791ed8b
GET https://api.bluedoor.sh/job-postings/v1/sources/b1af6ab1-26b4-4778-a1f4-d8ae41a6f240
GET https://api.bluedoor.sh/job-postings/v1/jobs/2e73108d54307d25a65ea85208ee08ea7e7850b9/events
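The four requests above can be assembled from the Job ID, Org ID, and Source ID in the full job record. The following is a minimal sketch using only the Python standard library. The URL paths and the `include=description` parameter come from this page; the helper names `build_urls` and `fetch_json` are hypothetical, and any authentication header the API may require is an assumption left to the caller.

```python
import json
from typing import Any, Dict, Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Base path taken from the GET lines on this page.
BASE_URL = "https://api.bluedoor.sh/job-postings/v1"


def build_urls(job_id: str, org_id: str, source_id: str) -> Dict[str, str]:
    """Build the four request URLs listed on this page (hypothetical helper)."""
    query = urlencode({"include": "description"})
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?{query}",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


def fetch_json(url: str, headers: Optional[Dict[str, str]] = None,
               timeout: float = 10.0) -> Any:
    """GET a URL and decode the JSON body.

    Assumption: if the API needs credentials, the caller passes them in
    `headers`; this page does not say which header, if any, is expected.
    """
    request = Request(url, headers=headers or {}, method="GET")
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    urls = build_urls(
        job_id="2e73108d54307d25a65ea85208ee08ea7e7850b9",
        org_id="a49c47a0-5ae6-4084-89b9-187ae791ed8b",
        source_id="b1af6ab1-26b4-4778-a1f4-d8ae41a6f240",
    )
    for name, url in urls.items():
        print(name, url)
```

Calling `fetch_json(urls["job"])` would issue the first request; the response shape is not documented on this page, so the sketch returns the decoded JSON as-is.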