bluedoor data · Job Postings API · bluedoor.sh

Foundation Model DevOps Engineer

Ifm Us · Sunnyvale, CA · On Site · Deleted · $150,000–$350,000 / year · Lever

Job facts

Field: Value
Company: Ifm Us
Title: Foundation Model DevOps Engineer
Normalized title: -
Department / team: Product
Location: Sunnyvale, CA, United States
Work model: On Site
Employment type: Full Time
Salary: $150,000–$350,000 / year
Status: deleted
ATS provider: Lever
Posted / first seen: 2026-01-16 / 2026-05-29
Changed / last seen: 2026-06-03 / 2026-06-01

Related slices

Page: What it contains
Company jobs: Active postings from Ifm Us.
Company breakdowns: Role, location, ATS, and work model facets for this company.
ATS provider jobs: Active postings observed through Lever.
Provider filtered search: The same provider as a filtered job collection.
City jobs: Active postings in Sunnyvale.
Work model jobs: Active On Site postings.
Lifecycle events: Open, update, close, and reopen events for this posting.
Original posting: Canonical source or apply URL captured from the ATS.

Linked records

Company: Ifm Us
Source: 4d111a77-38db-4b88-84a8-24f761a495a9
ATS provider: Lever

Description

About the Institute of Foundation Models

We are a dedicated research lab for building, understanding, using, and risk-managing foundation models. Our mandate is to advance research, nurture the next generation of AI builders, and drive transformative contributions to a knowledge-driven economy.

As part of our team, you’ll have the opportunity to work on the core of cutting-edge foundation model training, alongside world-class researchers, data scientists, and engineers, tackling the most fundamental and impactful challenges in AI development. You will participate in the development of groundbreaking AI solutions that have the potential to reshape entire industries. Strategic and innovative problem-solving skills will be instrumental in establishing MBZUAI as a global hub for high-performance computing in deep learning, driving impactful discoveries that inspire the next generation of AI pioneers.

The Role

We are seeking a Foundation Model DevOps Engineer focused on Operational Stability to serve as the backbone of our AI research infrastructure. You will be designing the friction-free environment that allows our models to be built. Your mandate is to build the tooling, release pipelines, and storage policies that remove drag on our research team. You will own the "foundational layer", ensuring that our researchers have immediate, secure, and reliable access to the tools, data, and compute they need.

Key Responsibilities

Model Release Engineering

· High-Fidelity Release Management: You own the standard of our public presence. You ensure that every release (weights, code, training logs, data) is reproducible, meticulously documented, and packaged with the polish of a top-tier open-source product.
· CI/CD for Research: Design and implement pipelines that automate the testing and packaging of complex model releases, moving us away from manual handovers to automated verification.
· Repo Administration: Administer the organization’s GitHub Enterprise account, ensuring branch protection and clean versioning practices are enforced across the lab.

Resource Management & Infrastructure Efficiency

· Compute Governance: Manage the efficiency of our large-scale GPU resources. You track utilization to identify idle nodes, "zombie jobs," or inefficient scheduling, ensuring we extract maximum value from our compute clusters.
· Storage Strategy & Hygiene: Manage the lifecycle of petabyte-scale datasets and checkpoint storage. You implement intelligent aging policies to solve the "disk full" bottleneck without risking critical data loss.
· Quota & Access Logic: Proactively manage storage and compute quotas across research teams to prevent resource contention before it blocks a training run.

Research Tooling & Orchestration

· Experiment Management Systems: Build and maintain the internal CLI tools and dashboards that allow researchers to launch, track, and organize jobs across thousands of GPUs.
· Resource Telemetry: Set up real-time monitoring for interconnect throughput, GPU memory, and file system latency to catch performance degradation instantly.
· Job Orchestration: Work closely with infrastructure teams to optimize how we run synthetic data pipelines and large-scale evaluations, ensuring our tooling scales with our compute.

Research Environment Provisioning

· Automated Workspace Setup: Build the scripts and tooling that instantly provision compute environments, permissions, and storage namespaces for researchers (automating away the manual work).
· Cluster Access Architecture: Streamline SSH and node access protocols to ensure friction-free entry to our massive-scale compute clusters while maintaining security boundaries.

Academic Qualifications

A bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.

Professional Experience - Minimum (The Bar)

· 3+ years of experience in DevOps, Release Engineering, or MLE, specifically within AI/ML or HPC environments.
· Foundation Model Fluency: You understand the lifecycle of training large models (LLMs or Diffusion). You know what a checkpoint is, you understand the difference between pre-training and inference, and you are familiar with the artifacts required for a model release.
· Linux/Unix Fluency: You live in the command line. You have deep expertise in bash scripting, file system permissions, and SSH configuration.
· Version Control Admin: Expert-level administration of GitHub Enterprise (managing teams, API limits, and repository security).
· Scripting & Automation: Proficiency in Python or Bash to automate repetitive administrative tasks.

Professional Experience - Preferred (The Fit)

· "Gold Standard" Open Source: Experience contributing to or managing high-profile open-source releases (Hugging Face libraries, model families, datasets).
· HPC Schedulers: Deep understanding of Slurm job scheduling and troubleshooting.
· Cloud Storage: Familiarity with cloud storage buckets (S3/GCP) and efficient data transfer tools.

Visa Sponsorship

This position is eligible for visa sponsorship.

Benefits Include

· Comprehensive medical, dental, and vision benefits
· Bonus
· 401K Plan
· Generous paid time off, sick leave and holidays
· Paid Parental Leave
· Employee Assistance Program
· Life insurance and disability

Full job record

Job ID: 2917dd291327fefaf5a804a08ef5bdd6e69e9e6f
Org ID: bb7fb7ce-62b9-4ed3-9327-02a3c7b7e5d0
Source ID: 4d111a77-38db-4b88-84a8-24f761a495a9
Board ID: 4d111a77-38db-4b88-84a8-24f761a495a9
Provider: lever
Provider Job Key: ef55ac40-dd7d-4e94-b212-2601d62458e4
Title: Foundation Model DevOps Engineer
Normalized Title: -
Status: deleted
Active: no
Location Text: Sunnyvale, CA
Department: -
Team: Product
Employment Type: Full-time
Workplace Type: on_site
Remote Policy: -
Country: United States
Region: CA
City: Sunnyvale
Salary Raw: USD 150000-350000 per-year-salary
Salary Min: 150,000
Salary Max: 350,000
Salary Currency: USD
Salary Period: year
Source URL: https://jobs.lever.co/ifm-us/ef55ac40-dd7d-4e94-b212-2601d62458e4
Apply URL: https://jobs.lever.co/ifm-us/ef55ac40-dd7d-4e94-b212-2601d62458e4/apply
First Seen At: 2026-05-29 06:59:53Z
Last Seen At: 2026-06-01 10:58:05Z
Last Checked At: 2026-06-03 12:27:19Z
Last Changed At: 2026-06-03 12:27:19Z
Inactive At: 2026-06-03 12:27:19Z
Source Posted At: 2026-01-16 21:27:32Z
Source Updated At: -
Raw Payload Uri: s3://bluework-jobs-prod-raw-590183727216/raw/provider=lever/board=ifm-us/date=2026-06-01/2026-06-01T10-58-04-347Z-e0995ade570b11d5d44c124e2705dfedbfd8f49b2d4d6a9e9dd417533da7da65.json
Event Fields
{
  "content_hash": "18ef6a864273855e63d8b09b4cb26217cdc3002b08c9177697f68fe3e3878973",
  "source_hash": "2dd6bd778ea1f7ef5886c37ee8194843eb75924230b88675fb5e520c408e8968",
  "last_changed_at": "2026-06-03T12:27:19.440Z",
  "active_status": "deleted"
}
Parsed Structured
{
  "language": "en",
  "location": {
    "raw": "Sunnyvale, CA",
    "city": "Sunnyvale",
    "region": "CA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.9
  },
  "salary_max": 350000,
  "salary_min": 150000,
  "inferred_at": "2026-06-01T10:58:05.168Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "Sunnyvale, CA",
      "city": "Sunnyvale",
      "region": "CA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.9
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": null,
  "salary_period": "year",
  "workplace_type": "on_site",
  "salary_currency": "USD"
}
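The salary values in the parsed record match the Salary Raw string "USD 150000-350000 per-year-salary". The page does not document how that string is parsed, so the following is only a minimal sketch of one way to split it. The regular expression and the `parse_salary_raw` helper are hypothetical, and the output keys are simply named after the parsed fields shown in this record.

```python
import re
from typing import Optional

# Hypothetical pattern for strings shaped like "USD 150000-350000 per-year-salary".
# The real parsing rules are not shown on this page; this shape is an assumption.
_SALARY_RAW = re.compile(
    r"^(?P<currency>[A-Z]{3}) (?P<min>\d+)-(?P<max>\d+) per-(?P<period>[a-z]+)-salary$"
)


def parse_salary_raw(raw: str) -> Optional[dict]:
    """Split a raw salary string into fields named like the parsed record's keys."""
    match = _SALARY_RAW.match(raw.strip())
    if match is None:
        # Unrecognized shape: return nothing rather than guess at the values.
        return None
    return {
        "salary_currency": match["currency"],
        "salary_min": int(match["min"]),
        "salary_max": int(match["max"]),
        "salary_period": match["period"],
    }


parsed = parse_salary_raw("USD 150000-350000 per-year-salary")
print(parsed)
```

Returning `None` for an unrecognized string keeps a malformed salary from being recorded as a real number.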
Extensions
{}
Native Structured
{
  "lists": [],
  "country": "US",
  "createdAt": 1768598852367,
  "updatedAt": null,
  "categories": {
    "team": "Product",
    "location": "Sunnyvale, CA",
    "commitment": "Full-time",
    "allLocations": [
      "Sunnyvale, CA"
    ]
  },
  "salaryRange": {
    "max": 350000,
    "min": 150000,
    "currency": "USD",
    "interval": "per-year-salary"
  },
  "workplaceType": "onsite"
}
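The native `createdAt` value looks like a Unix timestamp in milliseconds. Assuming that interpretation, it converts to the Source Posted At time listed in the job record. The `epoch_ms_to_utc` helper below is a hypothetical name used only for this illustration.

```python
from datetime import datetime, timezone


def epoch_ms_to_utc(ms: int) -> str:
    """Format a millisecond Unix timestamp as 'YYYY-MM-DD HH:MM:SSZ' in UTC."""
    # Integer division drops the millisecond part, matching the record's precision.
    moment = datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%SZ")


created_at = epoch_ms_to_utc(1768598852367)
print(created_at)  # 2026-01-16 21:27:32Z
```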
Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/2917dd291327fefaf5a804a08ef5bdd6e69e9e6f?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/bb7fb7ce-62b9-4ed3-9327-02a3c7b7e5d0
GET https://api.bluedoor.sh/job-postings/v1/sources/4d111a77-38db-4b88-84a8-24f761a495a9
GET https://api.bluedoor.sh/job-postings/v1/jobs/2917dd291327fefaf5a804a08ef5bdd6e69e9e6f/events
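The four requests above can be assembled from the Job ID, Org ID, and Source ID in the job record. The sketch below only builds the URLs and sends nothing, because the page does not state what authentication, headers, or response shape the API uses. The `page_urls` helper is a hypothetical name; the paths and the `include=description` parameter are copied from the requests listed above.

```python
from urllib.parse import urlencode

BASE = "https://api.bluedoor.sh/job-postings/v1"


def page_urls(job_id: str, org_id: str, source_id: str) -> list:
    """Return the four GET URLs listed for a job page, in the same order."""
    description_query = urlencode({"include": "description"})
    return [
        f"{BASE}/jobs/{job_id}?{description_query}",
        f"{BASE}/orgs/{org_id}",
        f"{BASE}/sources/{source_id}",
        f"{BASE}/jobs/{job_id}/events",
    ]


urls = page_urls(
    job_id="2917dd291327fefaf5a804a08ef5bdd6e69e9e6f",
    org_id="bb7fb7ce-62b9-4ed3-9327-02a3c7b7e5d0",
    source_id="4d111a77-38db-4b88-84a8-24f761a495a9",
)
for url in urls:
    print("GET", url)
```

Any HTTP client can then issue the requests once the API's access requirements are known.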