AI Platform Engineer, Training and Inference

Saviynt · Hybrid · Deleted · Lever

Job facts

Field	Value
Company	Saviynt
Title	AI Platform Engineer, Training and Inference
Normalized title	-
Department / team	Engineering / Software Engineering
Location	United States
Work model	Hybrid / Hybrid
Employment type	Full Time
Salary	-
Status	deleted
ATS provider	Lever
Posted / first seen	2026-05-18 / 2026-05-29
Changed / last seen	2026-06-04 / 2026-06-02

Linked records

Company	Saviynt
Source	322a8a04-9d59-41b4-a4b7-6ed15ff4cb29
ATS provider	Lever

Description

AI Platform Engineer – Training & Inference

Saviynt's AI-powered identity platform manages and governs human and non-human access to all of an organization's applications, data, and business processes. Customers trust Saviynt to safeguard their digital assets, drive operational efficiency, and reduce compliance costs. Built for the AI age, Saviynt helps organizations safely accelerate their deployment and usage of AI. Saviynt is recognized as the leader in identity security, with solutions that protect and empower the world's leading brands, Fortune 500 companies, and government institutions. For more information, please visit www.saviynt.com.

The AI Platform team is building the compute layer that trains, evaluates, and serves every AI model at Saviynt. We need an ML Platform Engineer to own distributed training on Ray + H100s, the multi-engine LLM inference mesh (vLLM, SGLang, NVIDIA Triton), and the full model promotion lifecycle, from shadow mode through canary rollout to GA.

The AI Platform team's mission is to build a secure, scalable, product-agnostic AI foundation that enables Saviynt's identity products to deliver measurable AI-powered outcomes. Training & Inference is the engine: it turns data into deployed models that make Saviynt's products smarter.

What You Will Be Doing

• Own the Ray ecosystem end-to-end: manage KubeRay on GKE, tune Ray Core Task/Actor scheduling, operate the Plasma distributed object store, and configure Ray Data for GPU-direct streaming from GCS/S3
• Operate distributed training with Ray Train: configure TorchTrainer + DDP/NCCL for multi-node H100 clusters, manage checkpoint lifecycle, implement spot-preemption recovery, and integrate warm-start fine-tuning for retrain pipelines
• Build and operate the LLM inference mesh with Ray Serve: compose vLLM (PagedAttention), SGLang (RadixAttention), and NVIDIA Triton (TensorRT/ONNX) as a unified deployment graph with Plasma zero-copy memory sharing
• Optimise inference performance: configure fractional GPU allocation, enable continuous batching, implement per-engine autoscaling based on request queue depth, and tune KV-cache block sizes
• Design and operate the model routing layer: capability-based, version-based, and tenant-based routing with cost-aware fallback between self-hosted SLMs and cloud LLMs
• Build RL training infrastructure: define Flyte workflows for RL pipelines (rollout, reward shaping, policy update, evaluation), integrate Ray RLlib or custom PPO/GRPO loops with Ray Train, and manage replay buffer persistence on GCS
• Operate the full model promotion lifecycle: quality gate → integration tests → load tests (k6) → shadow mode → A/B gate → canary (10%→100%) with golden-signal auto-rollback
• Operate the retrain pipeline: drift detection triggers, warm-start retraining, relative quality gates (V2 >= V1 − 2%), and an automated Flyte DAG through to canary
• Integrate RAG retrieval into the inference mesh: vector similarity search, context assembly, and prompt construction before LLM inference

What You Bring

• Experience in ML engineering with time in an ML platform or MLOps role
• Production Ray depth: Ray Train, Serve, Core, and Data, including debugging real production failures such as NCCL timeouts, Plasma OOM, and Serve autoscaling lag
• LLM serving engines: hands-on with vLLM, SGLang, or NVIDIA Triton, with PagedAttention, prefix caching, and continuous batching tuned for latency/throughput targets
• Distributed training: DDP, FSDP, NCCL collectives, gradient checkpointing, and mixed precision (BF16/FP8)
• RL working knowledge: PPO, policy gradient, or RLHF, and the ability to translate an algorithm into distributed compute primitives
• Model lifecycle operations: MLflow registry, shadow/A/B/canary patterns, and auto-rollback on golden-signal degradation
• Vector databases: pgvector or Qdrant, including ANN index strategies, embedding upsert, and query latency tuning under inference load
• Strong Python and PyTorch; Flyte or an equivalent ML orchestrator
• Quantization (nice to have): INT8/INT4/FP8 post-training quantization (GPTQ, AWQ, or bitsandbytes)
• Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical or military experience

We offer you a competitive total rewards package, learning, and tremendous opportunities to grow and advance in your career.

At Saviynt, it is not typical for an individual to be hired at or near the top of the range for their role, and final compensation decisions depend on many factors including, but not limited to, location; skill sets; experience and training; licensure and certifications; and other relevant business and organizational needs. You may also be eligible to participate in a Saviynt discretionary bonus plan, subject to the rules governing the program, whereby an award, if any, depends on various factors, including, without limitation, individual and organizational performance.
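
For readers unfamiliar with the promotion terms in the posting, the relative quality gate (V2 >= V1 − 2%) and the canary ramp (10%→100%) with auto-rollback might look roughly like the sketch below. This is an illustration only, not Saviynt's implementation. The function names, the reading of "2%" as 2% of the baseline score, the assumption that a higher score is better, and the intermediate ramp steps of 25% and 50% are all hypothetical; the posting names only the 10% and 100% endpoints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    passed: bool
    threshold: float


def relative_quality_gate(candidate: float, baseline: float,
                          tolerance: float = 0.02) -> GateResult:
    """Pass when the candidate score is at most `tolerance` below the baseline.

    Assumption: "V1 - 2%" is read as 2% of the baseline score, and a
    higher score means a better model.
    """
    threshold = baseline * (1.0 - tolerance)
    return GateResult(passed=candidate >= threshold, threshold=threshold)


def next_canary_weight(current: int, healthy: bool,
                       steps: tuple = (10, 25, 50, 100)) -> int:
    """Return the next traffic percentage for the canary.

    Assumption: unhealthy golden signals roll traffic back to 0%, and the
    intermediate steps (25, 50) are hypothetical.
    """
    if not healthy:
        return 0
    for step in steps:
        if step > current:
            return step
    return current  # already at the final step


if __name__ == "__main__":
    gate = relative_quality_gate(candidate=0.89, baseline=0.90)
    print(gate.passed)
    print(next_canary_weight(10, healthy=True))
    print(next_canary_weight(50, healthy=False))
```

Under the other plausible reading, where 2% means two absolute percentage points, the threshold would be `baseline - 0.02` instead.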

Full job record

Job ID	e6d884fe650d29253657100e3d0c4859570c9873
Org ID	4d1706c8-4181-4263-b289-dabbcc3bace3
Source ID	322a8a04-9d59-41b4-a4b7-6ed15ff4cb29
Board ID	322a8a04-9d59-41b4-a4b7-6ed15ff4cb29
Provider	lever
Provider Job Key	9a8661ce-8856-4977-87f4-b06567125e28
Title	AI Platform Engineer, Training and Inference
Normalized Title	—
Status	deleted
Active	no
Location Text	—
Department	Engineering
Team	Software Engineering
Employment Type	Full-Time
Workplace Type	hybrid
Remote Policy	hybrid
Country	United States
Region	—
City	—
Salary Raw	—
Salary Min	—
Salary Max	—
Salary Currency	—
Salary Period	—
Source URL	https://jobs.lever.co/saviynt/9a8661ce-8856-4977-87f4-b06567125e28
Apply URL	https://jobs.lever.co/saviynt/9a8661ce-8856-4977-87f4-b06567125e28/apply
First Seen At	2026-05-29 07:00:50Z
Last Seen At	2026-06-02 10:40:18Z
Last Checked At	2026-06-04 11:27:11Z
Last Changed At	2026-06-04 11:27:11Z
Inactive At	2026-06-04 11:27:11Z
Source Posted At	2026-05-18 17:13:41Z
Source Updated At	—
Raw Payload Uri	s3://bluework-jobs-prod-raw-590183727216/raw/provider=lever/board=saviynt/date=2026-06-02/2026-06-02T10-40-17-373Z-01797be889940128d417a21e09277ac79320870ee0e2aa601fe43803358d63de.json

Event Fields

{
  "content_hash": "9198b30fc1ad27ca6b3a71a0a1b64823f5f054b61ac29b7f70239afb3be0541b",
  "source_hash": "8c6e969581829b1dd2fa33953b176b969a0c7a576753e52ddc70c5509077084f",
  "last_changed_at": "2026-06-04T11:27:11.538Z",
  "active_status": "deleted"
}

Parsed Structured

{
  "language": "en",
  "location": {
    "raw": null,
    "city": null,
    "region": null,
    "country": "United States",
    "is_remote": false,
    "confidence": 0.85
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-02T10:40:18.691Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": null,
      "city": null,
      "region": null,
      "country": "United States",
      "is_remote": false,
      "confidence": 0.85
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": "hybrid",
  "salary_period": null,
  "workplace_type": "hybrid",
  "salary_currency": null
}

Extensions

{}

Native Structured

{
  "lists": [],
  "country": "US",
  "createdAt": 1779124421003,
  "updatedAt": null,
  "categories": {
    "team": "Software Engineering",
    "commitment": "Full-Time",
    "department": "Engineering",
    "allLocations": []
  },
  "salaryRange": null,
  "workplaceType": "hybrid"
}

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/e6d884fe650d29253657100e3d0c4859570c9873?include=description

GET https://api.bluedoor.sh/job-postings/v1/orgs/4d1706c8-4181-4263-b289-dabbcc3bace3

GET https://api.bluedoor.sh/job-postings/v1/sources/322a8a04-9d59-41b4-a4b7-6ed15ff4cb29

GET https://api.bluedoor.sh/job-postings/v1/jobs/e6d884fe650d29253657100e3d0c4859570c9873/events
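
As a convenience, the four request URLs for this record can be assembled from the Job ID, Org ID, and Source ID in the record. The sketch below is a minimal illustration that only builds the URLs and sends no requests. The helper name `job_record_urls` is hypothetical. It assumes the endpoint paths are the ones listed in this section with no format-label suffix, and it makes no assumption about authentication, which this page does not describe.

```python
from urllib.parse import urlencode

# Base path taken from the request URLs listed on this page.
BASE_URL = "https://api.bluedoor.sh/job-postings/v1"


def job_record_urls(job_id: str, org_id: str, source_id: str) -> dict:
    """Build the four GET URLs used to reproduce a job record page.

    Hypothetical helper: it only formats strings and performs no I/O.
    """
    query = urlencode({"include": "description"})
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?{query}",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


if __name__ == "__main__":
    urls = job_record_urls(
        job_id="e6d884fe650d29253657100e3d0c4859570c9873",
        org_id="4d1706c8-4181-4263-b289-dabbcc3bace3",
        source_id="322a8a04-9d59-41b4-a4b7-6ed15ff4cb29",
    )
    for name, url in urls.items():
        print(name, url)
```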