AI Infrastructure & Experience Engineer
Focuskpi · Mountain View, CA · On Site · Active · $70 / hour · JazzHR / ApplyToJob
Job facts
| Field | Value |
|---|---|
| Company | Focuskpi |
| Title | AI Infrastructure & Experience Engineer |
| Normalized title | - |
| Department / team | - |
| Location | Mountain View, CA, United States |
| Work model | On Site |
| Employment type | Contract |
| Salary | $70 / hour |
| Status | active |
| ATS provider | JazzHR / ApplyToJob |
| Posted / first seen | 2026-06-05 / 2026-06-06 |
| Changed / last seen | 2026-06-06 / 2026-06-06 |
Linked records
| Field | Value |
|---|---|
| Company | Focuskpi |
| Source | 1572cdfc-9f3a-49b3-bcce-c9f7728afdc1 |
| ATS provider | JazzHR / ApplyToJob |
Description
FocusKPI is seeking an AI Infrastructure & Experience Engineer to join one of our clients, a high-tech SaaS company.
Work Location: Mountain View, CA (Onsite role, 5 days/week onsite)
Duration: 4-month contract
Pay Range: $70 - 79/hr
**No C2C resumes are considered**
Position Responsibilities:
Inference Optimization: Deploy and tune multiple LLMs and generative multimodal models on local inference hardware. Optimize performance metrics (TTFT, tokens/sec) via model quantization, caching strategies, and architecture-specific adjustments.
Systems Engineering & CUDA: Leverage deep knowledge of the CUDA environment to build custom kernels, ensuring maximum utilization of low-cost GPU compute.
Orchestration & Integration: Seamlessly bridge inference backends with orchestration layers (LiteLLM, Ollama, etc.) and frontends like OpenWebUI.
Rapid Prototyping: Build functional, high-fidelity demos showcasing model memory capabilities, agentic workflows, and context-aware web search.
Peripheral Connectivity: Implement communication protocols to bridge local AI compute with peripheral devices, including smart TVs, household appliances, and XR hardware.
Requirements/Technical qualifications:
Recent experience in model optimization is required.
Hardware & Compute: Proven experience with NVIDIA ecosystems and ARM64 architecture.
Systems Programming: Advanced proficiency in C++, Python, and Rust. Deep familiarity with CUDA and the ability to author and debug custom CUDA kernels for compute-intensive tasks.
AI/ML Frameworks: Extensive experience with modern inference engines (llama.cpp, TensorRT-LLM, Ollama) and orchestration frameworks (LiteLLM).
Software Engineering: Robust understanding of asynchronous programming (FastAPI), containerization (Docker/Kubernetes), sandbox environments, and API design for low-latency communication.
Full-Stack Prototyping: Ability to quickly spin up modern frontend UIs (React, Next.js, or similar) to present AI-driven intelligence to end users.
Communication Protocols: Familiarity with WebSockets, gRPC, and REST for device-to-device communication in a local network environment.
Overall mandatory skills required: Recent model optimization experience, inference optimization, NVIDIA ecosystems, custom CUDA kernel development, ARM64 architecture, Python.
Ideal Candidate Profile:
A minimum of 3 years of relevant industry experience is required.
The "Builder" Mindset: You are energized by the prospect of building proofs-of-concept in days rather than months. You thrive in environments where speed and creativity are paramount.
Problem Solver: You approach unsolved, messy engineering challenges with enthusiasm rather than trepidation.
Architectural Vision: You see the "big picture" of how AI becomes part of consumers' daily lives, not just how the model generates text.
Agile & Adaptable: You are comfortable working in a fast-paced environment where priorities shift based on the results of rapid experimentation.
A degree in Computer Science, Machine Learning, or an Artificial Intelligence specialization is preferred, but not required.
**No C2C resumes are considered**
Thank you!
FocusKPI Hiring Team
Founded in 2010, FocusKPI, Inc. (FocusKPI) is a data science and technology firm specializing in predictive analytics practice and methodologies. FocusKPI is a US company headquartered in Silicon Valley, California, with an East Coast office in Boston, Massachusetts.
NOTICE: Please be aware of fraudulent emails regarding job postings, job offers, and fake checks. FocusKPI's recruiting team will reach out only from the @focuskpi.com email domain. If you have received fraudulent emails now or in the past, please report them to https://reportfraud.ftc.gov/ .
The domain @focuskpijobs.com is fraudulent and not related to FocusKPI. Please do not reply to or communicate with anyone using an @focuskpijobs.com address.
Full job record
| Field | Value |
|---|---|
| Job ID | 595545ecc9507b31f649a9e675efd641b9179f0d |
| Org ID | 32d4f88d-09f2-4ba9-a5c4-3a2080cdf9ee |
| Source ID | 1572cdfc-9f3a-49b3-bcce-c9f7728afdc1 |
| Board ID | 1572cdfc-9f3a-49b3-bcce-c9f7728afdc1 |
| Provider | jazzhr |
| Provider Job Key | kRKV4324ft |
| Title | AI Infrastructure & Experience Engineer |
| Normalized Title | — |
| Status | active |
| Active | yes |
| Location Text | Mountain View, CA |
| Department | — |
| Team | — |
| Employment Type | contract |
| Workplace Type | on_site |
| Remote Policy | — |
| Country | United States |
| Region | CA |
| City | Mountain View |
| Salary Raw | Pay Range: $70 - 79/hr **No C2C resumes are considered** Position Responsibilities: Inference |
| Salary Min | 70 |
| Salary Max | — |
| Salary Currency | USD |
| Salary Period | hour |
| Source URL | https://focuskpi.applytojob.com/apply/kRKV4324ft/AI-Infrastructure-Experience-Engineer |
| Apply URL | https://focuskpi.applytojob.com/apply/kRKV4324ft/AI-Infrastructure-Experience-Engineer |
| First Seen At | 2026-06-06 10:34:58Z |
| Last Seen At | 2026-06-06 19:21:46Z |
| Last Checked At | 2026-06-06 19:21:46Z |
| Last Changed At | 2026-06-06 10:34:58Z |
| Inactive At | — |
| Source Posted At | 2026-06-05 00:00:00Z |
| Source Updated At | — |
| Raw Payload Uri | s3://job-postings-prod-raw-590183727216/raw/provider=jazzhr/board=focuskpi/date=2026-06-06/2026-06-06T19-21-45-813Z-d49ee75f35ae62f88e8387ea1bd57d66b1bd634f5559945fbec219372c64b207.json |
Event Fields
{
"content_hash": "d438b61818e684d8fd2d68161ad061950c5c8f770856a86812c07b744d062a82",
"source_hash": "54be4e3afc6aba1c7ebc383af1c11354d413222c716564fe49953d69f158aefb",
"last_changed_at": "2026-06-06T10:34:58.083Z",
"active_status": "active"
}
Parsed Structured
{
"language": "en",
"location": {
"raw": "Mountain View, CA",
"city": "Mountain View",
"region": "CA",
"country": "United States",
"is_remote": false,
"confidence": 0.9
},
"salary_max": null,
"salary_min": 70,
"inferred_at": "2026-06-06T19:21:46.150Z",
"launch_scope": {
"reason": "jazzhr_production_catalog",
"included": true,
"location": {
"raw": "Mountain View, CA",
"city": "Mountain View",
"region": "CA",
"country": "United States",
"is_remote": false,
"confidence": 0.9
},
"countries": [
"United States"
]
},
"remote_policy": null,
"salary_period": "hour",
"workplace_type": "on_site",
"salary_currency": "USD"
}
Extensions
{}
Native Structured
{
"detail": {
"url": "https://focuskpi.applytojob.com/apply/jobs/details/kRKV4324ft?&",
"heading": "AI Infrastructure & Experience Engineer",
"html_title": "JazzHR » Job Listings",
"canonical_url": "https://focuskpi.applytojob.com/apply/kRKV4324ft/AI-Infrastructure-Experience-Engineer",
"description_html": "<p>FocusKPI is seeking an <strong>AI Infrastructure & Experience Engineer </strong>to join one of our clients, a high-tech SaaS company. </p><p><strong><u>Work Location:</u> </strong>Mountain View, CA (Onsite role, 5 days/week onsite)<br><u><strong>Duration:</strong></u> 4-month contract <br><strong><u>Pay Range:</u> </strong>$70 - 79/hr<br><br><u><strong>**No C2C resumes are considered**</strong></u><br> </p><p><u><strong>Position Responsibilities:</strong></u></p><ul><li><strong>Inference Optimization:</strong> Deploy and tune multiple LLMs and generative multimodal models on local inference hardware. Optimize performance metrics (TTFT, tokens/sec) via model quantization, caching strategies, and architecture-specific adjustments.</li><li><strong>Systems Engineering & CUDA:</strong> Leverage deep knowledge of the CUDA environment to build custom kernels, ensuring maximum utilization of the low-cost GPU compute.</li><li><strong>Orchestration & Integration:</strong> Seamlessly bridge inference backends with orchestration layers (LiteLLM, Ollama, etc.) and frontends like OpenWebUI.</li><li><strong>Rapid Prototyping:</strong> Build functional, high-fidelity demos showcasing model memory capabilities, agentic workflows, and context-aware web search.</li><li><strong>Peripheral Connectivity:</strong> Implement communication protocols to bridge local AI compute with peripheral devices, including smart TVs, household appliances, and XR hardware.</li></ul><u><strong>Requirements/Technical qualifications:</strong></u><ul><li>Recent experience in<strong> model optimization is required</strong></li><li><strong>Hardware & Compute:</strong> Proven experience with NVIDIA ecosystems and ARM64 architecture.</li><li><strong>Systems Programming:</strong> Advanced proficiency in C++, Python, and Rust. 
Deep familiarity with CUDA and the ability to author/debug custom CUDA kernels for compute-intensive tasks.</li><li><strong>AI/ML Frameworks:</strong> Extensive experience with modern inference engines (llama.cpp, TensorRT-LLM, Ollama) and orchestration frameworks (LiteLLM).</li><li><strong>Software Engineering:</strong> Robust understanding of asynchronous programming (FastAPI), containerization (Docker/Kubernetes), sandbox environments, and API design for low-latency communication.</li><li><strong>Full-Stack Prototyping:</strong> Ability to quickly spin up modern frontend UIs (React, Next.js, or similar) to present AI-driven intelligence to end users.</li><li><strong>Communication Protocols:</strong> Familiarity with WebSockets, gRPC, and REST for device-to-device communication in a local network environment.</li><li><strong>Overall Mandatory skills required:</strong> Model optimization recent exparience, Interference Optimization, NVIDIA ecosystems, Custom CUDA Kernel Development, ARM64 architecture, Python</li></ul><u><strong>Ideal Candidate Profile:</strong></u><ul><li><strong>A minimum of 3 years of relevant industry experience is required</strong></li><li><strong>The \"Builder\" Mindset:</strong> You are energized by the prospect of building proofs-of-concept in days rather than months. 
You thrive in environments where speed and creativity are paramount.</li><li><strong>Problem Solver:</strong> You approach unsolved, messy engineering challenges with enthusiasm rather than trepidation.</li><li><strong>Architectural Vision:</strong> You see the \"big picture\" of how AI becomes part of consumers' daily lives, not just how the model generates text.</li><li><strong>Agile & Adaptable:</strong> You are comfortable working in a fast-paced environment where priorities shift based on the results of rapid experimentation.</li><li>Degree in<strong> Computer Science, Machine Learning, or Artificial Intelligence Specialization </strong>preferred, but not required</li></ul><br><u><strong>**No C2C resumes are considered**</strong></u><br> <p>Thank you!</p><p>FocusKPI Hiring Team</p><p>Founded in 2010, FocusKPI, Inc. (FocusKPI) is a data science and technology firm specializing in predictive analytics practice and methodologies. FocusKPI is a US company headquartered in Silicon Valley, California, with an East Coast office in Boston, Massachusetts.</p><p> </p>\n\n<p><strong>NOTICE: </strong>Please be aware of fraudulent emails regarding job postings, job offers and fake checks. FocusKPI's recruiting team will strictly reach out via @focuskpi.com email domain. If you have received fraudulent emails now or in the past, please report it to <a href=\\\"https://reportfraud.ftc.gov/\\\">https://reportfraud.ftc.gov/</a> .<br />\nThe domain @focuskpijobs.com is fraudulent and not related to FocusKPI. Please do not not reply or communicate to anyone with @focuskpijobs.com.</p>",
"description_text": "FocusKPI is seeking an AI Infrastructure & Experience Engineer to join one of our clients, a high-tech SaaS company.\n Work Location: Mountain View, CA (Onsite role, 5 days/week onsite)\n Duration: 4-month contract\n Pay Range: $70 - 79/hr\n **No C2C resumes are considered**\n Position Responsibilities:\n Inference Optimization: Deploy and tune multiple LLMs and generative multimodal models on local inference hardware. Optimize performance metrics (TTFT, tokens/sec) via model quantization, caching strategies, and architecture-specific adjustments.\n Systems Engineering & CUDA: Leverage deep knowledge of the CUDA environment to build custom kernels, ensuring maximum utilization of the low-cost GPU compute.\n Orchestration & Integration: Seamlessly bridge inference backends with orchestration layers (LiteLLM, Ollama, etc.) and frontends like OpenWebUI.\n Rapid Prototyping: Build functional, high-fidelity demos showcasing model memory capabilities, agentic workflows, and context-aware web search.\n Peripheral Connectivity: Implement communication protocols to bridge local AI compute with peripheral devices, including smart TVs, household appliances, and XR hardware.\n Requirements/Technical qualifications: Recent experience in model optimization is required\n Hardware & Compute: Proven experience with NVIDIA ecosystems and ARM64 architecture.\n Systems Programming: Advanced proficiency in C++, Python, and Rust. 
Deep familiarity with CUDA and the ability to author/debug custom CUDA kernels for compute-intensive tasks.\n AI/ML Frameworks: Extensive experience with modern inference engines (llama.cpp, TensorRT-LLM, Ollama) and orchestration frameworks (LiteLLM).\n Software Engineering: Robust understanding of asynchronous programming (FastAPI), containerization (Docker/Kubernetes), sandbox environments, and API design for low-latency communication.\n Full-Stack Prototyping: Ability to quickly spin up modern frontend UIs (React, Next.js, or similar) to present AI-driven intelligence to end users.\n Communication Protocols: Familiarity with WebSockets, gRPC, and REST for device-to-device communication in a local network environment.\n Overall Mandatory skills required: Model optimization recent exparience, Interference Optimization, NVIDIA ecosystems, Custom CUDA Kernel Development, ARM64 architecture, Python\n Ideal Candidate Profile: A minimum of 3 years of relevant industry experience is required\n The \"Builder\" Mindset: You are energized by the prospect of building proofs-of-concept in days rather than months. You thrive in environments where speed and creativity are paramount.\n Problem Solver: You approach unsolved, messy engineering challenges with enthusiasm rather than trepidation.\n Architectural Vision: You see the \"big picture\" of how AI becomes part of consumers' daily lives, not just how the model generates text.\n Agile & Adaptable: You are comfortable working in a fast-paced environment where priorities shift based on the results of rapid experimentation.\n Degree in Computer Science, Machine Learning, or Artificial Intelligence Specialization preferred, but not required\n **No C2C resumes are considered**\n Thank you!\n FocusKPI Hiring Team\n Founded in 2010, FocusKPI, Inc. (FocusKPI) is a data science and technology firm specializing in predictive analytics practice and methodologies. 
FocusKPI is a US company headquartered in Silicon Valley, California, with an East Coast office in Boston, Massachusetts.\n NOTICE: Please be aware of fraudulent emails regarding job postings, job offers and fake checks. FocusKPI's recruiting team will strictly reach out via @focuskpi.com email domain. If you have received fraudulent emails now or in the past, please report it to https://reportfraud.ftc.gov/ .\nThe domain @focuskpijobs.com is fraudulent and not related to FocusKPI. Please do not not reply or communicate to anyone with @focuskpijobs.com.",
"jsonld_jobposting": {
"url": "https://focuskpi.applytojob.com/apply/kRKV4324ft/AI-Infrastructure-Experience-Engineer",
"@type": "JobPosting",
"title": "AI Infrastructure & Experience Engineer",
"@context": "http://schema.org/",
"datePosted": "2026-06-05",
"description": "<p>FocusKPI is seeking an <strong>AI Infrastructure & Experience Engineer </strong>to join one of our clients, a high-tech SaaS company. </p><p><strong><u>Work Location:</u> </strong>Mountain View, CA (Onsite role, 5 days/week onsite)<br><u><strong>Duration:</strong></u> 4-month contract <br><strong><u>Pay Range:</u> </strong>$70 - 79/hr<br><br><u><strong>**No C2C resumes are considered**</strong></u><br> </p><p><u><strong>Position Responsibilities:</strong></u></p><ul><li><strong>Inference Optimization:</strong> Deploy and tune multiple LLMs and generative multimodal models on local inference hardware. Optimize performance metrics (TTFT, tokens/sec) via model quantization, caching strategies, and architecture-specific adjustments.</li><li><strong>Systems Engineering & CUDA:</strong> Leverage deep knowledge of the CUDA environment to build custom kernels, ensuring maximum utilization of the low-cost GPU compute.</li><li><strong>Orchestration & Integration:</strong> Seamlessly bridge inference backends with orchestration layers (LiteLLM, Ollama, etc.) and frontends like OpenWebUI.</li><li><strong>Rapid Prototyping:</strong> Build functional, high-fidelity demos showcasing model memory capabilities, agentic workflows, and context-aware web search.</li><li><strong>Peripheral Connectivity:</strong> Implement communication protocols to bridge local AI compute with peripheral devices, including smart TVs, household appliances, and XR hardware.</li></ul><u><strong>Requirements/Technical qualifications:</strong></u><ul><li>Recent experience in<strong> model optimization is required</strong></li><li><strong>Hardware & Compute:</strong> Proven experience with NVIDIA ecosystems and ARM64 architecture.</li><li><strong>Systems Programming:</strong> Advanced proficiency in C++, Python, and Rust. 
Deep familiarity with CUDA and the ability to author/debug custom CUDA kernels for compute-intensive tasks.</li><li><strong>AI/ML Frameworks:</strong> Extensive experience with modern inference engines (llama.cpp, TensorRT-LLM, Ollama) and orchestration frameworks (LiteLLM).</li><li><strong>Software Engineering:</strong> Robust understanding of asynchronous programming (FastAPI), containerization (Docker/Kubernetes), sandbox environments, and API design for low-latency communication.</li><li><strong>Full-Stack Prototyping:</strong> Ability to quickly spin up modern frontend UIs (React, Next.js, or similar) to present AI-driven intelligence to end users.</li><li><strong>Communication Protocols:</strong> Familiarity with WebSockets, gRPC, and REST for device-to-device communication in a local network environment.</li><li><strong>Overall Mandatory skills required:</strong> Model optimization recent exparience, Interference Optimization, NVIDIA ecosystems, Custom CUDA Kernel Development, ARM64 architecture, Python</li></ul><u><strong>Ideal Candidate Profile:</strong></u><ul><li><strong>A minimum of 3 years of relevant industry experience is required</strong></li><li><strong>The \"Builder\" Mindset:</strong> You are energized by the prospect of building proofs-of-concept in days rather than months. 
You thrive in environments where speed and creativity are paramount.</li><li><strong>Problem Solver:</strong> You approach unsolved, messy engineering challenges with enthusiasm rather than trepidation.</li><li><strong>Architectural Vision:</strong> You see the \"big picture\" of how AI becomes part of consumers' daily lives, not just how the model generates text.</li><li><strong>Agile & Adaptable:</strong> You are comfortable working in a fast-paced environment where priorities shift based on the results of rapid experimentation.</li><li>Degree in<strong> Computer Science, Machine Learning, or Artificial Intelligence Specialization </strong>preferred, but not required</li></ul><br><u><strong>**No C2C resumes are considered**</strong></u><br> <p>Thank you!</p><p>FocusKPI Hiring Team</p><p>Founded in 2010, FocusKPI, Inc. (FocusKPI) is a data science and technology firm specializing in predictive analytics practice and methodologies. FocusKPI is a US company headquartered in Silicon Valley, California, with an East Coast office in Boston, Massachusetts.</p><p> </p>\n\n<p><strong>NOTICE: </strong>Please be aware of fraudulent emails regarding job postings, job offers and fake checks. FocusKPI's recruiting team will strictly reach out via @focuskpi.com email domain. If you have received fraudulent emails now or in the past, please report it to <a href=\\\"https://reportfraud.ftc.gov/\\\">https://reportfraud.ftc.gov/</a> .<br />\nThe domain @focuskpijobs.com is fraudulent and not related to FocusKPI. Please do not not reply or communicate to anyone with @focuskpijobs.com.</p>",
"jobLocation": {
"@type": "Place",
"address": {
"@type": "PostalAddress",
"postalCode": "",
"addressRegion": "CA",
"addressLocality": "Mountain View"
}
},
"validThrough": "2026-09-03",
"uniqueJobCode": "job_20260605202412_7AMWASQWIWZXENMD",
"employmentType": "CONTRACTOR",
"hiringOrganization": {
"logo": "https://s3.amazonaws.com/resumator/customer_20140207234534_9FZU4QPSBXKUCYMT/logos/20240613191817_HORIZONTAL_LOGO.jpg",
"name": "FocusKPI Inc.",
"@type": "Organization",
"sameAs": "http://focuskpi.com"
},
"experienceRequirements": "Mid Level"
}
},
"list_job": {
"id": "kRKV4324ft",
"title": "AI Infrastructure & Experience Engineer",
"detailUrl": "https://focuskpi.applytojob.com/apply/jobs/details/kRKV4324ft?&"
},
"detail_errors": []
}
Get this page with API
Rendered from the bluedoor Job Postings API. Reproduce it:
GET https://api.bluedoor.sh/job-postings/v1/jobs/595545ecc9507b31f649a9e675efd641b9179f0d?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/32d4f88d-09f2-4ba9-a5c4-3a2080cdf9ee
GET https://api.bluedoor.sh/job-postings/v1/sources/1572cdfc-9f3a-49b3-bcce-c9f7728afdc1
GET https://api.bluedoor.sh/job-postings/v1/jobs/595545ecc9507b31f649a9e675efd641b9179f0d/events
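The four requests above could be issued from a script along the following lines. This is a minimal sketch, not the API's documented client: the helper names `build_urls` and `fetch_json` are hypothetical, the responses are assumed to be JSON, and any authentication the API may require is not described on this page, so headers are left to the caller.

```python
import json
import urllib.request

# Base URL and identifiers are copied from the requests listed on this page.
BASE_URL = "https://api.bluedoor.sh/job-postings/v1"
JOB_ID = "595545ecc9507b31f649a9e675efd641b9179f0d"
ORG_ID = "32d4f88d-09f2-4ba9-a5c4-3a2080cdf9ee"
SOURCE_ID = "1572cdfc-9f3a-49b3-bcce-c9f7728afdc1"


def build_urls(job_id=JOB_ID, org_id=ORG_ID, source_id=SOURCE_ID):
    """Return the four request URLs, keyed by a short label (hypothetical helper)."""
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?include=description",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


def fetch_json(url, headers=None, timeout=10):
    """GET a URL and decode the body as JSON (hypothetical helper).

    Assumption: the endpoint answers with a JSON body. Authentication
    headers, if the API needs any, must be supplied by the caller.
    """
    request = urllib.request.Request(url, headers=headers or {}, method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Print the URLs only; call fetch_json(url) to perform a real request.
    for label, url in build_urls().items():
        print(label, url)
```

Keeping URL construction separate from the network call makes it possible to check the request targets without contacting the API.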