Principal Site Reliability Engineer
Vespaai · Trondheim, 7011, Norway · On Site · Active · BambooHR
Job facts
| Field | Value |
|---|---|
| Company | Vespaai |
| Title | Principal Site Reliability Engineer |
| Normalized title | - |
| Department / team | Engineering |
| Location | Trondheim |
| Work model | On Site |
| Employment type | Full Time |
| Salary | - |
| Status | active |
| ATS provider | BambooHR |
| Posted / first seen | 2026-06-02 / 2026-06-04 |
| Changed / last seen | 2026-06-04 / 2026-06-06 |
Linked records
| Company | Vespaai |
| Source | 9d200a65-e380-4385-b298-728b197981bc |
| ATS provider | BambooHR |
Description
Does it sound interesting to work on an open source platform managing the data and real-time search and inference for some of the largest companies in the world? Would you thrive on keeping large, globally distributed systems reliable, fast, and observable — and on building the practices and tooling that let a small team operate at massive scale? If so, we want you to join our team at Vespa.ai as a Principal Site Reliability Engineer!
About Vespa.ai:
Vespa.ai is a team of passionate builders. We maintain and develop the Apache 2.0 licensed open-source AI search platform Vespa.
Vespa is a fully featured search engine and vector database. It supports vector search (ANN), lexical search, and structured data search, all in a single query. Integrated machine-learning model inference enables the application of AI to make sense of data in real time. Together with Vespa’s proven scalability and high availability, this empowers you to create production-ready search applications at any scale and with any combination of features. Our users and customers are #1 in e-commerce, content, and financial services globally, and Vespa is used by companies such as Perplexity, Spotify, Yahoo, Wix, and many more.
In addition to our open-source platform, Vespa.ai develops and runs Vespa Cloud, a robust SaaS offering that allows businesses to harness the power of our technology with ease.
At Vespa.ai, we are extremely focused on automating everything we do to grow fast and maintain high quality. In all roles, we scale through technology, not simply by adding larger teams. We take pride in being small, nimble, and the most productive.
Position overview
At Vespa.ai, we embrace DevOps as a company culture, seeking to solve technical problems with automation and code rather than repetitive manual effort. For our Vespa Cloud production systems, we have had this mindset from day one.
We are seeking a Principal Site Reliability Engineer to join our team and help keep Vespa Cloud reliable, fast, and observable at global scale. This is a senior individual contributor role on the team that operates and improves our production systems. You will also help shape and develop our approach to SRE and DevOps as we grow. We are looking for a strong engineer who earns influence through contributions and has the ambition to take on greater responsibility over time. You will also participate in our 24x7 on-call rotation, approximately every third to fourth week.
At our Trondheim office, we work office-first: you will be based on-site most of the time, with the flexibility to work from home/remotely when needed, as agreed with your manager.
Responsibilities
Help ensure the reliability, availability, and performance of Vespa Cloud production systems running globally at scale.
Participate in a 24x7 on-call rotation (approximately every 3rd–4th week), lead incident response, and drive blameless postmortems through to durable fixes.
Help define and track SLOs/SLIs, and build proactive alerting, capacity planning, and remediation strategies.
Design and improve observability — metrics, logging, and tracing — across a large fleet.
Eliminate operational toil by solving problems with automation and code rather than manual effort.
Contribute to, and help shape, our SRE and DevOps practices and culture as the organization grows, sharing knowledge and mentoring across the team.
Work with the rest of the Vespa.ai development team on reliability, scalability, and architecture.
Qualifications
5–10 years building and operating large-scale production systems, with deep SRE/DevOps experience.
Solid programming skills in Java, Python, Go, or similar languages.
Good understanding of sound software engineering principles and practices.
Experience with cloud platforms (AWS, Azure, or GCP).
Solid understanding of networking, operating systems, distributed systems, and security principles.
Proven incident management and on-call experience.
A track record of influencing technical direction and improving how teams work — not just executing tickets.
Excellent problem-solving and analytical skills, and the ability to lead through influence as well as work independently.
Desired Skills
Experience with Infrastructure as Code tools such as Terraform, Tofu, Spacelift, etc.
Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry, ELK).
Experience with CI/CD tooling such as GitHub Actions, Buildkite, etc.
Experience operating data-intensive or stateful systems at scale.
Experience defining SLOs and establishing reliability programs.
Ambitions beyond pure SRE — an interest in growing, over time, into a technical leadership role.
Some of Our Tools and Services
JumpCloud, Google Workspace, and Slack
GitHub Enterprise Cloud (including GitHub Actions)
Jira Cloud and Jira Service Desk
StrongDM, Grafana, Spacelift, and Buildkite
AWS, GCP, and Azure
Why Join Us:
Opportunities for professional growth and development as part of one of Europe’s most exciting start-ups!
Be part of a cutting-edge team working on innovative search and recommendation technology.
Work on a team where we don’t believe in silos between engineers; there aren’t “developers”, “ops people”, and “sysadmins”. We’re all engineers solving problems the smart way together!
Competitive salary and benefits.
Note: Vespa.ai is an equal-opportunity employer. We are committed to creating an inclusive environment for all employees. We believe in fostering a collaborative and inclusive environment where every team member has the opportunity to make a significant impact.
Full job record
| Job ID | 6b308282724e1eac8638f54e27889bc26c7cc5b4 |
| Org ID | f873c7fd-4cc3-425e-8814-3e98b95228ce |
| Source ID | 9d200a65-e380-4385-b298-728b197981bc |
| Board ID | 9d200a65-e380-4385-b298-728b197981bc |
| Provider | bamboohr |
| Provider Job Key | 42 |
| Title | Principal Site Reliability Engineer |
| Normalized Title | — |
| Status | active |
| Active | yes |
| Location Text | Trondheim, 7011, Norway |
| Department | Engineering |
| Team | — |
| Employment Type | full_time |
| Workplace Type | on_site |
| Remote Policy | — |
| Country | — |
| Region | — |
| City | Trondheim |
| Salary Raw | — |
| Salary Min | — |
| Salary Max | — |
| Salary Currency | — |
| Salary Period | — |
| Source URL | https://vespaai.bamboohr.com/careers/42 |
| Apply URL | https://vespaai.bamboohr.com/careers/42 |
| First Seen At | 2026-06-04 11:39:07Z |
| Last Seen At | 2026-06-06 10:27:42Z |
| Last Checked At | 2026-06-06 10:27:42Z |
| Last Changed At | 2026-06-04 11:39:07Z |
| Inactive At | — |
| Source Posted At | 2026-06-02 00:00:00Z |
| Source Updated At | — |
| Raw Payload Uri | s3://job-postings-prod-raw-590183727216/raw/provider=bamboohr/board=vespaai/date=2026-06-06/2026-06-06T10-27-40-953Z-ee386ec2a013beb95415fad3628be0699401685ec87c68c205c695e9e85dd9a2.json |
Event Fields
{
"content_hash": "ab0659e1c2ea9d7cc46bed7cbef1f8be39cbff66ba6ce26b7ee67d649fa3b0be",
"source_hash": "d8d53bde4cd2f0618555de2d45e42400691a8d81a61177c937626ad9eabb5e1b",
"last_changed_at": "2026-06-04T11:39:07.831Z",
"active_status": "active"
}
Parsed Structured
{
"language": "en",
"location": {
"raw": "Trondheim, 7011, Norway",
"city": "Trondheim",
"region": null,
"country": null,
"is_remote": false,
"confidence": 0.8
},
"salary_max": null,
"salary_min": null,
"inferred_at": "2026-06-06T10:27:42.184Z",
"launch_scope": {
"reason": "bamboohr_production_catalog",
"included": true,
"location": {
"raw": "Trondheim, 7011, Norway",
"city": "Trondheim",
"region": null,
"country": null,
"is_remote": false,
"confidence": 0.8
},
"countries": []
},
"remote_policy": null,
"salary_period": null,
"workplace_type": "on_site",
"salary_currency": null
}
Extensions
{}
Native Structured
{
"list_job": {
"id": "42",
"isRemote": null,
"location": {
"city": "Trondheim",
"state": null
},
"atsLocation": {
"city": null,
"state": null,
"country": null,
"province": null
},
"departmentId": "18593",
"locationType": "0",
"jobOpeningName": "Principal Site Reliability Engineer",
"departmentLabel": "Engineering",
"employmentStatusLabel": "Full-Time"
},
"detail_errors": [],
"detail_job_opening": {
"location": {
"city": "Trondheim",
"state": null,
"postalCode": "7011",
"addressCountry": "Norway"
},
"datePosted": "2026-06-02",
"atsLocation": {
"city": null,
"state": null,
"country": null,
"countryId": null
},
"description": "<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Does it sound interesting to work on an open source platform managing the data and real-time search and inference for some of the largest companies in the world? Would you thrive on keeping large, globally distributed systems reliable, fast, and observable — and on building the practices and tooling that let a small team operate at massive scale? If so, we want you to join our team at Vespa.ai as a Principal Site Reliability Engineer!<br><br></span></p>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt; font-weight: bold\">About Vespa.ai:</span><span style=\"font-family: Arial, sans-serif; font-size: 10pt\"><br></span><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Vespa.ai is a team of passionate builders. We maintain and develop the Apache 2.0 licensed open-source AI search platform Vespa. </span></p>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Vespa is a fully featured search engine and vector database. It supports vector search (ANN), lexical search, and structured data search, all in a single query. Integrated machine-learning model inference enables the application of AI to make sense of data in real time. Together with Vespa’s proven scalability and high availability, this empowers to create production-ready search applications at any scale and with any combination of features. 
Our users and customers are #1 in e-commerce, content, and financial services globally, and are used by companies such as Perplexity, Spotify, Yahoo, Wix, and many more.</span></p>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">In addition to our open-source platform, Vespa.ai develops and runs Vespa Cloud, a robust SaaS offering that allows businesses to harness the power of our technology with ease.</span></p>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">At Vespa.ai, we are extremely focused on automating everything we do to grow fast and maintain high quality. In all roles, we scale through technology, not simply by adding larger teams. We take pride in being small, nimble, and the most productive.</span></p>\n<p><br></p>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt; font-weight: bold\">Position overview</span><span style=\"font-family: Arial, sans-serif; font-size: 10pt\"><br></span><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">At Vespa.ai, we embrace DevOps as a company culture, seeking to solve technical problems with automation and code rather than repetitive manual effort. For our Vespa Cloud production systems, we have had this mindset from day one.</span></p>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">We are seeking a Principal Site Reliability Engineer to join our team and help keep Vespa Cloud reliable, fast, and observable at global scale. This is a senior individual contributor role on the team that operates and improves our production systems. You will also help shape and develop our approach to SRE and DevOps as we grow. We are looking for a strong engineer who earns influence through contributions and has the ambition to take on greater responsibility over time. 
You will also participate in our 24x7 on-call rotation, approximately every third to fourth week.</span></p>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">At our Trondheim office, we work office-first: you will be based on-site most of the time, with the flexibility to work from home/remotely when needed, as agreed with your manager.</span></p>\n<p><br></p>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt; font-weight: bold\">Responsibilities</span></p>\n<ul>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Help ensure the reliability, availability, and performance of Vespa Cloud production systems running globally at scale.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Participate in a 24x7 on-call rotation (approximately every 3rd–4th week), lead incident response, and drive blameless postmortems through to durable fixes.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Help define and track SLOs/SLIs, and build proactive alerting, capacity planning, and remediation strategies.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Design and improve observability — metrics, logging, and tracing — across a large fleet.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Eliminate operational toil by solving problems with automation and code rather than manual effort.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Contribute to, and help shape, our SRE and DevOps practices and culture as the organization grows, sharing knowledge and mentoring across the team.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Work with the rest of the Vespa.ai developing team on reliability, scalability, and architecture.<br></span><br></li>\n</ul>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt; font-weight: 
bold\">Qualifications</span></p>\n<ul>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">5–10 years building and operating large-scale production systems, with deep SRE/DevOps experience.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Solid programming skills in Java, Python, Go, or similar languages.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Good understanding of sound software engineering principles and practices.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Experience with cloud platforms (AWS, Azure, or GCP).</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Solid understanding of networking, operating systems, distributed systems, and security principles.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Proven incident management and on-call experience.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">A track record of influencing technical direction and improving how teams work — not just executing tickets.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Excellent problem-solving and analytical skills, and the ability to lead through influence as well as work independently.<br></span><br></li>\n</ul>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt; font-weight: bold\">Desired Skills</span></p>\n<ul>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Experience with Infrastructure as Code tools such as Terraform, Tofu, Spacelift, etc.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry, ELK).</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Experience with CI/CD tooling such as GitHub Actions, Buildkite, etc.</span></li>\n<li><span 
style=\"font-family: Arial, sans-serif; font-size: 10pt\">Experience operating data-intensive or stateful systems at scale.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Experience defining SLOs and establishing reliability programs.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Ambitions beyond pure SRE — an interest in growing, over time, into a technical leadership role.<br><br></span></li>\n</ul>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt; font-weight: bold\">Some of Our Tools and Services</span></p>\n<ul>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">JumpCloud, Google Workspace, and Slack</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">GitHub Enterprise Cloud (including GitHub Actions)</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Jira Cloud and Jira Service Desk</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">StrongDM, Grafana, Spacelift, and Buildkite</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">AWS, GCP, and Azure<br><br></span></li>\n</ul>\n<p><span style=\"font-family: Arial, sans-serif; font-size: 10pt; font-weight: bold\">Why Join Us:</span></p>\n<ul>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Opportunities for professional growth and development as part of one of Europe’s most exciting start-ups!</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Be part of a cutting-edge team working on innovative search and recommendation technology.</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Work on a team where we don’t believe in silos between engineers; there aren’t “developers”, “ops people”, and “sysadmins”. 
We’re all engineers solving problems the smart way together!</span></li>\n<li><span style=\"font-family: Arial, sans-serif; font-size: 10pt\">Competitive salary and benefits.</span></li>\n</ul>\n<p><br></p>\n<p><span style=\"font-family: arial, helvetica, sans-serif; font-size: 10pt\"><span style=\"color: rgb(72, 65, 63); font-weight: bold\">Note: </span><span style=\"color: rgb(72, 65, 63)\">Vespa.ai is an equal-opportunity employer. We are committed to creating an inclusive environment for all employees. We believe in fostering a collaborative and inclusive environment where every team member has the opportunity to make a significant impact.</span></span></p>",
"compensation": null,
"departmentId": "18593",
"locationType": "0",
"seekPromoted": false,
"jobCategoryId": null,
"jobOpeningName": "Principal Site Reliability Engineer",
"departmentLabel": "Engineering",
"jobOpeningStatus": "Open",
"minimumExperience": null,
"jobOpeningShareUrl": "https://vespaai.bamboohr.com/careers/42",
"employmentStatusLabel": "Full-Time"
}
}
Get this page with API
Rendered from the bluedoor Job Postings API. Reproduce it:
GET https://api.bluedoor.sh/job-postings/v1/jobs/6b308282724e1eac8638f54e27889bc26c7cc5b4?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/f873c7fd-4cc3-425e-8814-3e98b95228ce
GET https://api.bluedoor.sh/job-postings/v1/sources/9d200a65-e380-4385-b298-728b197981bc
GET https://api.bluedoor.sh/job-postings/v1/jobs/6b308282724e1eac8638f54e27889bc26c7cc5b4/events
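The requests listed above can also be issued from a script. The following is a minimal sketch using only the Python standard library. It assumes the endpoints are reachable without authentication and respond with JSON, neither of which this page confirms. The helper names `build_urls` and `fetch_json` are illustrative and not part of the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Base URL and identifiers copied from this page.
BASE = "https://api.bluedoor.sh/job-postings/v1"
JOB_ID = "6b308282724e1eac8638f54e27889bc26c7cc5b4"
ORG_ID = "f873c7fd-4cc3-425e-8814-3e98b95228ce"
SOURCE_ID = "9d200a65-e380-4385-b298-728b197981bc"


def build_urls(job_id: str, org_id: str, source_id: str) -> dict[str, str]:
    """Return the four request URLs shown on this page, keyed by record type."""
    return {
        "job": f"{BASE}/jobs/{job_id}?" + urlencode({"include": "description"}),
        "org": f"{BASE}/orgs/{org_id}",
        "source": f"{BASE}/sources/{source_id}",
        "events": f"{BASE}/jobs/{job_id}/events",
    }


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the body as JSON.

    Assumption: no API key or auth header is required. If the service
    does require one, add it to the headers dict below.
    """
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    for name, url in build_urls(JOB_ID, ORG_ID, SOURCE_ID).items():
        payload = fetch_json(url)
        # The response schema is not documented here, so only report its type.
        print(name, type(payload).__name__)
```

Keeping URL construction separate from the network call makes it possible to check the URLs without contacting the service.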