Senior Cloud Engineer
Gridware · San Francisco, CA · Hybrid · Active · $190,000–$215,000 / year · Lever
Job facts
| Field | Value |
|---|---|
| Company | Gridware |
| Title | Senior Cloud Engineer |
| Normalized title | - |
| Department / team | Software |
| Location | San Francisco, CA, United States |
| Work model | Hybrid |
| Employment type | Full Time |
| Salary | $190,000–$215,000 / year |
| Status | active |
| ATS provider | Lever |
| Posted / first seen | 2026-05-07 / 2026-05-29 |
| Changed / last seen | 2026-06-04 / 2026-06-06 |
Linked records
| Field | Value |
|---|---|
| Company | Gridware |
| Source | ab8506f1-0d82-4bde-b310-1fc0dc525c2a |
| ATS provider | Lever |
Description
About Gridware
Gridware is a San Francisco-based technology company dedicated to protecting and enhancing the electrical grid. We pioneered a groundbreaking new class of grid management called active grid response (AGR), focused on monitoring the electrical, physical, and environmental aspects of the grid that affect reliability and safety. Gridware’s advanced Active Grid Response platform uses high-precision sensors to detect potential issues early, enabling proactive maintenance and fault mitigation. This comprehensive approach helps improve safety, reduce outages, and ensure the grid operates efficiently. The company is backed by climate-tech and Silicon Valley investors. For more information, please visit www.Gridware.io.
Role Description
We’re scaling the deployment of critical infrastructure monitoring devices to detect real-world fault events that lead to wildfires. The platform you’ll build and operate ingests millions of events per day from devices in the field, powers customer-facing dashboards and alerting, and supports the data science work that turns raw signals into grid intelligence.
You will own AWS infrastructure, Kubernetes (EKS), CI/CD, and observability end-to-end, partnering with our Cloud Security team to keep the platform safe and compliant, and with backend, firmware, and data teams to keep them shipping fast. As an early member of the DevOps team, you’ll have a direct hand in shaping how Gridware builds, deploys, and runs production systems for years to come.
This describes the ideal candidate; many of us have picked up this expertise along the way. Even if you meet only part of this list, we encourage you to apply!
Benefits
Health, Dental & Vision (Gold and Platinum, with some providers' plans fully covered)
Paid parental leave
Alternating day off (every other Monday)
“Off the Grid”: a two-week paid break each year for all employees.
Commuter allowance
Company-paid training
Responsibilities
Design, build, and operate scalable, secure, and highly available cloud infrastructure across AWS.
Own and evolve our Kubernetes platform, enabling reliable application deployment and operations through GitOps best practices.
Build and maintain CI/CD systems that improve developer velocity, release quality, and operational reliability.
Manage and optimize event-driven infrastructure powering high-volume telemetry and device data pipelines.
Define and maintain Infrastructure as Code standards, ensuring consistency, repeatability, and scalability across environments.
Develop and enhance observability, monitoring, and incident response capabilities to support reliable production operations.
Partner closely with Security and Engineering teams to strengthen platform security, access management, and operational resilience.
Troubleshoot complex production issues, drive root cause analysis, and turn lessons learned into automation, tooling, and operational improvements.
Required Skills
5+ years of experience in DevOps, SRE, or Platform Engineering operating production AWS environments
Deep expertise with Kubernetes (EKS preferred), GitOps workflows (Argo CD/Flux), and Infrastructure as Code (Terraform)
Strong experience building and maintaining CI/CD pipelines, ideally with GitHub Actions
Hands-on experience operating distributed systems and cloud-native platforms (e.g., Kafka/MSK)
Solid understanding of networking, DNS, TLS, identity/access management, and cloud security best practices
Experience with observability, monitoring, and logging tools such as Grafana, Prometheus, Loki, or similar
Strong Linux, scripting, and troubleshooting skills with the ability to debug complex production issues end-to-end
Bonus Skills
Experience operating Apollo Router / GraphQL federation gateways in production.
Experience operating Argo Workflows or similar Kubernetes-native job / pipeline runners in production.
Familiarity with Databricks or ML Ops pipelines for data and model deployment.
Experience designing, operating, and exercising Disaster Recovery (DR) environments, including cross-region replication, backups, and tested failover runbooks.
Experience with Tailscale or other zero-trust networking tools.
Experience supporting IoT / embedded fleets at scale, including secure device-to-cloud connectivity.
Experience in high-growth startup environments where you must wear many hats.
Full job record
| Field | Value |
|---|---|
| Job ID | 3afd1c78801ec253614b10e8bbeca25691c01356 |
| Org ID | 4dbad03a-9aed-4786-9450-f1483b2c9bef |
| Source ID | ab8506f1-0d82-4bde-b310-1fc0dc525c2a |
| Board ID | ab8506f1-0d82-4bde-b310-1fc0dc525c2a |
| Provider | lever |
| Provider Job Key | cebb7035-b458-41cb-8b8a-3989aca21793 |
| Title | Senior Cloud Engineer |
| Normalized Title | — |
| Status | active |
| Active | yes |
| Location Text | San Francisco, CA |
| Department | — |
| Team | Software |
| Employment Type | Full-Time |
| Workplace Type | hybrid |
| Remote Policy | hybrid |
| Country | United States |
| Region | CA |
| City | San Francisco |
| Salary Raw | USD 190000-215000 per-year-salary |
| Salary Min | 190,000 |
| Salary Max | 215,000 |
| Salary Currency | USD |
| Salary Period | year |
| Source URL | https://jobs.lever.co/gridware/cebb7035-b458-41cb-8b8a-3989aca21793 |
| Apply URL | https://jobs.lever.co/gridware/cebb7035-b458-41cb-8b8a-3989aca21793/apply |
| First Seen At | 2026-05-29 07:01:10Z |
| Last Seen At | 2026-06-06 07:56:45Z |
| Last Checked At | 2026-06-06 07:56:45Z |
| Last Changed At | 2026-06-04 11:35:03Z |
| Inactive At | — |
| Source Posted At | 2026-05-07 17:27:09Z |
| Source Updated At | — |
| Raw Payload Uri | s3://job-postings-prod-raw-590183727216/raw/provider=lever/board=gridware/date=2026-06-06/2026-06-06T07-56-45-575Z-bb182561a1935f4edb03a8c1deee89f20440ffe4f97e5ae090582807e89a2064.json |
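The record above carries both raw and normalized values: the `Salary Raw` string alongside the separate salary fields, and `Source Posted At`, which corresponds to the epoch-millisecond `createdAt` value in the native payload. The following is a minimal Python sketch of how such values could be derived. The function names and the regular expression are illustrative assumptions based only on the single example in this record, not the aggregator's actual normalization code.

```python
import re
from datetime import datetime, timezone


def parse_salary_raw(raw: str) -> dict:
    """Split a string shaped like 'USD 190000-215000 per-year-salary'.

    The pattern is inferred from one example and is an assumption;
    other postings may use shapes this does not cover.
    """
    match = re.fullmatch(r"([A-Z]{3}) (\d+)-(\d+) per-(\w+)-salary", raw.strip())
    if match is None:
        raise ValueError(f"unrecognized salary string: {raw!r}")
    currency, low, high, period = match.groups()
    return {"currency": currency, "min": int(low), "max": int(high), "period": period}


def epoch_ms_to_utc(ms: int) -> str:
    """Format an epoch-millisecond timestamp as 'YYYY-MM-DD HH:MM:SSZ' in UTC."""
    # Integer division drops the millisecond part before formatting.
    moment = datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%SZ")


if __name__ == "__main__":
    print(parse_salary_raw("USD 190000-215000 per-year-salary"))
    print(epoch_ms_to_utc(1778174829826))  # 2026-05-07 17:27:09Z
```

Applied to this record, the salary string yields a minimum of 190000 and a maximum of 215000 in USD per year, and the `createdAt` value 1778174829826 formats to the same instant shown in the `Source Posted At` row.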
Event Fields
{
"content_hash": "aad3a320e69fe62442432e699c7d49adac62d80fc4edd20091f2f744a7876bc8",
"source_hash": "dd685c14e827ac4ee8b2b5999cdc22c66c24bd1c9f185dab3e0c373fffe9c942",
"last_changed_at": "2026-06-04T11:35:03.854Z",
"active_status": "active"
}
Parsed Structured
{
"language": "en",
"location": {
"raw": "San Francisco, CA",
"city": "San Francisco",
"region": "CA",
"country": "United States",
"is_remote": false,
"confidence": 0.9
},
"salary_max": 215000,
"salary_min": 190000,
"inferred_at": "2026-06-06T07:56:45.752Z",
"launch_scope": {
"reason": "english_us_canada",
"included": true,
"language": "en",
"location": {
"raw": "San Francisco, CA",
"city": "San Francisco",
"region": "CA",
"country": "United States",
"is_remote": false,
"confidence": 0.9
},
"countries": [
"United States"
]
},
"remote_policy": "hybrid",
"salary_period": "year",
"workplace_type": "hybrid",
"salary_currency": "USD"
}
Extensions
{}
Native Structured
{
"lists": [
{
"text": "Responsibilities ",
"content": "\n<li>Design, build, and operate scalable, secure, and highly available cloud infrastructure across AWS.</li>\n<li>Own and evolve our Kubernetes platform, enabling reliable application deployment and operations through GitOps best practices.</li>\n<li>Build and maintain CI/CD systems that improve developer velocity, release quality, and operational reliability.</li>\n<li>Manage and optimize event-driven infrastructure powering high-volume telemetry and device data pipelines.</li>\n<li>Define and maintain Infrastructure as Code standards, ensuring consistency, repeatability, and scalability across environments.</li>\n<li>Develop and enhance observability, monitoring, and incident response capabilities to support reliable production operations.</li>\n<li>Partner closely with Security and Engineering teams to strengthen platform security, access management, and operational resilience.</li>\n<li>Troubleshoot complex production issues, drive root cause analysis, and turn lessons learned into automation, tooling, and operational improvements.</li>\n"
},
{
"text": "Required Skills",
"content": "\n<li>5+ years of experience in DevOps, SRE, or Platform Engineering operating production AWS environments</li>\n<li>Deep expertise with Kubernetes (EKS preferred), GitOps workflows (Argo CD/Flux), and Infrastructure as Code (Terraform)</li>\n<li>Strong experience building and maintaining CI/CD pipelines, ideally with GitHub Actions</li>\n<li>Hands-on experience operating distributed systems and cloud-native platforms (e.g., Kafka/MSK)</li>\n<li>Solid understanding of networking, DNS, TLS, identity/access management, and cloud security best practices</li>\n<li>Experience with observability, monitoring, and logging tools such as Grafana, Prometheus, Loki, or similar</li>\n<li>Strong Linux, scripting, and troubleshooting skills with the ability to debug complex production issues end-to-end</li>\n"
},
{
"text": "Bonus Skills",
"content": "\n<li>Experience operating Apollo Router / GraphQL federation gateways in production.</li>\n<li>Experience operating Argo Workflows or similar Kubernetes-native job / pipeline runners in production.</li>\n<li>Familiarity with Databricks or ML Ops pipelines for data and model deployment.</li>\n<li>Experience designing, operating, and exercising Disaster Recovery (DR) environments, including cross-region replication, backups, and tested failover runbooks.</li>\n<li>Experience with Tailscale or other zero-trust networking tools.</li>\n<li>Experience supporting IoT / embedded fleets at scale, including secure device-to-cloud connectivity.</li>\n<li>Experience in high-growth startup environments where you must wear many hats.</li>\n"
}
],
"country": "US",
"createdAt": 1778174829826,
"updatedAt": null,
"categories": {
"team": "Software",
"location": "San Francisco, CA",
"commitment": "Full-Time",
"allLocations": [
"San Francisco, CA"
]
},
"salaryRange": {
"max": 215000,
"min": 190000,
"currency": "USD",
"interval": "per-year-salary"
},
"workplaceType": "hybrid"
}
Get this page with API
Rendered from the bluedoor Job Postings API. Reproduce it:
GET https://api.bluedoor.sh/job-postings/v1/jobs/3afd1c78801ec253614b10e8bbeca25691c01356?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/4dbad03a-9aed-4786-9450-f1483b2c9bef
GET https://api.bluedoor.sh/job-postings/v1/sources/ab8506f1-0d82-4bde-b310-1fc0dc525c2a
GET https://api.bluedoor.sh/job-postings/v1/jobs/3afd1c78801ec253614b10e8bbeca25691c01356/events
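The four requests listed above can be assembled from the Job ID, Org ID, and Source ID in the full job record. Below is a minimal Python sketch using only the standard library. The helper names `record_urls` and `fetch_json` are hypothetical, and the sketch assumes the endpoints return JSON to an unauthenticated GET; this page does not say whether an API key or other headers are required.

```python
import json
import urllib.request

BASE_URL = "https://api.bluedoor.sh/job-postings/v1"


def record_urls(job_id: str, org_id: str, source_id: str) -> dict:
    """Build the four request URLs for one posting."""
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?include=description",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the body as JSON.

    Assumes no authentication is needed, which is not confirmed here.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    urls = record_urls(
        job_id="3afd1c78801ec253614b10e8bbeca25691c01356",
        org_id="4dbad03a-9aed-4786-9450-f1483b2c9bef",
        source_id="ab8506f1-0d82-4bde-b310-1fc0dc525c2a",
    )
    for name, url in urls.items():
        print(name, url)
```

Calling `fetch_json(urls["job"])` would issue the first request; the network call is kept separate from URL construction so the URLs can be checked without contacting the service.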