Staff Database Reliability Engineer
Scribe · Remote · Remote · Active · Ashby
Job facts
| Field | Value |
|---|---|
| Company | Scribe |
| Title | Staff Database Reliability Engineer |
| Normalized title | - |
| Department / team | Engineering / Engineering |
| Location | San Francisco, CA, United States |
| Work model | Remote / Remote |
| Employment type | Full Time |
| Salary | - |
| Status | active |
| ATS provider | Ashby |
| Posted / first seen | — / 2026-05-29 |
| Changed / last seen | 2026-05-30 / 2026-06-06 |
Linked records
| Field | Value |
|---|---|
| Company | Scribe |
| Source | f409169b-0da5-492b-86be-2bff4efaa15d |
| ATS provider | Ashby |
Description
About the role
We're hiring a Staff Database Reliability Engineer to own the strategy, architecture, and operational excellence of our data infrastructure. This is an expert-level IC role with deep influence on engineering direction, partnering closely with platform, backend, and DevOps engineers.
Why this role matters
You will own the data tier end-to-end. Design schemas and access patterns that scale, tune Aurora for latency and throughput, and set the standards for how engineers interact with our databases. When a migration script seizes up mid-deploy and writes start queueing behind an ACCESS EXCLUSIVE lock, your runbooks and automation resolve the incident quickly.
Make the Django ORM a strength, not a liability:
Review migrations for safety at scale — locks, backfills, concurrent index builds, NOT VALID constraints
Catch N+1 patterns and missing select_related / prefetch_related in review
Establish conventions for QuerySet usage and physical schema design (indexes, constraints, partitioning)
Scale review through automation, not heroics — author AGENTS.md files and DNA scaffolding that encode our conventions, configure AI review bots (Claude Code, Cursor, etc.) to flag risky migrations and ORM anti-patterns, and iterate on those configs as new failure modes emerge
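To make the N+1 pattern named above concrete, here is a minimal sketch that uses only the standard-library sqlite3 module rather than Django. The `author` and `book` tables, the `CountingDB` wrapper, and both functions are hypothetical and exist only for illustration. The first function issues one query per row, which is roughly what a missing select_related produces; the second fetches the same data with a single JOIN.

```python
import sqlite3


class CountingDB:
    """In-memory SQLite database that counts every query it runs."""

    def __init__(self):
        self.count = 0
        self.conn = sqlite3.connect(":memory:")
        # Hypothetical schema and rows, used only for this illustration.
        self.conn.executescript("""
            CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE book (
                id INTEGER PRIMARY KEY,
                title TEXT,
                author_id INTEGER REFERENCES author(id)
            );
            INSERT INTO author VALUES (1, 'Ada'), (2, 'Grace');
            INSERT INTO book VALUES (1, 'A', 1), (2, 'B', 2), (3, 'C', 1);
        """)

    def run(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


def titles_n_plus_one(db):
    """One query for the books, then one more query per book for its author."""
    books = db.run("SELECT title, author_id FROM book ORDER BY id")
    return [
        (title, db.run("SELECT name FROM author WHERE id = ?", (author_id,))[0][0])
        for title, author_id in books
    ]


def titles_joined(db):
    """Same result in a single query, using a JOIN."""
    return db.run(
        "SELECT b.title, a.name FROM book b "
        "JOIN author a ON a.id = b.author_id ORDER BY b.id"
    )
```

With three books, the first version runs four queries and the second runs one. The gap grows linearly with row count, which is why the pattern is worth catching in review.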
Lead major infrastructure initiatives:
Capacity planning as traffic and engineering throughput grow
Zero-downtime schema migrations and cutovers
Multi-AZ resilience within a single region — Aurora writer/reader placement, failover behavior and RTO/RPO, ElastiCache and OpenSearch AZ topology, RabbitMQ survivability across AZs
Backups, PITR, failover testing, retention
Own the CDC pipeline (Aurora → DMS → S3 Parquet → Snowflake):
DMS task design and tuning, replication slot hygiene on the Postgres side
Schema evolution as Django migrations roll through — so a column rename doesn't silently break the warehouse at 6 AM
Parquet layout and partitioning, reliability of the Snowflake handoff
Automated checks that flag migrations likely to break downstream consumers
Drive observability across three complementary tools:
pganalyze — query-level performance, index advisor, schema insights - the go-to for "why is this ORM query slow"
CloudWatch — infrastructure metrics and alarms for Aurora, OpenSearch, ElastiCache, SQS, DMS
Honeycomb — high-cardinality tracing that ties slow DB calls back to users, flags, deploys, and flows
Shape how the three fit together, including Django-side instrumentation and trace attributes on ORM queries
Build tooling and guardrails:
Migration review automation and CI checks for risky patterns
Slow query pipelines fed from pganalyze
Self-service dashboards so teams understand their own query footprint
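As a rough illustration of what a CI check for risky migration patterns might look like, the sketch below scans raw migration SQL with a few regular expressions. The rule set, the `flag_risky_sql` name, and the naive split on semicolons are all assumptions made for this example; a production check would more likely parse the SQL or inspect Django migration operations directly.

```python
import re

# Hypothetical rules: a regex over one SQL statement plus a short reason.
# The list is illustrative and far from exhaustive.
RISKY_PATTERNS = [
    (
        re.compile(r"\bCREATE\s+(UNIQUE\s+)?INDEX\s+(?!CONCURRENTLY\b)\S", re.I),
        "index build without CONCURRENTLY blocks writes while it runs",
    ),
    (
        re.compile(r"\bADD\s+CONSTRAINT\b(?!.*\bNOT\s+VALID\b)", re.I | re.S),
        "constraint added without NOT VALID checks existing rows while holding its lock",
    ),
    (
        re.compile(r"\bRENAME\s+COLUMN\b", re.I),
        "column rename can break downstream consumers such as a CDC pipeline",
    ),
]


def flag_risky_sql(sql: str) -> list[str]:
    """Return one finding per (statement, rule) match in the given SQL text."""
    findings = []
    # Naive statement split; assumes no semicolons inside string literals.
    for statement in sql.split(";"):
        for pattern, reason in RISKY_PATTERNS:
            if pattern.search(statement):
                findings.append(f"{statement.strip()}: {reason}")
    return findings
```

A CI job could run this over the output of `sqlmigrate` for each new migration and fail the build, or post a review comment, whenever the returned list is non-empty.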
Support and evolve the rest of the stack:
OpenSearch — index design, sharding, mapping changes, reindexing strategy, Django-side indexing pipelines
Redis — caching patterns, eviction, sizing, Django cache framework, Celery/RQ usage, avoiding hot keys and thundering herds
SQS + RabbitMQ — queue design, DLQs, visibility timeouts, exchange/queue topology, AZ mirroring, consumer backpressure, Celery behavior under load
What makes you a great fit
Core expertise:
Deep PostgreSQL — EXPLAIN (ANALYZE, BUFFERS), MVCC, bloat, lock contention, vacuum/autovacuum. Aurora Serverless V2 / Limitless experience strongly preferred (storage model, reader/writer split, ACU scaling)
Strong ORM fluency (Django, SQLAlchemy, ActiveRecord, or similar) — predict the SQL a query will generate, spot N+1 problems on sight, and know how to control eager loading (joins vs. batched IN queries), column projection, aggregations, and subqueries
Single-region multi-AZ design — practical understanding of what it does and doesn't protect against
Data movement and observability:
Production CDC experience, ideally AWS DMS — comfortable with logical replication, slot hygiene, schema evolution, and Parquet-based data lakes feeding Snowflake (or BigQuery/Redshift)
Hands-on with pganalyze (or Datadog DBM / Performance Insights / pg_stat_statements pipelines), CloudWatch (custom metrics, composite alarms, log insights), and Honeycomb (or another high-cardinality tracing tool) — comfortable with OpenTelemetry and opinionated about what makes a trace useful
AI-assisted workflow:
Real experience making AI coding and review tools useful for a team — writing AGENTS.md files, configuring review agents, versioning and iterating on prompts and configs
The rest of the stack:
OpenSearch at scale — sizing, sharding, JVM tuning, rolling upgrades, snapshots
Production Redis — persistence tradeoffs, cluster mode, hot keys, thundering herds
At least one production message broker (SQS, RabbitMQ, Kafka) — delivery semantics, idempotency, failure modes
Engineering and leadership:
Strong automation and IaC background — real code (Python, Go, or similar) and Terraform
Track record leading cross-team initiatives, writing design docs that hold up, influencing without authority
Comfortable in a high-growth environment where the right answer for 50 engineers isn't the right answer for 100
Pragmatic outlook during incidents — focused on preventing the next one
Full-Time US Employee Benefits Include
Some of the nicest and smartest teammates you’ll ever work with
Competitive salaries
Comprehensive healthcare benefits
Exciting and motivating equity
Flexible PTO
401k
Parental Leave
Commuter Benefits (SF office employees)
WFH Stipend
Compensation
We benchmark compensation using trusted market data and apply a tiered geographic framework to ensure competitive pay across locations. The ranges below represent the base salary band for this role by tier. Final offers are determined by experience, scope, internal parity, and location.
$230k-$280k base + equity
We consider several factors when determining compensation, including location, experience, and other job-related factors.
At Scribe, we celebrate our differences and are committed to creating a workplace where all employees feel supported and empowered to do their best work. We believe this benefits not only our employees but our product, customers, and community as well. Scribe is proud to be an Equal Opportunity Employer.
Full job record
| Field | Value |
|---|---|
| Job ID | 4b3c71f6b9f20b66573c310d9ed3d85325d6125b |
| Org ID | 2502c8f0-18d7-480e-ab0f-6a967da9c435 |
| Source ID | f409169b-0da5-492b-86be-2bff4efaa15d |
| Board ID | f409169b-0da5-492b-86be-2bff4efaa15d |
| Provider | ashby |
| Provider Job Key | ccafdcaf-3249-4a85-adb0-7c865dbd045b |
| Title | Staff Database Reliability Engineer |
| Normalized Title | — |
| Status | active |
| Active | yes |
| Location Text | Remote |
| Department | Engineering |
| Team | Engineering |
| Employment Type | full_time |
| Workplace Type | remote |
| Remote Policy | remote |
| Country | United States |
| Region | CA |
| City | San Francisco |
| Salary Raw | — |
| Salary Min | — |
| Salary Max | — |
| Salary Currency | — |
| Salary Period | — |
| Source URL | https://jobs.ashbyhq.com/scribe/ccafdcaf-3249-4a85-adb0-7c865dbd045b |
| Apply URL | https://jobs.ashbyhq.com/scribe/ccafdcaf-3249-4a85-adb0-7c865dbd045b/application |
| First Seen At | 2026-05-29 06:56:12Z |
| Last Seen At | 2026-06-06 09:34:32Z |
| Last Checked At | 2026-06-06 09:34:32Z |
| Last Changed At | 2026-05-30 08:04:05Z |
| Inactive At | — |
| Source Posted At | — |
| Source Updated At | — |
| Raw Payload Uri | s3://job-postings-prod-raw-590183727216/raw/provider=ashby/board=scribe/date=2026-06-06/2026-06-06T09-34-06-069Z-a0a112f7fc97ebd18765b80a7566216f3ff9d7987d13582016dfa4aed16e676f.json |
Event Fields
{
  "content_hash": "e24d87feaa74bd5e68725d8b6d0d65d39edad3a9f22d6d1b8383f90d1ff4bb66",
  "source_hash": "1f4e45a37c990e5b0219464b24ebb74fceeba983a3ca58d5facadd1375de20e9",
  "last_changed_at": "2026-05-30T08:04:05.748Z",
  "active_status": "active"
}
Parsed Structured
{
  "language": "en",
  "location": {
    "raw": "San Francisco",
    "city": "San Francisco",
    "region": "CA",
    "country": "United States",
    "is_remote": true,
    "confidence": 0.75
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-06T09:34:32.743Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "San Francisco",
      "city": "San Francisco",
      "region": "CA",
      "country": "United States",
      "is_remote": true,
      "confidence": 0.75
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": "remote",
  "salary_period": null,
  "workplace_type": "remote",
  "salary_currency": null
}
Extensions
{}
Native Structured
{
  "id": "ccafdcaf-3249-4a85-adb0-7c865dbd045b",
  "team": "Engineering",
  "title": "Staff Database Reliability Engineer",
  "jobUrl": "https://jobs.ashbyhq.com/scribe/ccafdcaf-3249-4a85-adb0-7c865dbd045b",
  "address": null,
  "applyUrl": "https://jobs.ashbyhq.com/scribe/ccafdcaf-3249-4a85-adb0-7c865dbd045b/application",
  "isListed": true,
  "isRemote": true,
  "location": "Remote ",
  "updatedAt": null,
  "apiVersion": "ashby-non-user-graphql-v1",
  "department": "Engineering",
  "publishedAt": null,
  "workplaceType": "Remote",
  "employmentType": "FullTime",
  "secondaryLocations": [
    {
      "location": "San Francisco"
    }
  ]
}
Get this page with API
Rendered from the bluedoor Job Postings API. Reproduce it:
GET https://api.bluedoor.sh/job-postings/v1/jobs/4b3c71f6b9f20b66573c310d9ed3d85325d6125b?include=description
GET https://api.bluedoor.sh/job-postings/v1/orgs/2502c8f0-18d7-480e-ab0f-6a967da9c435
GET https://api.bluedoor.sh/job-postings/v1/sources/f409169b-0da5-492b-86be-2bff4efaa15d
GET https://api.bluedoor.sh/job-postings/v1/jobs/4b3c71f6b9f20b66573c310d9ed3d85325d6125b/events