jamiequint/sf_criminal_court Hugging Face Dataset
Source ID hf_sf_criminal_court. Review the source caveats and join keys before treating context records from this source as court facts.
Source overview
| Source ID | hf_sf_criminal_court |
| Name | jamiequint/sf_criminal_court Hugging Face Dataset |
| Owner | Independent public dataset compiler using SFSC and public agency sources |
| Layer | reference_enrichment |
| Coverage | Public Hugging Face parquet snapshot last modified 2026-05-04 with 13 tables: 77,406 cases, 776,728 ROA rows, 72,289 attorney rows, 318,993 calendar rows, 318,993 calendar+judge-assignment rows, 44,029 SFSC charge-disposition rows, 13,790 inferred SFSC case matches, DA arrest/prosecuted tables, and judicial assignment dimensions. The prototype now loads a bounded 120-case reference extract into the API: 120 cases, 4,079 docket rows, 452 attorney rows, 1,844 judge-enriched hearing rows, 340 charge-disposition rows, and 292 court-charge outcome rows. |
| Formats | Parquet |
| Join keys | case_id, case_number, court_number, filed_date, charge_multiset, department, event_date |
| Caveats | Hugging Face card declares CC-BY-NC-4.0; production/commercial use needs legal review or official-source re-derivation. Cases, ROA, attorneys, and calendar rows are scraped/derived reference data and should not outrank a fresh official SFSC scrape. Rule 10.500 charge-disposition rows are court-owned facts, but public case-number joins are deterministically inferred because the court spreadsheet used anonymized IDs. Judge names on calendar rows are published department assignments, not proof of the actual sitting judge for a specific hearing. |
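The caveat about inferred case-number joins can be illustrated with a minimal sketch. It assumes (this is not confirmed by the source) that an anonymized charge-disposition row is linked to a public case only when the pair of `filed_date` and `charge_multiset` is unique on both sides. The row shapes, the `anon_id` field, the function names, and the sample identifiers are hypothetical.

```python
from collections import Counter, defaultdict


def charge_key(filed_date, charges):
    """Order-insensitive key: filing date plus a multiset of charge codes."""
    multiset = tuple(sorted(Counter(charges).items()))
    return (filed_date, multiset)


def infer_matches(anonymized_rows, public_cases):
    """Return {anon_id: case_number} only where the key is unique on both sides.

    Assumption: ambiguous keys (more than one candidate on either side)
    are left unmatched rather than guessed.
    """
    anon_by_key = defaultdict(list)
    for row in anonymized_rows:
        key = charge_key(row["filed_date"], row["charges"])
        anon_by_key[key].append(row["anon_id"])

    public_by_key = defaultdict(list)
    for case in public_cases:
        key = charge_key(case["filed_date"], case["charges"])
        public_by_key[key].append(case["case_number"])

    matches = {}
    for key, anon_ids in anon_by_key.items():
        case_numbers = public_by_key.get(key, [])
        if len(anon_ids) == 1 and len(case_numbers) == 1:
            matches[anon_ids[0]] = case_numbers[0]
    return matches
```

A match produced this way is an inference, not a court-published link, so downstream records should keep a flag that distinguishes it from an official identifier.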
Linked cases
0 matching cases for this source filter.
Source artifacts
| Artifact ID | Source ID | Artifact Type | Path | URL | Captured At |
|---|---|---|---|---|---|
| - | hf_sf_criminal_court | derived_public_dataset | - | - | - |
Full source record
| Access Mode | parquet_download |
| Cadence | snapshot; last observed scrape in research notes was 2026-03-20 |
| Coverage | Public Hugging Face parquet snapshot last modified 2026-05-04 with 13 tables: 77,406 cases, 776,728 ROA rows, 72,289 attorney rows, 318,993 calendar rows, 318,993 calendar+judge-assignment rows, 44,029 SFSC charge-disposition rows, 13,790 inferred SFSC case matches, DA arrest/prosecuted tables, and judicial assignment dimensions. The prototype now loads a bounded 120-case reference extract into the API: 120 cases, 4,079 docket rows, 452 attorney rows, 1,844 judge-enriched hearing rows, 340 charge-disposition rows, and 292 court-charge outcome rows. |
| Government Level | mixed_public_derived |
| ID | hf_sf_criminal_court |
| Layer | reference_enrichment |
| Name | jamiequint/sf_criminal_court Hugging Face Dataset |
| Owner | Independent public dataset compiler using SFSC and public agency sources |
| Profile Status | promoted_bounded_reference_extract |
| Canonical Records | court_case, court_charge, charge_disposition, prosecution_event, arrest_event, judge_assignment, source_record |
| Caveats | Hugging Face card declares CC-BY-NC-4.0; production/commercial use needs legal review or official-source re-derivation. Cases, ROA, attorneys, and calendar rows are scraped/derived reference data and should not outrank a fresh official SFSC scrape. Rule 10.500 charge-disposition rows are court-owned facts, but public case-number joins are deterministically inferred because the court spreadsheet used anonymized IDs. Judge names on calendar rows are published department assignments, not proof of the actual sitting judge for a specific hearing. |
| Evidence | docs/research/master-findings.md, docs/research/enrichment-findings.md, artifacts/source-discovery/hf-sf-criminal-court.dataset.json, artifacts/source-discovery/hf-sf-criminal-court.README.md, artifacts/source-discovery/hf-sf-criminal-court.file-heads.json, artifacts/source-discovery/hf-sf-criminal-court.remote-parquet-profile.json, data/hf_sf_criminal_court_raw/hf_sf_criminal_court.json, data/hf_sf_criminal_court_raw/manifest.json, scripts/profile_hf_sf_criminal_court.py |
| Formats | Parquet |
| Join Keys | case_id, case_number, court_number, filed_date, charge_multiset, department, event_date |
| Known Endpoints | - |
| Rate Limit Notes | No live court traffic is required for this reference extract. HEAD probes showed all 13 parquet files total roughly 37 MB; DuckDB can profile them remotely or a batch job can cache them offline. Do not query Hugging Face per API request; use scheduled/offline refresh if license policy permits. |
| Source Urls | https://huggingface.co/datasets/jamiequint/sf_criminal_court |
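The rate limit notes ask for scheduled or offline refresh instead of one Hugging Face request per API call. A minimal in-memory sketch of that idea follows; the `SnapshotCache` class, its `fetch` callable, and the seven-day window in the usage note are assumptions for illustration, not part of the prototype.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


class SnapshotCache:
    """Serve a cached snapshot and re-fetch only after max_age has elapsed."""

    def __init__(self, fetch: Callable[[], bytes], max_age: timedelta):
        self._fetch = fetch          # hypothetical downloader supplied by the caller
        self._max_age = max_age
        self._data: Optional[bytes] = None
        self._fetched_at: Optional[datetime] = None

    def get(self, now: Optional[datetime] = None) -> bytes:
        """Return the snapshot, calling fetch() only when the cache is empty or stale."""
        now = now or datetime.now(timezone.utc)
        stale = (
            self._fetched_at is None
            or now - self._fetched_at >= self._max_age
        )
        if stale:
            self._data = self._fetch()
            self._fetched_at = now
        return self._data
```

With `max_age=timedelta(days=7)`, any number of API requests inside the window would trigger at most one download. A production batch job would more likely persist the parquet files to disk and record the refresh time in a manifest.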
Get this page with API
Rendered from the bluedoor SF Superior Court API. Reproduce it:
GET https://api.bluedoor.sh/sf-superior-court/v1/sources/hf_sf_criminal_court
GET https://api.bluedoor.sh/sf-superior-court/v1/case-search?source_id=hf_sf_criminal_court&division=criminal&limit=25&include_facets=true
GET https://api.bluedoor.sh/sf-superior-court/v1/source-artifacts
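The three requests can be assembled programmatically. The sketch below uses only the Python standard library. The base URL, paths, and query parameters come from this page; the function names are hypothetical, and it is an assumption that the endpoints return JSON without authentication.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.bluedoor.sh/sf-superior-court/v1"


def build_request_urls(source_id, division="criminal", limit=25):
    """Return the source, case-search, and source-artifacts URLs for one source."""
    query = urlencode({
        "source_id": source_id,
        "division": division,
        "limit": limit,
        "include_facets": "true",
    })
    return [
        f"{BASE_URL}/sources/{source_id}",
        f"{BASE_URL}/case-search?{query}",
        f"{BASE_URL}/source-artifacts",
    ]


def fetch_json(url, timeout=30):
    """GET a URL and decode the body as JSON (assumes no auth header is needed)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_json` on each URL from `build_request_urls("hf_sf_criminal_court")` would retrieve the data behind this page, provided the API is reachable from the caller's environment.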