Give Claude Code research-grade access to federal data — Census, BLS, FHFA, BEA and more — with full provenance, pinned vintages, and reproducible queries. No scraping. No hallucinations.
$ curl -s https://govdata.ai/install | bash
Pull a pre-materialized cross-agency bundle, expose it to Claude over MCP, ask questions in plain English.
Open kernel. MIT licensed. No account required for local use.
# add to your project
$ pip install substrate
Versioned, sha256-verified DuckDB file. ~2 MB for state_panel.
$ substrate pull state_panel
# → state_panel.duckdb
# 5 tables · 419 rows · signed manifest
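The "sha256-verified" claim can be checked independently of the CLI. The following is a minimal sketch, assuming you have the full hex digest from the bundle's manifest (the manifest excerpt on this page shows only a truncated value). The helper names `sha256_of` and `matches_manifest` are hypothetical and are not part of substrate.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(path, expected_hex):
    """True when the file digest equals the full hex digest from the manifest."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    bundle = Path("state_panel.duckdb")  # file produced by `substrate pull`
    if bundle.exists():
        print(bundle.name, sha256_of(bundle))
```

Reading in chunks keeps memory use flat, which matters little for a ~2 MB bundle but would for larger ones.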
MCP over stdio. Add one entry to your Claude config.
$ substrate serve state_panel.duckdb
# MCP tools exposed:
#   list_tables, describe_table,
#   query_table, aggregate, run_sql
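For the "one entry" in the Claude config, here is a rough sketch of what such an entry might look like. It assumes the `mcpServers` layout that Claude Desktop reads from its JSON config file, and it reuses the server name `govdata` from the `claude mcp add` command shown on this page. The function `mcp_server_entry` is hypothetical, and the entry the installer actually writes may differ.

```python
import json


def mcp_server_entry(bundle_path, name="govdata"):
    """Build a stdio MCP server entry that launches `substrate serve <bundle>`."""
    return {
        "mcpServers": {
            name: {
                "command": "substrate",          # CLI installed via pip
                "args": ["serve", bundle_path],  # same arguments as the shell command
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(mcp_server_entry("state_panel.duckdb"), indent=2))
```

An absolute bundle path is usually the safer choice here, since the client may launch the server from a different working directory.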
| State | Population | Median income | HPI |
|---|---|---|---|
| California | 38,965,193 | $96,334 | 880.4 |
| Texas | 30,503,301 | $76,958 | 505.7 |
| Florida | 22,610,726 | $71,711 | 773.6 |
| New York | 19,571,216 | $84,578 | 977.3 |
| Pennsylvania | 12,961,683 | $76,081 | 585.8 |
| Massachusetts | 7,001,399 | $99,858 | 1149.3 |
Curated cross-agency bundles for instant insight, plus raw single-source bundles so you can build your own. Every bundle uses canonical join keys (state_fips, year, naics), so they ATTACH and JOIN without translation logic.
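The ATTACH-and-JOIN pattern can be sketched as follows. To keep the example dependency-free, it uses Python's built-in `sqlite3` as a stand-in for DuckDB; DuckDB has a similar `ATTACH ... AS alias` statement. The table names `income` and `hpi` are hypothetical, the two rows reuse the California and Texas figures from the sample table above, and the year 2023 is an assumption.

```python
import os
import sqlite3
import tempfile


def make_db(path, ddl, insert_sql, rows):
    """Create a one-table database file, standing in for a pulled bundle."""
    con = sqlite3.connect(path)
    con.execute(ddl)
    con.executemany(insert_sql, rows)
    con.commit()
    con.close()


def join_on_canonical_keys(income_path, hpi_path):
    """Attach both files and join on the shared keys state_fips and year."""
    con = sqlite3.connect(":memory:")
    con.execute("ATTACH DATABASE ? AS a", (income_path,))
    con.execute("ATTACH DATABASE ? AS b", (hpi_path,))
    rows = con.execute(
        """
        SELECT i.state_fips, i.median_income, h.hpi
        FROM a.income AS i
        JOIN b.hpi AS h USING (state_fips, year)
        ORDER BY i.state_fips
        """
    ).fetchall()
    con.close()
    return rows


workdir = tempfile.mkdtemp()
income_path = os.path.join(workdir, "income.db")
hpi_path = os.path.join(workdir, "hpi.db")

# State FIPS codes: 06 = California, 48 = Texas.
make_db(
    income_path,
    "CREATE TABLE income (state_fips TEXT, year INTEGER, median_income INTEGER)",
    "INSERT INTO income VALUES (?, ?, ?)",
    [("06", 2023, 96334), ("48", 2023, 76958)],
)
make_db(
    hpi_path,
    "CREATE TABLE hpi (state_fips TEXT, year INTEGER, hpi REAL)",
    "INSERT INTO hpi VALUES (?, ?, ?)",
    [("06", 2023, 880.4), ("48", 2023, 505.7)],
)

joined = join_on_canonical_keys(income_path, hpi_path)
print(joined)
```

Because both files store `state_fips` as zero-padded text and `year` as an integer, the join needs no casting or lookup table, which is the point of canonical keys.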
Census + BLS + FHFA + BEA at state grain. Population, income, employment, home prices, and per-capita personal income.
$ substrate pull state_panel
Census ACS + CDC PLACES (6 health measures) + Opportunity Insights mobility + USDA food environment, joined at county FIPS.
$ substrate pull county_profile
FHFA HPI + Census ACS income + HUD Fair Market Rents + IRS migration at CBSA grain. 410 metros, with computed price-to-income and rent-burden ratios.
$ substrate pull metro_affordability
CDC PLACES — 13 chronic-disease prevalence measures pivoted into one wide table at county FIPS. Obesity, diabetes, heart disease, mental health, COPD, depression and more.
$ substrate pull county_health
Census ACS 1-Year, state × year (2019, 2021–2023). Population, median household income, and poverty rate. Useful for tracking pandemic-era state trajectories.
$ substrate pull population_panel
DOE LEAD Tool — energy burden (% of household income spent on energy) by county and income bracket across CA, TX, NY, FL, MA. Useful for finding communities where energy costs are a material share of income.
$ substrate pull energy_burden
Mortgage lending activity aggregated by state and year from HMDA. Counts and dollar volumes for originated, approved, and denied applications. Calendar year 2022.
$ substrate pull hmda_lending
Zillow Observed Rent Index (ZORI) by metro × year, paired with the Wharton Residential Land Use Regulation Index (WRLURI). Lets you see rent growth alongside the regulatory friction explaining it.
$ substrate pull rent_trends
Raw BLS QCEW: state employment, wages, establishments. Native columns preserved + canonical state_fips/year/naics.
$ substrate pull bls_qcew_state
Raw Census ACS 1-year: population, median income, home value, rent, poverty, labor force. 8 variables, all states.
$ substrate pull census_acs1_state
Raw FHFA House Price Index, state × quarter for 2023. All-transactions index from FHFA.
$ substrate pull fhfa_hpi_state
Raw BEA Regional CAINC1: state-level personal income for 2023. 51 state rows, native + canonical columns.
$ substrate pull bea_personal_income_state
Raw EIA retail electricity prices, state × period × sector. Time series back to 2001. Useful for energy-cost analysis.
$ substrate pull eia_retail_electricity_state
Raw FRED: 24 headline US macro series back to 2000 — UNRATE, CPI, GDP, mortgage rates, jobless claims, MSPUS, fed funds, treasuries.
$ substrate pull fred_macro_indicators
Raw EIA electricity generation, state × source × sector × period back to 2001. Useful for tracking generation-mix shifts.
$ substrate pull eia_generation_state
Raw CDC PLACES, age-adjusted diabetes prevalence at county grain. Standalone single-measure bundle for à-la-carte querying.
$ substrate pull cdc_places_diabetes
Raw FEMA federal disaster declarations across 50 states + DC. One row per (disaster, state, county) with incident type, dates, and which assistance programs were activated.
$ substrate pull fema_disasters_state
Raw IRS SOI county-to-county migration flows for tax year 2022. One row per (origin_county, dest_county, direction). Inflow/outflow taxpayers, exemptions, and AGI.
$ substrate pull irs_county_migration
Raw USAspending federal awards by state for FY2024. Total dollars awarded, population, and per-capita amount. Useful for analyzing federal dollar flows.
$ substrate pull usaspending_state
Raw USDA Food Environment Atlas: county-level food access, food insecurity, grocery vs fast-food density, SNAP participation, obesity + diabetes.
$ substrate pull usda_food_environment_county
Raw CMS Medicare enrollment, state × month panel for 2023. Total beneficiaries split between original Medicare and Medicare Advantage.
$ substrate pull cms_medicare_enrollment_state
substrate doctor.
Other "stock-market-data-for-AI" services hand you a number. govdata hands you a number plus the connector that pulled it, the timestamp, the params, the SQL hash that joined it, and the cost of any LLM extraction. Reproducible by construction.
Every published bundle ships with a manifest recording per-source vintages and the assertion results that gated the publish. Pin a version, re-run a query in 2030, get the same number.
{
  "name": "state_panel",
  "version": "lifecycle-test",
  "sha256": "a578954e62…",
  "byte_size": 2109440,
  "total_cost_usd": 0.0,
  "vintages": {
    "state_census_acs1": {
      "connector": "census",
      "synced_at": "2026-04-25T21:46:39",
      "params": {
        "geography": "state",
        "year": 2023,
        "variables": ["B01003_001E", "B19013_001E"]
      }
    },
    "state_bls_qcew": { /* … */ },
    "state_fhfa_hpi": { /* … */ },
    "state_bea_income": { /* … */ }
  },
  "asserts": [
    { "name": "state_panel_row_count", "passed": true },
    { "name": "state_fips_well_formed", "passed": true },
    { "name": "population_present", "passed": true },
    { "name": "median_income_plausible", "passed": false,
      "severity": "warn", "failing_count": 1 }
  ]
}
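A manifest like this can be inspected before trusting a bundle. The sketch below works on a trimmed copy of the excerpt above, keeping only fields that are shown in full. It assumes that a failed assertion with severity `warn` does not block publishing and that a failure without a `severity` field does; the excerpt is consistent with that reading but does not state it. All function names are hypothetical.

```python
# Trimmed copy of the manifest excerpt; elided sources are left out.
manifest = {
    "name": "state_panel",
    "version": "lifecycle-test",
    "vintages": {
        "state_census_acs1": {
            "connector": "census",
            "synced_at": "2026-04-25T21:46:39",
        },
    },
    "asserts": [
        {"name": "state_panel_row_count", "passed": True},
        {"name": "state_fips_well_formed", "passed": True},
        {"name": "population_present", "passed": True},
        {"name": "median_income_plausible", "passed": False,
         "severity": "warn", "failing_count": 1},
    ],
}


def failed_asserts(m):
    """All assertions recorded as not passed, whatever their severity."""
    return [a for a in m.get("asserts", []) if not a.get("passed", False)]


def blocking_failures(m):
    """Failures that are not warnings (assumed default severity: error)."""
    return [a for a in failed_asserts(m) if a.get("severity", "error") != "warn"]


def vintage_summary(m):
    """Map each source to the timestamp at which it was synced."""
    return {src: v.get("synced_at") for src, v in m.get("vintages", {}).items()}


if __name__ == "__main__":
    print("warnings:", [a["name"] for a in failed_asserts(manifest)])
    print("blocking:", [a["name"] for a in blocking_failures(manifest)])
    print("vintages:", vintage_summary(manifest))
```

Recording the `synced_at` values alongside a query result is one way to make the pinned vintage explicit in downstream work.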
Three commands, one config block. No account required for the open kernel.
$ curl -s https://govdata.ai/install | bash
One command. Installs uv + substrate, drops the state_panel bundle into ~/.govdata/, and registers an MCP server with Claude Code (or Claude Desktop). Restart your client. Done.
$ pip install substrate
$ substrate pull state_panel
# → ./state_panel.duckdb
# 5 tables · 419 rows · sha256 ✓
$ claude mcp add govdata --scope user \
substrate -- serve ./state_panel.duckdb