Metadata-Version: 2.4
Name: utah-housing
Version: 0.1.0
Summary: Census ACS data pull and fixed effects models for Utah housing research
License: MIT
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.31
Requires-Dist: pandas>=2.0
Requires-Dist: numpy>=1.26
Requires-Dist: statsmodels>=0.14
Requires-Dist: linearmodels>=5.3
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: ipykernel; extra == "dev"
Requires-Dist: jupyterlab; extra == "dev"
Dynamic: license-file

# utah-housing

Census ACS 5-year data fetcher and fixed effects models for Utah housing research.

## Setup

```bash
pip install -e ".[dev]"
export CENSUS_API_KEY=your_key_here  # https://api.census.gov/data/key_signup.html
```
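
A missing key tends to surface later as a confusing HTTP error, so it can help to fail fast before pulling data. The following is a minimal sketch that assumes the fetcher reads the `CENSUS_API_KEY` environment variable exported above; `get_census_api_key` is a hypothetical helper and not part of the package's API.

```python
import os


def get_census_api_key(env=None):
    """Return the Census API key, or raise if it is not set.

    Hypothetical helper for illustration; not part of utah_housing.
    """
    env = os.environ if env is None else env
    key = env.get("CENSUS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CENSUS_API_KEY is not set; request a key at "
            "https://api.census.gov/data/key_signup.html"
        )
    return key
```

Calling `get_census_api_key()` once at the top of a script or notebook makes the failure mode obvious before any requests are sent.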

## Usage

### Pull data

```python
from utah_housing import fetch_all_years

# ACS 5-year vintages 2009 through 2023 (the range end is exclusive)
df = fetch_all_years(years=range(2009, 2024))
df.to_csv("utah_housing_2009_2023.csv", index=False)
```

Pull a single year:

```python
from utah_housing import fetch_year

df_2022 = fetch_year(2022)
```
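
For orientation, here is a rough sketch of the kind of request a tract-level ACS 5-year pull for Utah involves. It only builds the URL and does not reflect how `fetch.py` is actually written: `build_acs5_url` is a hypothetical name, the variable `B25001_001E` (total housing units) is just an example, and the geography clause may need adjusting for some vintages.

```python
from urllib.parse import urlencode

UTAH_FIPS = "49"  # state FIPS code for Utah


def build_acs5_url(year, variables, api_key=None):
    """Build an ACS 5-year query URL for every census tract in Utah.

    Hypothetical helper for illustration; not part of utah_housing.
    """
    base = f"https://api.census.gov/data/{year}/acs/acs5"
    params = [
        ("get", ",".join(["NAME", *variables])),
        ("for", "tract:*"),
        ("in", f"state:{UTAH_FIPS}"),
        ("in", "county:*"),
    ]
    if api_key:
        params.append(("key", api_key))
    # Keep ',', ':' and '*' unescaped so the URL stays readable.
    return f"{base}?{urlencode(params, safe=',:*')}"


print(build_acs5_url(2022, ["B25001_001E"]))
```

The API responds with a JSON array whose first row is the header, which is straightforward to turn into a `pandas.DataFrame`.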

### Run models

```python
import pandas as pd
from utah_housing import run_base_model, run_complex_model, compare_models

df = pd.read_csv("utah_housing_2009_2023.csv")

result1, coef1 = run_base_model(df)
result2, coef2 = run_complex_model(df)

comparison = compare_models(coef1, coef2)
coef1.to_csv("base_results.csv")
coef2.to_csv("complex_results.csv")
```

### Run diagnostics

```python
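import pandas as pd
from utah_housing import COMPLEX_PREDICTORS
from utah_housing.models import run_diagnostics

# Load the panel saved in "Pull data"
df = pd.read_csv("utah_housing_2009_2023.csv")

run_diagnostics(df, COMPLEX_PREDICTORS)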
```

## Models

| Model | Fixed effects | Extra predictor |
|-------|--------------|-----------------|
| Base  | tract + year | — |
| Complex | tract + county×year | `pop_in_occupied_total` |
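
To make "tract + year" fixed effects concrete: on a balanced panel they are equivalent to subtracting tract means and year means (and adding back the grand mean) before running OLS. The sketch below shows this within transformation on a tiny made-up panel using only the standard library. It illustrates the idea behind the base specification only; it is not the package's estimator, and all numbers are invented.

```python
from statistics import fmean

# Tiny balanced panel, made up for illustration: 3 tracts x 3 years.
tract_effect = {"A": 100.0, "B": 250.0, "C": 400.0}
year_effect = {2019: 0.0, 2020: 15.0, 2021: 40.0}
x = {
    ("A", 2019): 1.0, ("A", 2020): 2.0, ("A", 2021): 4.0,
    ("B", 2019): 3.0, ("B", 2020): 3.5, ("B", 2021): 5.0,
    ("C", 2019): 0.5, ("C", 2020): 2.5, ("C", 2021): 3.0,
}
TRUE_SLOPE = 2.0
# Outcome = slope * x + tract effect + year effect (no noise).
y = {k: TRUE_SLOPE * x[k] + tract_effect[k[0]] + year_effect[k[1]] for k in x}


def two_way_demean(values):
    """Subtract tract and year means, add back the grand mean."""
    tracts = {i for i, _ in values}
    years = {t for _, t in values}
    tract_mean = {i: fmean(v for (j, _), v in values.items() if j == i) for i in tracts}
    year_mean = {t: fmean(v for (_, s), v in values.items() if s == t) for t in years}
    grand = fmean(values.values())
    return {(i, t): v - tract_mean[i] - year_mean[t] + grand
            for (i, t), v in values.items()}


x_dm = two_way_demean(x)
y_dm = two_way_demean(y)

# OLS slope on the demeaned data recovers TRUE_SLOPE (up to float rounding).
slope = sum(x_dm[k] * y_dm[k] for k in x) / sum(x_dm[k] ** 2 for k in x)
print(slope)
```

Because the tract and year effects are wiped out by the demeaning, the estimated slope does not depend on how large those effects are.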

**Outcome:** `median_owner_costs_with_mortgage`

**Predictors:**
- `pct_sf_renter_occupied` — share of single-family homes that are renter-occupied (investor proxy)
- `median_household_income` — demand-side income
- `owner_renter_income_gap` — income stratification signal
- `pct_vacant` — market slack
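
Written as equations, the two specifications might look like the following. This is a sketch that assumes a linear model with untransformed variables; the symbols are introduced here for illustration, and the code in `models.py` may use logs, scaling, or a different parameterization. Here $i$ indexes tracts, $t$ years, and $c(i)$ the county containing tract $i$.

```latex
% Base model: tract (\alpha_i) and year (\gamma_t) fixed effects
y_{it} = \beta_1\,\mathrm{SFRent}_{it} + \beta_2\,\mathrm{Income}_{it}
       + \beta_3\,\mathrm{Gap}_{it} + \beta_4\,\mathrm{Vacant}_{it}
       + \alpha_i + \gamma_t + \varepsilon_{it}

% Complex model: tract and county-by-year (\delta_{c(i),t}) fixed effects,
% plus population in occupied housing (\mathrm{Pop}_{it})
y_{it} = \beta_1\,\mathrm{SFRent}_{it} + \beta_2\,\mathrm{Income}_{it}
       + \beta_3\,\mathrm{Gap}_{it} + \beta_4\,\mathrm{Vacant}_{it}
       + \beta_5\,\mathrm{Pop}_{it}
       + \alpha_i + \delta_{c(i),t} + \varepsilon_{it}
```

In this notation $y_{it}$ is `median_owner_costs_with_mortgage`, and $\mathrm{SFRent}$, $\mathrm{Income}$, $\mathrm{Gap}$, $\mathrm{Vacant}$, and $\mathrm{Pop}$ stand for `pct_sf_renter_occupied`, `median_household_income`, `owner_renter_income_gap`, `pct_vacant`, and `pop_in_occupied_total`.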

## Data tables pulled

| Table | Description |
|-------|-------------|
| B25024 | Units in structure |
| B25001 | Total housing units |
| B25002 | Occupancy status |
| B25003 | Tenure (owner vs. renter) |
| B25008 | Population in occupied housing by tenure |
| B25119 | Median household income by tenure |
| B25032 | Units in structure by tenure |
| B25088 | Median selected monthly owner costs by mortgage status |
| B19013 | Median household income |
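
The derived predictors are presumably built from these tables (for example, single-family units by tenure from B25032, vacancy from B25002, and income by tenure from B25119). The sketch below shows one plausible way to compute them for a single tract-year. The input key names, the percent scaling, and the choice of a difference (rather than a ratio) for the income gap are all assumptions, and the numbers are made up.

```python
def derive_predictors(row):
    """Compute model predictors from raw values for one tract-year.

    Hypothetical helper: key names and formulas are assumptions for
    illustration, not the definitions used in utah_housing.
    """
    sf_total = row["sf_owner_occupied"] + row["sf_renter_occupied"]
    total_units = row["total_units"]
    return {
        # Share of occupied single-family homes that are rented, in percent.
        "pct_sf_renter_occupied": (
            100 * row["sf_renter_occupied"] / sf_total if sf_total else None
        ),
        # Share of all housing units that are vacant, in percent.
        "pct_vacant": (
            100 * row["vacant_units"] / total_units if total_units else None
        ),
        # Owner median income minus renter median income, in dollars.
        "owner_renter_income_gap": (
            row["median_income_owner"] - row["median_income_renter"]
        ),
    }


# Made-up values for a single tract-year.
example = {
    "sf_owner_occupied": 800,
    "sf_renter_occupied": 200,
    "vacant_units": 50,
    "total_units": 1250,
    "median_income_owner": 95000,
    "median_income_renter": 52000,
}
print(derive_predictors(example))
```

Returning `None` for a zero denominator keeps empty tracts from raising `ZeroDivisionError`; in a `pandas` pipeline the same role would typically be played by `NaN`.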

## Package layout

```
utah_housing/
├── __init__.py      # public API
├── variables.py     # all ACS variable lists, rename map, model variable sets
├── fetch.py         # Census API fetcher
└── models.py        # fixed effects models + diagnostics
```
