Metadata-Version: 2.4
Name: cityscope
Version: 0.4.0
Summary: Real estate investment research data pipeline for US metros and cities
Project-URL: Homepage, https://github.com/kkarbasi/cityscope
Project-URL: Repository, https://github.com/kkarbasi/cityscope
Project-URL: Issues, https://github.com/kkarbasi/cityscope/issues
Author-email: Kaveh Karbasi <kkarbasi@berkeley.edu>
License-Expression: MIT
License-File: LICENSE
Keywords: bls,census,data-pipeline,real-estate,research,urban
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Office/Business :: Financial :: Investment
Requires-Python: >=3.11
Requires-Dist: click>=8.3.2
Requires-Dist: httpx>=0.28.1
Requires-Dist: openpyxl>=3.1.5
Requires-Dist: pandas>=3.0.2
Requires-Dist: pydantic>=2.13.1
Requires-Dist: pyyaml>=6.0.3
Requires-Dist: rich>=15.0.0
Description-Content-Type: text/markdown

<p align="center">
  <img src="https://raw.githubusercontent.com/kkarbasi/cityscope/main/assets/logo.svg" alt="Cityscope" width="420">
</p>

<p align="center">
  <strong>Find your next real estate market in minutes, not months.</strong>
</p>

<p align="center">
  <img src="https://img.shields.io/badge/python-3.11+-3776ab?logo=python&logoColor=white" alt="Python 3.11+">
  <img src="https://img.shields.io/badge/data-Census%20%7C%20BLS-4f46e5" alt="Data Sources">
  <img src="https://img.shields.io/badge/storage-SQLite-003b57?logo=sqlite&logoColor=white" alt="SQLite">
  <img src="https://img.shields.io/pypi/v/cityscope?color=6366f1&cache=v1" alt="PyPI">
  <img src="https://img.shields.io/github/license/kkarbasi/cityscope" alt="License">
</p>

---

A pip-installable Python package that pulls public government data about every major US metro and city — population growth, job growth, wages, unemployment — and stores it locally in SQLite for analysis. Use it as a **library**, from the **CLI**, or build your own UI on top.

## Install

```bash
pip install cityscope
```

Or with [uv](https://docs.astral.sh/uv/):

```bash
uv add cityscope
```

## Python API

```python
from cityscope import api

# Fetch data from public sources
api.fetch("census_population")
api.fetch("bls_employment", skip_laus=True)

# Query as a DataFrame
df = api.to_dataframe(metric="population_change_pct", geo_type="metro", year=2024)
print(df.sort_values("value", ascending=False).head(10))

# Or as a list of dicts
rows = api.query(metric="employment_change_pct", geo_type="metro", year=2024, limit=10)
for row in rows:
    print(f"{row['name']}: {row['value']:+.1f}% job growth")

# Look up stats for a specific address
report = api.lookup("1600 Amphitheatre Pkwy, Mountain View, CA", auto_fetch=True)
print(f"Metro: {report.metro.name} — {report.metro.metrics}")
print(f"City:  {report.city.name} — population {report.city.population:,}")
print(f"County: {report.county.name}")

# Configure (optional — works with defaults)
api.configure(db_path="my_data.db", min_population=100_000)

# Check what you have
api.status()
api.list_sources()
api.get_geographies(geo_type="metro", min_population=500_000)
```

## CLI

```bash
# Fetch data
cityscope fetch census_population
cityscope fetch bls_employment --skip-laus
cityscope fetch --all

# Query
cityscope query -m population_change_pct -g metro -y 2024
cityscope query -m employment_change_pct -g metro -y 2024 -n 10
cityscope query -m avg_annual_pay -y 2024 -n 15

# Look up stats for an address
cityscope lookup "1600 Amphitheatre Pkwy, Mountain View, CA"
cityscope lookup "123 Main St, Austin, TX" --auto-fetch

# Info
cityscope sources
cityscope status
```

### Commands

| Command | Description |
|---|---|
| `fetch <source>` | Pull data from a source (`census_population`, `bls_employment`) |
| `fetch --all` | Pull from all sources |
| `query` | Query stored data (`-m`, `-g`, `-y`, `--min-pop`, `-n`) |
| `lookup <address>` | Look up stats for a US address (metro + city + county) |
| `sources` | List available data sources |
| `status` | Show fetched data summary |
| `init-config` | Generate default `config/settings.yaml` |

### Address Lookup

The `lookup` command geocodes a US address (via the free Census Geocoder) and returns stats for the enclosing metro, city, and county:

```bash
cityscope lookup "1600 Amphitheatre Pkwy, Mountain View, CA" --auto-fetch
```

With `--auto-fetch`, cityscope fetches any missing data from the source APIs on the fly. This is useful for addresses in counties or smaller cities that the default fetch (metros and cities with 200,000+ residents) does not cover. Fetched results are cached in the local database.
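
The fallback described above follows a familiar cache-aside pattern: check local storage first, call the remote source only on a miss, then store the result. A minimal sketch of that pattern, assuming a plain dict as the cache; `get_or_fetch` and `fake_remote` are hypothetical names for illustration and are not part of cityscope:

```python
from typing import Callable


def get_or_fetch(
    cache: dict[str, dict],
    key: str,
    fetch_remote: Callable[[str], dict],
) -> dict:
    """Return cached stats for `key`, fetching and caching them on a miss."""
    if key not in cache:
        cache[key] = fetch_remote(key)
    return cache[key]


calls: list[str] = []


def fake_remote(key: str) -> dict:
    # Stand-in for a real API call; records each invocation.
    calls.append(key)
    return {"name": key, "population": 12345}  # made-up value


cache: dict[str, dict] = {}
get_or_fetch(cache, "Example County", fake_remote)
get_or_fetch(cache, "Example County", fake_remote)
print(len(calls))  # the remote stand-in is called once, not twice
```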

Global flags: `-v` (verbose logging), `-c PATH` (custom config file).

### Fetch Flags

| Flag | Description |
|---|---|
| `--vintage YEAR` | Override Census vintage year |
| `--min-pop N` | Override population filter (default: 200,000) |
| `--skip-laus` | Skip unemployment rate (avoids BLS API daily limit) |

## What Data You Get

**370+ metros and cities** (200k+ population), each tracked across the **7 metrics** listed below over **5 years** (2020–2024):

| Metric | Source | Description |
|---|---|---|
| `population` | Census PEP/ACS | Total population |
| `population_change_pct` | Census PEP/ACS | Year-over-year population growth % |
| `employment` | BLS QCEW | Total nonfarm jobs |
| `employment_change_pct` | BLS QCEW | Year-over-year job growth % |
| `avg_annual_pay` | BLS QCEW | Average annual pay ($) |
| `avg_weekly_wage` | BLS QCEW | Average weekly wage ($) |
| `unemployment_rate` | BLS LAUS | Annual avg unemployment rate (%) |

All data is **free and in the public domain**, pulled directly from federal APIs.
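
The two `*_change_pct` metrics are described as year-over-year growth percentages. A minimal sketch of that arithmetic, assuming the conventional definition (current value minus previous value, divided by the previous value, times 100). The function name and the sample figures are illustrative and are not taken from cityscope:

```python
def yoy_change_pct(values_by_year: dict[int, float]) -> dict[int, float]:
    """Return the year-over-year percent change for each year with a prior year."""
    changes: dict[int, float] = {}
    for year in sorted(values_by_year):
        prev = values_by_year.get(year - 1)
        if prev is None or prev == 0:
            # No baseline to compare against, so skip this year.
            continue
        changes[year] = (values_by_year[year] - prev) / prev * 100
    return changes


# Made-up population figures for illustration only.
population = {2022: 1_000_000, 2023: 1_020_000, 2024: 1_030_200}
print(yoy_change_pct(population))
```

The first year in a series has no prior value, so it gets no growth figure.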

## Configuration

Configuration is optional; everything works with defaults. To get higher API rate limits, add free API keys:

```yaml
# config/settings.yaml
census:
  api_key: null    # Free: https://api.census.gov/data/key_signup.html
bls:
  api_key: null    # Free: https://data.bls.gov/registrationEngine/
storage:
  db_path: data/cityscope.db
pipeline:
  min_population: 200000
```

Or configure programmatically:

```python
from cityscope import api

api.configure(census_api_key="your_key", bls_api_key="your_key")
```
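
How partial settings combine with the defaults is not spelled out here. One common approach is a recursive merge of overrides onto a defaults dictionary, and the sketch below assumes that approach. `DEFAULTS` mirrors the YAML keys shown above, while `merge_settings` is a hypothetical helper, not a cityscope function:

```python
import copy

DEFAULTS = {
    "census": {"api_key": None},
    "bls": {"api_key": None},
    "storage": {"db_path": "data/cityscope.db"},
    "pipeline": {"min_population": 200000},
}


def merge_settings(defaults: dict, overrides: dict) -> dict:
    """Return a new dict with `overrides` layered on top of `defaults`."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them.
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = value
    return merged


settings = merge_settings(DEFAULTS, {"pipeline": {"min_population": 100000}})
print(settings["pipeline"]["min_population"], settings["storage"]["db_path"])
```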

## Architecture

```
Census API ──┐                           ┌── Python API (cityscope.api)
             ├── Pipeline ── SQLite DB ──┤
BLS QCEW   ──┘   (fetch)                 └── CLI (cityscope)
```

Data flows: **Source → Pipeline → SQLite → API/CLI/your code**.
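
As a rough illustration of the store-then-query step, the sketch below writes a few observations into an in-memory SQLite database and reads them back with the standard `sqlite3` module. The table name `observations`, its columns, the helper `top_by_metric`, and the sample rows are all assumptions made for this example; cityscope's real schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE observations (
        geo_name TEXT,
        geo_type TEXT,
        metric   TEXT,
        year     INTEGER,
        value    REAL,
        PRIMARY KEY (geo_name, metric, year)
    )
    """
)

# Made-up rows standing in for what a pipeline run might store.
rows = [
    ("Metro A", "metro", "population_change_pct", 2024, 2.1),
    ("Metro B", "metro", "population_change_pct", 2024, 0.4),
    ("Metro C", "metro", "population_change_pct", 2024, 3.3),
]
# INSERT OR REPLACE keeps repeated fetches from duplicating rows.
conn.executemany("INSERT OR REPLACE INTO observations VALUES (?, ?, ?, ?, ?)", rows)


def top_by_metric(conn: sqlite3.Connection, metric: str, year: int, limit: int = 10):
    """Return (geo_name, value) pairs for a metric and year, highest value first."""
    cur = conn.execute(
        "SELECT geo_name, value FROM observations "
        "WHERE metric = ? AND year = ? ORDER BY value DESC LIMIT ?",
        (metric, year, limit),
    )
    return cur.fetchall()


print(top_by_metric(conn, "population_change_pct", 2024, limit=2))
```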

## Adding Data Sources

Each source is a self-contained class decorated with `@SourceRegistry.register`:

```python
from cityscope.core.registry import SourceRegistry
from cityscope.core.source import DataSource
from cityscope.core.models import FetchResult

@SourceRegistry.register
class MySource(DataSource):
    source_id = "my_source"
    name = "My Data Source"
    description = "What it provides"

    def fetch(self, **kwargs) -> FetchResult:
        ...
```

Import your module in `src/cityscope/sources/__init__.py`; the source then registers automatically with the CLI, API, and pipeline.
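
For readers curious how decorator-based registration of this kind typically works, here is a self-contained sketch. It is not cityscope's actual `SourceRegistry`: the classes `DemoRegistry` and `DemoSource` and their methods are hypothetical stand-ins that only illustrate the pattern.

```python
class DemoRegistry:
    """Hypothetical registry mapping source_id strings to source classes."""

    _sources: dict[str, type] = {}

    @classmethod
    def register(cls, source_cls: type) -> type:
        # Store the class under its source_id, then return it unchanged
        # so the decorated class is still usable under its own name.
        cls._sources[source_cls.source_id] = source_cls
        return source_cls

    @classmethod
    def get(cls, source_id: str) -> type:
        return cls._sources[source_id]

    @classmethod
    def ids(cls) -> list[str]:
        return sorted(cls._sources)


@DemoRegistry.register
class DemoSource:
    source_id = "demo_source"

    def fetch(self, **kwargs) -> list[dict]:
        # A real source would call an external API here.
        return [{"geo": "example", "value": 1.0}]


print(DemoRegistry.ids())
```

Because registration runs when the class body is executed, importing the module that defines the class is enough to make it discoverable, which is why the import step matters.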

## Dashboard

For a visual dashboard, see [urban-research-ui](https://github.com/kkarbasi/urban-research-ui).

## Roadmap

- [ ] Rent data (HUD Fair Market Rents, Zillow ZORI)
- [ ] Home price index (FHFA HPI)
- [ ] Crime stats (FBI Crime Data Explorer)
- [ ] School quality (NCES)
- [ ] Walkability (EPA Smart Location Database)
- [ ] Migration flows (IRS SOI county-to-county)
- [ ] Neighborhood-level data (Census tract)
- [ ] Composite scoring engine

See [`data_sources.md`](data_sources.md) for research on 50+ public data sources.

## Contributing

Pull requests welcome. The easiest way to contribute is adding a new data source — the plugin architecture makes it straightforward.

## License

MIT

---

Built with [Claude Code](https://claude.ai/claude-code) (Claude Opus 4.6).
