Tutorial 3: Multi-Year Comparisons
cendat can request the same product for several years in a single call, which makes it easy to compare data across years.
Goal: Find Colorado incorporated places with very low poverty rates across recent years.
Step 1: Setup
import os
from cendat import CenDatHelper
from dotenv import load_dotenv
load_dotenv()
cdh = CenDatHelper(key=os.getenv("CENSUS_API_KEY"))
# Request multiple years at once
cdh.list_products(years=[2020, 2021, 2022, 2023], patterns=r"acs/acs5\)")
cdh.set_products()
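The escaped closing parenthesis in the pattern above is what keeps the match narrow. The sketch below illustrates this with Python's `re` module, assuming the pattern is applied as a regex search against product titles that end with the dataset path in parentheses. The titles are hypothetical stand-ins, not actual cendat output.

```python
import re

# Same pattern as in the setup cell: "acs/acs5" followed by a literal ")".
pattern = re.compile(r"acs/acs5\)")

# Hypothetical product titles (assumed format: description, then the
# dataset path in parentheses). Real titles may be worded differently.
titles = [
    "ACS 5-Year Detailed Tables (2022/acs/acs5)",
    "ACS 5-Year Data Profiles (2022/acs/acs5/profile)",
    "ACS 5-Year Subject Tables (2022/acs/acs5/subject)",
]

# Only the title whose path ends right after "acs5" matches, because
# the sub-products have "/profile" or "/subject" before the ")".
matches = [t for t in titles if pattern.search(t)]
print(matches)
```

Without the trailing `\)`, the pattern would also match the profile and subject paths, since both contain `acs/acs5`.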
Step 2: Select Data
cdh.set_groups(["B17001"])
cdh.set_geos(["160"]) # Places
Step 3: Get Data
# Filter to Colorado (state = 08)
response = cdh.get_data(
    include_names=True,
    within={"state": "08"},
)
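For orientation, the selection above (group B17001, places, Colorado, four vintages) corresponds to one Census Data API query per year. The sketch below builds those query URLs by hand. It is an illustration of the public API's URL format, not a description of how cendat constructs its requests, and `build_place_query` is a hypothetical helper introduced only for this example.

```python
from urllib.parse import urlencode

BASE = "https://api.census.gov/data"


def build_place_query(year, group, state_fips):
    """Hypothetical helper: ACS 5-year query for all places in one state."""
    params = {
        "get": f"NAME,group({group})",  # place name plus every variable in the group
        "for": "place:*",               # all places (summary level 160)
        "in": f"state:{state_fips}",    # restricted to one state by FIPS code
    }
    # Keep the API's punctuation readable instead of percent-encoding it.
    return f"{BASE}/{year}/acs/acs5?{urlencode(params, safe=':*(),')}"


# One URL per vintage; an API key would be appended as "&key=..." in practice.
urls = [build_place_query(y, "B17001", "08") for y in (2020, 2021, 2022, 2023)]
for url in urls:
    print(url)
```

Each year is a separate dataset on the API side, which is why a multi-year comparison involves one query per vintage.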
Step 4: Complex Filtering with Tabulate
The where parameter supports multiple conditions:
response.tabulate(
    "NAME",
    "B17001_002E",  # Below poverty
    "B17001_001E",  # Total
    where=[
        "B17001_001E > 1_000",  # Population > 1,000
        "B17001_002E / B17001_001E < 0.01",  # Poverty rate < 1%
        "'CDP' not in NAME",  # Exclude CDPs
    ],
    weight_var="B17001_001E",
    strat_by="vintage",  # Separate results by year
)
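To make the filter logic concrete, here is a plain-Python sketch of the same three conditions, with the surviving places grouped by vintage. It assumes the conditions in the `where` list must all hold for a row to be kept. The rows are made-up illustrative values, not real ACS estimates, and the sketch does not reproduce the weighting that `weight_var` applies.

```python
from collections import defaultdict

# Hypothetical rows (invented numbers, for illustration only).
rows = [
    {"vintage": 2022, "NAME": "Alpha town, Colorado", "B17001_001E": 2500, "B17001_002E": 10},
    {"vintage": 2022, "NAME": "Beta CDP, Colorado", "B17001_001E": 4000, "B17001_002E": 12},
    {"vintage": 2022, "NAME": "Gamma city, Colorado", "B17001_001E": 800, "B17001_002E": 2},
    {"vintage": 2023, "NAME": "Alpha town, Colorado", "B17001_001E": 2600, "B17001_002E": 40},
    {"vintage": 2023, "NAME": "Delta town, Colorado", "B17001_001E": 1500, "B17001_002E": 9},
]


def keep(row):
    """Apply the three conditions, assuming they are combined with AND."""
    total = row["B17001_001E"]
    below = row["B17001_002E"]
    # The size check runs first, so the division never sees a zero total.
    return total > 1_000 and below / total < 0.01 and "CDP" not in row["NAME"]


# Group the names that pass the filter by year, mirroring strat_by="vintage".
by_vintage = defaultdict(list)
for row in rows:
    if keep(row):
        by_vintage[row["vintage"]].append(row["NAME"])

print(dict(by_vintage))
```

In this toy data, the hypothetical "Alpha town" passes in 2022 but not in 2023, which is the kind of year-to-year change that stratifying by vintage makes visible.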
Tip - Condition Syntax
The where parameter supports:
- Simple comparisons: "AGEP > 17"
- Division expressions: "B17001_002E / B17001_001E < 0.01"
- String containment: "'CDP' not in NAME"
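Each of the three condition forms listed above happens to be a valid Python boolean expression over the row's variables. The sketch below checks this by evaluating the strings against a hypothetical record with `eval`. It is only a way to reason about what a condition means. How cendat actually parses `where` strings is not shown in this tutorial, so treat the evaluation mechanism here as an assumption.

```python
# Condition strings copied from the tip above.
conditions = [
    "AGEP > 17",
    "B17001_002E / B17001_001E < 0.01",
    "'CDP' not in NAME",
]

# Hypothetical record with invented values. AGEP is mixed in with the
# B17001 variables only so that every condition can be demonstrated.
row = {
    "AGEP": 34,
    "B17001_001E": 2500,
    "B17001_002E": 10,
    "NAME": "Alpha town, Colorado",
}

# Evaluate each condition with the row as the only available names.
# eval is used purely for illustration and only on trusted strings.
results = {c: eval(c, {"__builtins__": {}}, row) for c in conditions}

for condition, outcome in results.items():
    print(f"{condition!r}: {outcome}")
```

Note the nested quoting in the containment condition: the outer double quotes delimit the condition string, and the inner single quotes mark `'CDP'` as a string literal rather than a variable name.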