Base run

A simple run of xagg, aggregating gridded temperature data over US counties. For a deeper dive into xagg’s functionality, see the Detailed Code Run.

[1]:
import xagg as xa
import xarray as xr
import numpy as np
import geopandas as gpd

Import

The sample data in this example are:

- gridded: month-of-year average temperature projections for the end of the century from a climate model (CCSM4)
- shapefiles: US counties

[2]:
# Load some climate data as an xarray dataset
ds = xr.open_dataset('../../data/climate_data/tas_Amon_CCSM4_rcp85_monthavg_20700101-20991231.nc')
[3]:
# Load US counties shapefile as a geopandas GeoDataFrame
gdf = gpd.read_file('../../data/geo_data/UScounties.shp')

Aggregate

Now, aggregate the gridded variable in ds onto the polygons in gdf.

[4]:
# Calculate overlaps
weightmap = xa.pixel_overlaps(ds,gdf)
creating polygons for each pixel...
calculating overlaps between pixels and output polygons...
success!
[5]:
# Aggregate
aggregated = xa.aggregate(ds,weightmap)
adjusting grid... (this may happen because only a subset of pixels were used for aggregation for efficiency - i.e. [subset_bbox=True] in xa.pixel_overlaps())
grid adjustment successful
aggregating tas...
all variables aggregated to polygons!
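
Conceptually, the two steps above amount to an area-weighted average: each pixel contributes to a polygon in proportion to how much of the polygon it covers. The sketch below illustrates that idea for a single polygon using plain numpy. It is not xagg's implementation, and the pixel values and overlap areas are hypothetical numbers chosen only for illustration.

```python
import numpy as np

# Hypothetical temperatures (K) of three pixels that touch one polygon
pixel_values = np.array([280.0, 282.0, 286.0])

# Hypothetical overlap areas between each pixel and the polygon
# (any consistent unit works, since only the relative sizes matter)
overlap_areas = np.array([1.0, 2.0, 1.0])

# Normalize the overlaps so the weights sum to 1
weights = overlap_areas / overlap_areas.sum()

# Area-weighted mean for the polygon
polygon_mean = np.sum(weights * pixel_values)
print(polygon_mean)  # 282.5
```

The middle pixel covers half of the overlapping area, so it pulls the result toward its own value more than the other two pixels do.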

Convert

Finally, convert the aggregated data into the format you need, such as an xarray Dataset or a pandas DataFrame.

[6]:
# Example as an xarray dataset
ds_out = aggregated.to_dataset()
ds_out
[6]:
<xarray.Dataset>
Dimensions:     (month: 12, pix_idx: 3141)
Coordinates:
  * pix_idx     (pix_idx) int64 0 1 2 3 4 5 6 ... 3135 3136 3137 3138 3139 3140
  * month       (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
Data variables:
    NAME        (pix_idx) object 'Lake of the Woods' 'Ferry' ... 'Broomfield'
    STATE_NAME  (pix_idx) object 'Minnesota' 'Washington' ... 'Colorado'
    STATE_FIPS  (pix_idx) object '27' '53' '53' '53' ... '02' '02' '02' '08'
    CNTY_FIPS   (pix_idx) object '077' '019' '065' '047' ... '240' '068' '014'
    FIPS        (pix_idx) object '27077' '53019' '53065' ... '02068' '08014'
    tas         (pix_idx, month) float64 263.9 268.8 274.0 ... 283.5 276.4 270.4
[7]:
# Example as a pandas dataframe
df_out = aggregated.to_dataframe()
df_out
[7]:
NAME STATE_NAME STATE_FIPS CNTY_FIPS FIPS tas0 tas1 tas2 tas3 tas4 tas5 tas6 tas7 tas8 tas9 tas10 tas11
0 Lake of the Woods Minnesota 27 077 27077 263.918943 268.834073 273.977533 283.141960 290.623952 297.858885 302.068017 300.362248 293.471128 283.798660 275.109100 266.016176
1 Ferry Washington 53 019 53019 271.794169 275.631364 276.947080 279.837102 286.630023 293.769471 299.073178 297.151514 289.866690 281.648927 276.727886 272.256934
2 Stevens Washington 53 065 53065 272.113155 275.910279 277.355354 280.428965 287.247099 294.356788 299.847098 297.967740 290.637124 282.076344 277.019222 272.516056
3 Okanogan Washington 53 047 53047 271.772021 275.539162 276.654805 279.317270 285.794503 292.650947 297.741617 295.915714 289.090624 281.372544 276.598377 272.208944
4 Pend Oreille Washington 53 051 53051 271.721285 275.542011 276.993355 280.157156 287.086018 294.169635 299.503768 297.523382 290.086946 281.657134 276.644670 272.095152
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
3136 Skagway-Hoonah-Angoon Alaska 02 232 02232 270.709185 272.455135 273.717142 276.188285 281.253285 286.791100 288.361128 287.822862 284.093411 278.681980 274.221760 271.175471
3137 Yukon-Koyukuk Alaska 02 290 02290 263.970656 263.404975 266.670047 272.394716 280.492861 288.813169 288.513645 285.724033 280.243361 273.044271 266.155923 265.022613
3138 Southeast Fairbanks Alaska 02 240 02240 262.846312 263.000185 265.438037 270.754788 278.476096 286.669566 287.315147 284.920161 279.230840 271.713061 264.946526 263.297936
3139 Denali Alaska 02 068 02068 265.084342 264.547936 267.203954 271.782649 278.898267 287.059920 287.375217 285.069283 279.833609 272.514117 266.145088 265.682660
3140 Broomfield Colorado 08 014 08014 270.803864 273.430206 275.955505 280.790070 287.303619 292.830048 297.615662 297.646820 292.368988 283.544708 276.383606 270.444855

3141 rows × 17 columns
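
The dataframe above is in wide form, with one tas column per month. For plotting or merging it is often easier to work in long form, with one row per county and month. The sketch below shows one way to reshape it with pandas, using a small excerpt (two rows, three of the twelve tas columns) copied from the output above so that it runs on its own. The names df_excerpt and df_long are illustrative, and the mapping of tas0 to month 1 is an assumption inferred from the matching values in the xarray output.

```python
import pandas as pd

# Excerpt of the output above: two counties, first three tas columns
df_excerpt = pd.DataFrame({
    "NAME": ["Ferry", "Broomfield"],
    "FIPS": ["53019", "08014"],
    "tas0": [271.794169, 270.803864],
    "tas1": [275.631364, 273.430206],
    "tas2": [276.947080, 275.955505],
})

# Wide -> long: one row per (county, tas column)
df_long = df_excerpt.melt(
    id_vars=["NAME", "FIPS"], var_name="column", value_name="tas"
)

# Assumption: column tasN holds month N + 1 (tas0 is January)
df_long["month"] = (
    df_long["column"].str.replace("tas", "", regex=False).astype(int) + 1
)

# Drop the helper column and order rows by county, then month
df_long = (
    df_long.drop(columns="column")
    .sort_values(["FIPS", "month"])
    .reset_index(drop=True)
)
print(df_long)
```

With the full output, the same steps would apply to df_out with all twelve tas columns, keeping whichever identifier columns you need in id_vars.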
