Here we show an example of how to use the functions in the deepSSF package, showing data loading, model training and testing and generating simulations.
To install and use the package, visit the GitHub repository here: https://github.com/swforrest/deepSSF_package, which has instructions in the README. This script will not download with the package, but it can be accessed from the deepSSF_package repo, under the ‘examples’ folder (as a Jupyter notebook or a html file).
We use data that comes with the package in the datasets folder, which is the GPS tracking data of a single water buffalo (~10,000 locations), and two spatial layers, NDVI (the average from 2018-2019, which covers the temporal extent of the data), and slope.
Code
import mathfrom pathlib import Pathimport matplotlib.pyplot as pltimport numpy as npimport torchimport rasterio# Use this to load the functions used in this notebook# You can also just use import deepssf, and then call the functions with deepssf.function_nameimport deepssffrom deepssf import ( ConvJointModel, EarlyStopping, ModelParams, create_gif, fit, get_device, load_environmental_layers, make_dataloaders, make_optimisers, negativeLogLikeLoss, filter_steps_by_window, prepare_movement_df, simulate_trajectory, validate_next_step_probs,)import pandas as pdfrom datetime import datetime # Date/time utilitiesfrom IPython.display import HTML, display # For plotting GIFsfrom rasterio.plot import show # For plotting raster layersprint(f"deepssf {deepssf.__version__}")print(f"torch {torch.__version__}")
deepssf 0.1.2
torch 2.12.0
Setup the script
Set random seed and the device for training the model
If you have a GPU, this is where it will be specified (automatically).
Code
# Reproducibility — fix random seeds so results are the same on every run.torch.manual_seed(42)np.random.seed(42)# DEVICE is capitalised because it is a module-level constant (set once, never reassigned).# get_device() returns 'cuda' if a GPU is available, 'mps' on Apple Silicon,# or 'cpu' otherwise. All tensors and the model are moved to this device.DEVICE = get_device()print(f"Device: {DEVICE}")
Device: mps
Set script paths
If you have cloned/forked/downloaded the GitHub repo (https://github.com/swforrest/deepSSF_package) you shouldn’t need to change these, but for your own analyses you will want to set these to point to where your data is and where you want outputs saved.
All-caps names below are module-level constants — set once here and referenced throughout the notebook without being reassigned. Python has no built-in constant keyword, so capitalisation is the community convention (PEP 8) to signal “do not change this value”.
Code
DATA_DIR = Path("../src/deepssf/datasets/data") # location of the bundled sample dataCSV_PATH = DATA_DIR /"buffalo_djelk_id2005.csv"# GPS tracking data for individual 2005LAYER_PATHS = {"ndvi": str(DATA_DIR /"ndvi_2005.tif"), # Normalised Difference Vegetation Index"slope": str(DATA_DIR /"slope_2005.tif"), # terrain slope derived from a DEM}OUTPUT_DIR = Path("outputs") # figures and model checkpoints go hereSNAPSHOT_DIR = OUTPUT_DIR /"training_snapshots"# per-batch prediction-map snapshotsOUTPUT_DIR.mkdir(exist_ok=True)
Import data
The csv file is imported with a timestamp and an x and y coordinate. In our case the data was already in a projected format that is specific to Australia (EPSG:3112), which is the same crs as the raster data which will be imported.
Code
raw_df = pd.read_csv(CSV_PATH)print(f"Raw GPS fixes : {len(raw_df):,}")print(f"Individuals : {raw_df['id'].nunique()}")# Sort by the individual ID and timestamp to ensure correct temporal orderraw_df = raw_df.sort_values(by=["id", "time"]).reset_index(drop=True)# # Convert the 'time' column to the correct timezone (here Australia/Darwin) raw_df["time"] = ( pd.to_datetime(raw_df["time"], utc=True) .dt.tz_convert("Australia/Darwin") )raw_df.head()
raster_transform — Affine object that converts pixel (row, col) indices to the projected coordinates (easting, northing) of the raster(s), which should be in the same CRS as the GPS data. This transform is used whenever we need to translate between pixel space and real-world coordinates.
Converts a fixes dataframe (one row per GPS location) into a step dataframe (one row per movement between consecutive fixes).
For each step it computes:
dx, dy — displacement in projected CRS units (metres)
bearing — direction of travel (radians; 0 = north, clockwise positive)
bearing_tm1 — bearing of the previous step, used as a model input so the network can capture directional persistence by estimating a turning angle
dt_hour — time elapsed between consecutive fixes (hours)
hour_t1, yday_t1 — decimal hour-of-day and day-of-year at the step start location
*_sin1 / *_cos1 — sine/cosine encodings of hour and yday (see SCALAR_COLS below)
The trailing underscore on column names (x1_, y1_, t1_, etc.) flags them as “start of step” variables; x2_, y2_, t2_ are the corresponding “end of step” values, following the convention of the amt package (Signer et al. 2019). The last fix for each individual is dropped because no forward step can be formed.
Code
step_df = prepare_movement_df(raw_df)print(f"Movement steps : {len(step_df):,} ({len(raw_df) -len(step_df)} dropped — last fix per individual)")step_df.head()
Movement steps : 10,296 (1 dropped — last fix per individual)
id
t1_
x1_
y1_
t2_
x2_
y2_
dx
dy
bearing
bearing_tm1
dt_hour
hour_t1
yday_t1
hour_t1_sin1
hour_t1_cos1
yday_t1_sin1
yday_t1_cos1
0
2005
2018-07-25 00:04:02
41941.331695
-1.435875e+06
2018-07-25 01:04:23
41969.310875
-1.435671e+06
27.979180
204.064135
1.434536
0.000000
1.005833
0.066667
206.0
0.017452
0.999848
-0.391358
-0.920239
1
2005
2018-07-25 01:04:23
41969.310875
-1.435671e+06
2018-07-25 02:04:39
41921.521939
-1.435654e+06
-47.788936
16.857110
2.802478
1.434536
1.004444
1.066667
206.0
0.275637
0.961262
-0.391358
-0.920239
2
2005
2018-07-25 02:04:39
41921.521939
-1.435654e+06
2018-07-25 03:04:17
41779.439594
-1.435601e+06
-142.082345
53.568427
2.781049
2.802478
0.993889
2.066667
206.0
0.515038
0.857167
-0.391358
-0.920239
3
2005
2018-07-25 03:04:17
41779.439594
-1.435601e+06
2018-07-25 04:04:39
41841.203272
-1.435635e+06
61.763677
-34.322938
-0.507220
2.781049
1.006111
3.066667
206.0
0.719340
0.694658
-0.391358
-0.920239
4
2005
2018-07-25 04:04:39
41841.203272
-1.435635e+06
2018-07-25 05:04:27
41655.463332
-1.435604e+06
-185.739939
31.003534
2.976198
-0.507220
0.996667
4.066667
206.0
0.874620
0.484810
-0.391358
-0.920239
SCALAR_COLS
These are non-spatial inputs. For processing with convolutional layers, they are converted into spatial windows (rasters/arrays), and stacked with the truly spatial covariates.
The model receives a fixed-size spatial window centred on each step’s start location. To let the network express time-varying habitat preferences (e.g. different nocturnal vs diurnal use), scalar covariates are tiled into additional channels of the same H × W size, so every pixel “knows” the current time context when habitat suitability is estimated.
hour_t1_sin1 / hour_t1_cos1
Sine and cosine of (2π × hour / 24), encoding time-of-day as a circular variable. The circular encoding means 23:00 and 01:00 are treated as adjacent — a plain hour value would have an artificial discontinuity at midnight.
yday_t1_sin1 / yday_t1_cos1
Same circular encoding for day-of-year (1–365), capturing seasonal variation in habitat use (e.g. dry-season vs wet-season movements).
dt_hour
Elapsed time between consecutive fixes (hours). Because GPS collars do not always fix at a perfectly regular rate, including dt_hour lets the movement kernel rescale expected displacement proportionally to the actual time interval rather than assuming a fixed fix rate. Note: dt_hour must be supplied at simulation time as well, so that simulate_trajectory() produces inputs consistent with training.
The size of the local availability window in cells/pixels; 101 pixels × 25 m = 2525 m from the centre location to the edge
Code
WINDOW_SIZE =101
PIXEL_SIZE
Raster resolution in metres — must match the loaded GeoTIFFs
Code
PIXEL_SIZE =25
BATCH_SIZE
Number of training samples (movement steps) per training mini-batch
Code
BATCH_SIZE =32
OUTPUT_CHANNELS
Number of convolutional filters (and therefore feature maps) for each convolution layer. More convolutional filters can provide more flexible functions, but will increase the number of parameters and slow down training times.
Code
OUTPUT_CHANNELS =4
Pre-calculate the flattened CNN output size
This value will be the input to the fully-connected layers in the movement subnetwork, and is calculated here so that ModelParams can be set without hard-coding it.
The formula mirrors the layers inside ConvJointModel:
conv (padding=1) keeps the spatial dimension unchanged at each layer
maxpool k=2 - the number of max pooling layers (set in the model code and isn’t currently editable in this script)
s=2 halves the spatial dimension (floor division) at each layer
Code
dim = WINDOW_SIZEfor _ inrange(3): dim = math.floor((dim +2*1-3) /1+1) # conv (pad=1 preserves dim) dim = math.floor((dim -2) /2+1) # maxpool k=2, s=2DENSE_DIM = OUTPUT_CHANNELS * dim * dimprint(f"After 3× conv+maxpool: dim={dim} → dense_dim_in_all={DENSE_DIM}")
After 3× conv+maxpool: dim=12 → dense_dim_in_all=576
filter_steps_by_window()
Removes steps whose straight-line displacement exceeds half the spatial window (WINDOW_SIZE × PIXEL_SIZE / 2 metres). Such steps would place the true next location outside the predicted probability map, causing an index error in the loss function.
Code
step_df = filter_steps_by_window(step_df, window_size=WINDOW_SIZE, pixel_size=PIXEL_SIZE)print(f"Steps after window filter: {len(step_df):,}")
Steps after window filter: 10,101
make_dataloaders()
Splits step_df chronologically into train / val / test subsets (here 80 / 10 / 10 %) and returns a PyTorch DataLoader for each.
Layer 'ndvi': min=-0.06614641100168228, max=0.731696367263794 → scaled to [0, 1]
Layer 'slope': min=0.00013558653881773353, max=12.298083305358887 → scaled to [0, 1]
/Users/scottforrest/Library/Mobile Documents/com~apple~CloudDocs/github_repos/deepSSF_package/src/deepssf/data.py:366: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /Users/runner/work/pytorch/pytorch/torch/csrc/utils/tensor_numpy.cpp:219.)
self.scalar_to_grid_data = torch.from_numpy(
Train batches : 253
Val batches : 32
Test batches : 32
Inspect one batch to verify tensor shapes match ModelParams expectations
x1 has one channel per spatial layer (ndvi + slope = 2). The scalar covariates in x2 are broadcast to full H × W maps inside ConvJointModel and concatenated with x1 there, giving the CNN 2 + len(SCALAR_COLS) = 7 input channels total.
The px2 and py2 are the target that the model is trying to predict, and are the pixel values of the next step (with the centre cell being (50, 50))
next(iter(dl_train)) will call the next batch, which are randomly shuffled, so calling the function again will provide different training samples.
A thin dict-like container (it is a wrapper around something else rather than storing data) that groups all architecture hyper-parameters so they can be passed to ConvJointModel and logged together.
This is where the parameters that are passed to the model are set.
Code
params = ModelParams({"batch_size": BATCH_SIZE,"image_dim": WINDOW_SIZE, # spatial window side length (pixels)"pixel_size": PIXEL_SIZE, # metres per pixel"dim_in_nonspatial_to_grid": len(SCALAR_COLS), # scalar channels tiled into the window"dense_dim_in_nonspatial": len(SCALAR_COLS), # scalar inputs to the dense MLP branch"dense_dim_hidden": 64, # hidden units in the scalar MLP"dense_dim_in_all": DENSE_DIM, # flattened CNN output size (computed above)"input_channels": 2+len(SCALAR_COLS), # 2 env layers + scalar-to-grid channels"output_channels": OUTPUT_CHANNELS, # feature maps out of each conv layer"kernel_size": 3, # conv kernel spatial size"stride": 1,"kernel_size_mp": 2, # max-pool kernel size"stride_mp": 2,"padding": 1, # same-padding to preserve spatial dims through conv"num_movement_params": 12, # parameters of the movement kernel"dropout": 0.0,"device": DEVICE,})
ConvJointModel
The deepSSF model! A joint habitat–movement model with two heads: 1. Habitat head — a small CNN applied to the spatial window (env layers + tiled scalars) that outputs a per-pixel log-suitability map. 2. Movement head — a dense MLP (fully-connected layer) that takes the scalar covariates and the previous bearing, and produces parameters for a parametric movement kernel (a mixture of wrapped distributions centred on the current location).
The two log-probability surfaces are summed pixel-wise to give the joint predicted probability map for the animal’s next step.
Code
model = ConvJointModel(params).to(DEVICE)n_params =sum(p.numel() for p in model.parameters())print(f"Model parameters : {n_params:,}")
Model parameters : 43,009
negativeLogLikeLoss
Scores each predicted probability map against the true next-step pixel index; minimising it pushes probability mass toward observed locations.
Code
loss_fn = negativeLogLikeLoss(reduction="mean")
make_optimisers()
Creates two Adam optimisers with separate learning rates:
lr_habitat
controls the CNN (habitat) head; typically higher
lr_movement
controls the movement-kernel MLP; kept lower for stability
It also returns a ReduceLROnPlateau scheduler for each optimiser that reduces the learning rate by a factor of scheduler_factor (default = 0.1) if validation loss does not improve for scheduler_patience epochs (default = 5).
Monitors validation loss and saves the best model weights to disk whenever a new minimum is reached. Training stops automatically if the loss does not improve for patience consecutive epochs, preventing overfitting.
the name of the deepSSF model specified using ConvJointModel above
image_trim_pixels
how many outer pixel rings to trim from the output images before saving to prevent artifacts with the colour scaling due to the image padding being pulled into the image during the convolution layers.
dl_train
train dataloader (contains the training data)
dl_val
validation dataloader (contains the validation data used to assess the learning rate and early stopping)
loss_fn
loss function
optimisers
optimisers used to update the model weights
schedulers
learning-rate schedulers (used to update the learning rates)
early_stopping
stores the validation loss and stops the model training if it doesn’t improve for a given number of epochs (set using patience in the EarlyStopping definition)
snapshot_dir
where to save images of the training progress
snapshot_item
which sample (movement step) from the validation dataset to run the model on to generate the movement, habitat selection and next-step probability surfaces
Outputs:
history dict with keys train_losses and val_losses (one scalar value per completed epoch)
Every epoch, a plot of the training and validation loss, and the predicted probability maps (movement, habitat selection and next-step) are written to SNAPSHOT_DIR, which will be turned into a GIF in the next cell.
Convert the NumPy rasters loaded earlier into PyTorch tensors.
Both validate_next_step_probs() and simulate_trajectory() call get_landscape(month_index) at each step to fetch the current environmental layers as a list of 2-D tensors.
With static (non-seasonal) rasters the month_index argument is unused. For seasonal data (e.g. monthly NDVI composites) you would index into a stack of layers here based on the current month.
Code
ndvi_t = torch.from_numpy(env_layers["ndvi"].astype(np.float32))slope_t = torch.from_numpy(env_layers["slope"].astype(np.float32))def get_landscape(_month_index):# Returns layers in the same order as LAYER_PATHS.return [ndvi_t, slope_t]# Use the last 10 % of steps as the test set (matches the DataLoader split).n_test =int(len(step_df) *0.1)test_sample = step_df.iloc[-n_test:].reset_index(drop=True)print(f"Testing on {len(test_sample):,} steps")
Testing on 1,010 steps
validate_next_step_probs()
Runs the model forward on each test step and records the probability mass assigned to the true next location.
Inputs:
model
test_sample
test dataframe that contains movement steps (from the data loaded above)
get_landscape
retrieves the landscape layers (spatial covariates) that were used to train the model
transform
raster Affine transform
window_size
window size of the local availability window (should be the same as for training)
scalar_cols
scalar column names (should be set from above)
yday_col
which column has the yday column to ensure the correct spatial layers (if dynamic through time) are indexed
month_index_fn
a function mapping day-of-year → month index (used by the get_landscape function). Here we are using static spatial layers so we can set it to 0.
Outputs:
test_sample dataframe with three new columns appended:
habitat_prob — probability at the true location from the habitat head alone
move_prob — probability at the true location from the movement head alone
next_step_prob — joint (habitat × movement) probability at the true location
Higher values mean the model performed better (there was higher predicted probability at the observed next-step).
# Use the last 10 % of steps as the test set (matches the DataLoader split)n_test =int(len(step_df) *0.1)test_sample = step_df.iloc[-n_test:].reset_index(drop=True)print(f"Validating on {len(test_sample):,} steps")val_results = validate_next_step_probs( model, test_sample, get_landscape=get_landscape, transform=raster_transform, window_size=WINDOW_SIZE, scalar_cols=tuple(SCALAR_COLS), yday_col="yday_t1", bearing_col="bearing_tm1", month_index_fn=lambda _yday: 0, # static layers — month doesn't matter)print(val_results[["habitat_prob", "move_prob", "next_step_prob"]].describe())
Validating on 1,010 steps
habitat_prob move_prob next_step_prob
count 1010.000000 1010.000000 1010.000000
mean 0.000193 0.026397 0.035083
std 0.000179 0.051699 0.072018
min 0.000000 0.000000 0.000000
25% 0.000092 0.000240 0.000247
50% 0.000123 0.006742 0.007731
75% 0.000189 0.029893 0.036953
max 0.000675 0.254268 0.552189
Generates a sequence of steps by iteratively: 1. Extracting a spatial window centred on the current location. 2. Running the model to obtain a joint log-probability surface over that window. 3. Sampling the next pixel from that surface (softmax-normalised probabilities). 4. Converting the sampled pixel back to projected coordinates and advancing time.
Inputs:
model
the trained model
get_landscape
retrieves the landscape layers (spatial covariates) that were used to train the model
transform
raster Affine transform
start_x
x coordinate of starting location
start_y
y coordinate of starting location
n_steps
number of steps
starting_yday
starting day-of-year
starting_hour
starting hour-of-day
time_between_steps
time between steps in hours
window_size
size in pixels of the local landscape
month_index_fn
a function mapping day-of-year → month index (used by the get_landscape function). Here we are using static spatial layers so we can set it to 0.
Outputs: DataFrame with one row per simulated step containing:
x, y
— projected coordinates of the simulated location
hour, yday
— time variables updated at each step
month_index
the month index used to index the spatial variables (will be 0 if using static spatial covariates)
Signer, Johannes, John Fieberg, and Tal Avgar. 2019. “Animal Movement Tools (Amt): R Package for Managing Tracking Data and Conducting Habitat Selection Analyses.”Ecology and Evolution 9 (2): 880–90. https://doi.org/10.1002/ece3.4823.