GeoCS package#

Submodules#

GeoCS.calc_lib module#

Created on Thu Mar 28 10:08:54 2024

@author: schoelleh96

GeoCS.calc_lib.calc_bounds(x, y, z, timesteps, convex, alpha=None)#

Calculate Boundary.

If convex is True, find the convex hull. Otherwise, calculate the alpha shape for the given alpha, or estimate an optimal alpha if none is given.

Parameters:
  • x (np.ndarray) – coordinate.

  • y (np.ndarray) – coordinate.

  • z (np.ndarray) – coordinate.

  • timesteps (np.ndarray) – timesteps belonging to axis 1 of the coords.

  • convex (bool) – whether to find the convex or the concave bounding hull.

  • alpha (Optional[float], optional) – alpha parameter. The default is None.

Returns:

  1. bounds: Mapping from timesteps to boundary flags (True if inside or on boundary).

  2. hulls: Mapping from timesteps to the Trimesh object representing the hull.

Return type:

Tuple[dict]
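
As a rough illustration of the two per-timestep mappings described above, the following sketch computes a 2D convex hull for every timestep in plain Python. It is not the GeoCS implementation: the documented function works on x, y and z and returns Trimesh objects, whereas `convex_hull_2d` and `hull_flags_per_timestep` are hypothetical helpers, and flagging hull vertices as True is only one possible reading of the boundary flags.

```python
def cross(o, a, b):
    """Z-component of the cross product of the vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_2d(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_flags_per_timestep(x, y, timesteps):
    """x[i][t] and y[i][t] hold the position of trajectory i at timestep index t."""
    flags, hulls = {}, {}
    for t_idx, t in enumerate(timesteps):
        pts = [(x[i][t_idx], y[i][t_idx]) for i in range(len(x))]
        hulls[t] = convex_hull_2d(pts)
        # Assumption of this sketch: a point is flagged if it is a hull vertex.
        flags[t] = [p in hulls[t] for p in pts]
    return flags, hulls


# Five trajectories, two timesteps: a square with one interior point.
x = [[0.0, 0.0], [1.0, 2.0], [1.0, 2.0], [0.0, 0.0], [0.5, 1.0]]
y = [[0.0, 0.0], [0.0, 0.0], [1.0, 2.0], [1.0, 2.0], [0.5, 1.0]]
flags, hulls = hull_flags_per_timestep(x, y, timesteps=["t0", "t1"])
```

Both returned dictionaries share the timesteps as keys, which mirrors the documented return value.
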

GeoCS.calc_lib.calc_diff_map(eps, is_bound, N_v, n_traj, dates, dist_mats=None, calc_dist=None)#

Calculate diffusion maps.

Diffusion maps are the eigenvectors of the averaged diffusion transition matrix. They are returned along with its eigenvalues.

Parameters:
  • eps (float) – diffusion bandwidth.

  • is_bound (Dict[datetime, np.typing.NDArray[bool]]) – indicates boundary points at each timestep.

  • N_v (int) – how many eigenvalues and -vectors to compute.

  • n_traj (int) – number of trajectories.

  • dates (np.typing.NDArray[datetime]) – The timesteps of the trajectories.

  • dist_mats (Optional[Dict[datetime, sps.csr_matrix]], optional) – Dictionary mapping dates to distance matrices. The default is None.

  • calc_dist (Optional[Callable[[datetime], sps.csr_matrix]], optional) – Function handle to a function returning distance matrices given a date. The default is None.

Return type:

Tuple[ndarray[Any, dtype[float]], ndarray[Any, dtype[float]]]

Returns:

  • vals (np.typing.NDArray[float]) – Eigenvalues.

  • vecs (np.typing.NDArray[float]) – Eigenvectors (the diffusion maps).
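
To make the phrase "averaged diffusion transition matrix" more concrete, here is a minimal plain-Python sketch. It assumes a Gaussian kernel exp(-d²/eps) and dense, symmetric distance matrices; the kernel form, the helper names and the omission of the boundary handling (is_bound) are assumptions of this sketch, not statements about GeoCS. The eigendecomposition step is left out.

```python
import math


def transition_matrix(dist, eps):
    """Row-normalised Gaussian kernel built from one dense distance matrix."""
    kernel = [[math.exp(-d * d / eps) for d in row] for row in dist]
    return [[value / sum(row) for value in row] for row in kernel]


def average_matrices(mats):
    """Element-wise mean of equally shaped square matrices."""
    n = len(mats[0])
    return [
        [sum(m[i][j] for m in mats) / len(mats) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical distances between three trajectories at two timesteps.
dist_t0 = [[0.0, 1.0, 4.0], [1.0, 0.0, 3.0], [4.0, 3.0, 0.0]]
dist_t1 = [[0.0, 2.0, 4.0], [2.0, 0.0, 2.0], [4.0, 2.0, 0.0]]

P = average_matrices([transition_matrix(d, eps=4.0) for d in (dist_t0, dist_t1)])
```

Each row of `P` sums to one, so the constant vector is an eigenvector with eigenvalue 1. The remaining leading eigenvectors are the ones that carry information about coherent sets.
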

GeoCS.calc_lib.calc_dist(lon, lat, z, r, k)#

Calculate pointwise distances given positions on earth.

Parameters:
  • lon (ndarray) – longitudes.

  • lat (ndarray) – latitudes.

  • z (ndarray) – vertical coordinate.

  • r (float) – cut-off radius in km.

  • k (float) – scaling parameter bringing vertical coordinate to horizontal coordinate value range.

Returns:

Lower triangle of the point-wise distance matrix.

Return type:

scipy.sparse.csr_matrix
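
The roles of the cut-off radius r and the scaling parameter k can be sketched as follows. The metric used here (haversine distance combined with the scaled vertical separation through a Euclidean norm) is an assumption for illustration only, as are the helper names and the Earth radius constant. The sketch stores the lower triangle in a dictionary instead of a scipy.sparse.csr_matrix.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumption of this sketch


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sparse_lower_triangle(lon, lat, z, r, k):
    """Return {(i, j): distance} for all pairs with j < i and distance <= r."""
    entries = {}
    for i in range(len(lon)):
        for j in range(i):
            horizontal = haversine_km(lon[i], lat[i], lon[j], lat[j])
            vertical = k * (z[i] - z[j])  # k maps the vertical coordinate to km
            dist = math.hypot(horizontal, vertical)
            if dist <= r:
                entries[(i, j)] = dist
    return entries


pairs = sparse_lower_triangle(
    lon=[0.0, 1.0, 90.0],
    lat=[0.0, 0.0, 0.0],
    z=[500.0, 510.0, 500.0],
    r=500.0,
    k=10.0,
)
```

Only the pair (1, 0) survives the cut-off here. The third point is roughly 10,000 km away from the other two and is therefore dropped, which is what keeps the matrix sparse.
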

GeoCS.calc_lib.calc_k(u, v, w)#

Calculate the velocity-based scaling parameter.

Parameters:
  • u (ndarray) – Horizontal velocity in the x direction.

  • v (ndarray) – Horizontal velocity in the y direction.

  • w (ndarray) – Vertical velocity.

Returns:

Scaling parameter.

Return type:

float
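
The formula behind the scaling parameter is not given here. One plausible reading, shown purely as an assumption, is the ratio of the mean horizontal speed to the mean absolute vertical speed, with u and v in m/s and w in Pa/s so that the result is in km/hPa. The function name `velocity_scaling` is hypothetical.

```python
import math
from statistics import fmean


def velocity_scaling(u, v, w):
    """Mean horizontal speed divided by mean absolute vertical speed, in km/hPa."""
    horizontal_km_s = fmean(math.hypot(ui, vi) for ui, vi in zip(u, v)) * 1e-3  # m/s -> km/s
    vertical_hpa_s = fmean(abs(wi) for wi in w) * 1e-2  # Pa/s -> hPa/s
    return horizontal_km_s / vertical_hpa_s


# Mean horizontal speed 7.5 m/s, mean absolute vertical speed 0.75 Pa/s.
k = velocity_scaling(u=[3.0, 6.0], v=[4.0, 8.0], w=[0.5, -1.0])
```

With these inputs one hPa of vertical separation is weighted like about one km of horizontal separation.
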

GeoCS.calc_lib.kmeans(E_vecs, N_k)#

Cluster using kmeans algorithm from scikit-learn.

Parameters:
  • E_vecs (np.typing.NDArray[float]) – Coordinates to cluster (eigenvectors).

  • N_k (int) – Number of clusters.

Returns:

kmeans object.

Return type:

KMeans
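
For readers unfamiliar with k-means, the sketch below shows the underlying Lloyd iteration in plain Python. It is a stand-in, not the scikit-learn KMeans estimator that this function is documented to use: the initial centres are passed in explicitly, and `lloyd_kmeans` and `squared_distance` are hypothetical names.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equally long tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_kmeans(points, centers, n_iter=50):
    """Alternate assignment and centre update until the centres stop moving."""
    centers = list(centers)
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centre.
        labels = [
            min(range(len(centers)), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centers = []
        for c, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep the centre of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Rows play the role of E_vecs: one point per trajectory in eigenvector space.
e_vecs = [(0.9, 0.1), (1.0, 0.0), (-1.0, 0.2), (-0.9, 0.0)]
labels, centers = lloyd_kmeans(e_vecs, centers=[(1.0, 0.0), (-1.0, 0.0)])
```

The resulting labels assign the first two trajectories to one cluster and the last two to the other, which is the kind of partition that N_k controls.
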

GeoCS.data_lib module#

Created on Wed Mar 27 11:10:26 2024

@author: schoelleh96

class GeoCS.data_lib.Bound(data_path, k, convex, alpha=None, start_date=None, traj_data=None)#

Bases: Data

Represents boundaries of point clouds belonging to trajectory data.

Uses either convex hulls or alpha shapes.

Parameters:
  • data_path (str) – The file path for storing and loading boundary data.

  • k (float) – The scaling parameter used in distance calculations.

  • convex (bool) – Specifies whether to use convex hulls (True) or alpha shapes (False).

  • alpha (Optional[float], default=None) – The alpha parameter for alpha shape calculation. If None and using alpha shapes, an optimal alpha will be estimated.

  • start_date (Optional[datetime], default=None) – The start date for the data analysis period.

  • traj_data (Optional[Traj], default=None) – An instance of the Traj class containing trajectory data to be analyzed.

_hulls#

Stores the hull (either convex or alpha shape) for each timestep.

Type:

Dict[datetime, Trimesh]

_is_bound#

Indicates whether points are within the boundary for each timestep.

Type:

Dict[datetime, np.ndarray]

_projection#

The cartopy coordinate reference system used for data projection.

Type:

Optional[cartopy.crs]

property alpha: float | None#
calc_bounds()#
Return type:

Tuple[Dict[datetime, ndarray], Dict[datetime, Trimesh]]

calc_or_load(convex, alpha)#
Return type:

Tuple[Dict[datetime, ndarray], Dict[datetime, Trimesh]]

property convex: bool#
property hulls: Dict[datetime, Trimesh]#
property is_bound: Dict[datetime, ndarray[Any, dtype[bool]]]#
property k: float#
load()#

Load data. Implementation required.

Return type:

None

plot(**kwargs)#

Plot default Boundary plot. Invokes plot_bound.

Parameters:

**kwargs – Additional keyword arguments.

Return type:

Tuple[Figure, Axes]

Returns:

  • fig (matplotlib figure)

  • ax (matplotlib ax)

plot_bound(**kwargs)#

Plot an interactive widget to demonstrate boundary detection.

Parameters:

**kwargs – Additional keyword arguments.

Return type:

Tuple[Figure, Axes]

Returns:

  • fig (matplotlib figure)

  • ax (matplotlib ax)

save()#

Save data. Implementation required.

Return type:

None

property traj_data: Traj | None#
property x: ndarray[Any, dtype[float]]#
property y: ndarray[Any, dtype[float]]#
class GeoCS.data_lib.Data(data_path, start_date)#

Bases: ABC

Abstract base class for all kinds of data in this package.

_data_path#

Path where the data is stored or to be stored.

Type:

str

_start_date#

The start date of the trajectories.

Type:

datetime

_n_traj#

Number of trajectories, initialized to None.

Type:

Optional[int]

_n_steps#

Number of time steps, initialized to None.

Type:

Optional[int]

_dt#

Time step size, initialized to None.

Type:

Optional[datetime]

property data_path: str#
property dt#
abstract load()#

Load data. Implementation required.

Return type:

None

property n_steps#
property n_traj#
abstract plot()#

Plot data. Implementation required.

Return type:

None

abstract save()#

Save data. Implementation required.

Return type:

None

property start_date: datetime#
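
The abstract interface above means that every concrete data class has to provide load, save and plot. The sketch below reproduces that pattern with a hypothetical stand-in class, `DataSketch`, so that it runs without GeoCS installed. It mirrors only the documented names, not the package's actual implementation.

```python
from abc import ABC, abstractmethod
from datetime import datetime


class DataSketch(ABC):
    """Hypothetical stand-in mirroring the documented Data interface."""

    def __init__(self, data_path, start_date):
        self._data_path = data_path
        self._start_date = start_date

    @property
    def data_path(self):
        return self._data_path

    @property
    def start_date(self):
        return self._start_date

    @abstractmethod
    def load(self):
        """Load data. Implementation required."""

    @abstractmethod
    def save(self):
        """Save data. Implementation required."""

    @abstractmethod
    def plot(self):
        """Plot data. Implementation required."""


class InMemoryData(DataSketch):
    """Minimal concrete subclass that keeps its values in memory."""

    def __init__(self, data_path, start_date):
        super().__init__(data_path, start_date)
        self.values = None

    def load(self):
        self.values = [1.0, 2.0, 3.0]  # placeholder instead of reading a file

    def save(self):
        pass  # nothing to write in this sketch

    def plot(self):
        return None  # plotting is omitted here


data = InMemoryData("some/path", datetime(2016, 5, 2))
data.load()
```

Instantiating `DataSketch` directly raises a TypeError because its abstract methods are not implemented. Only subclasses that define all three methods can be created.
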
class GeoCS.data_lib.DiffMap(data_path, eps, N_v=20, N_cs=6, start_date=None, bound_data=None, dist_data=None)#

Bases: Data

Diffusion maps for trajectory data to identify coherent sets.

Parameters:
  • data_path (str) – Path where the computed diffusion map results are stored or will be stored.

  • eps (float) – The epsilon parameter controlling the diffusion process scale.

  • N_v (Optional[int], default=20) – Number of eigenvectors (and corresponding eigenvalues) to compute.

  • N_cs (Optional[int], default=6) – Number of coherent sets to identify from the diffusion map.

  • start_date (Optional[datetime], default=None) – Starting date for analyzing the trajectory data.

  • bound_data (Optional[Bound], default=None) – An instance of the Bound class containing boundary data.

  • dist_data (Optional[Dist], default=None) – An instance of the Dist class containing distance data between points.

_E_vals#

The eigenvalues computed from the diffusion map.

Type:

np.ndarray or None

_E_vecs#

The eigenvectors computed from the diffusion map.

Type:

np.ndarray or None

calc_diff_map(eps)#
Parameters:

eps (float) –

Return type:

Tuple[ndarray[Any, dtype[float]], ndarray[Any, dtype[float]]]

calc_or_load(eps)#
Parameters:

eps (float) –

Return type:

Tuple[ndarray[Any, dtype[float]], ndarray[Any, dtype[float]]]

cluster_cs(N_cs)#
Parameters:

N_cs (int) –

Return type:

ndarray[Any, dtype[int]]

property file_path#
load()#

Load data. Implementation required.

Return type:

None

plot(**kwargs)#

Plot default coherent set plot. Invokes plot_cs.

Parameters:

**kwargs – Additional keyword arguments.

Return type:

Tuple[Figure, Axes]

Returns:

  • fig (matplotlib figure)

  • ax (matplotlib ax)

plot_cs(**kwargs)#

Plot interactive widget to analyze behaviour of coherent set detection.

Parameters:

**kwargs – Additional keyword arguments.

Return type:

Tuple[Figure, Axes]

Returns:

  • fig (matplotlib figure)

  • ax (matplotlib ax)

save()#

Save data. Implementation required.

Return type:

None

class GeoCS.data_lib.Dist(data_path, r, k, start_date=None, traj_data=None, save_pattern='%Y%m%d_%H%M.npz')#

Bases: Data

A class for handling pairwise distances of trajectories.

_data_path#

Path where the data is stored or to be stored.

Type:

str

_start_date#

The start date of the trajectories.

Type:

datetime

_r#

The cut-off radius for distance calculations.

Type:

float

_k#

Vertical scaling parameter.

Type:

float

_save_pattern#

The pattern used for saving distance matrix files, with datetime formatting symbols (e.g., “%Y%m%d_%H%M%S.npz”).

Type:

str

_mats#

A dictionary mapping each timestep to its corresponding sparse distance matrix triangle.

Type:

Dict[datetime, sps.csr_matrix]

_mat_paths#

The list of file paths for the matrices.

Type:

List[str]

_traj_data#

An optional Traj object from which pairwise distances can be calculated if not loading.

Type:

Optional[Traj]

calc_dist(date_key=None, timestep=None)#
Return type:

csr_matrix

calc_or_load(date_key)#
Parameters:

date_key (Optional[datetime]) –

Return type:

csr_matrix

property k: float#
load()#

Load all available distance matrices. Caution for large data.

Return type:

None

load_mat(full_path)#
Parameters:

full_path (str) –

Return type:

csr_matrix

property mat_paths: list#
property mats: dict#
plot(**kwargs)#

Plot default Distances plot. Invokes plot_dist_hist.

Parameters:

**kwargs – Additional keyword arguments.

Return type:

Tuple[Figure, Axes]

Returns:

  • fig (matplotlib figure)

  • ax (matplotlib ax)

plot_dist_hist(bin_count=100, **kwargs)#

Plot histogram of distances.

Parameters:
  • **kwargs – Additional keyword arguments.

  • bin_count (Optional[int]) – Number of histogram bins. The default is 100.

Return type:

Tuple[Figure, Axes]

Returns:

  • fig (matplotlib figure)

  • ax (matplotlib ax)

property r: float#
save()#

Save distance matrix for all dates. Caution for large data.

Return type:

None

save_mat(mat, matPath)#
Parameters:
  • mat (csr_matrix) –

  • matPath (str) –

Return type:

None
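
The save pattern consists of datetime formatting symbols, so a file name can be derived from a date and parsed back with the standard library. The sketch below uses the default pattern from the Dist signature. The directory name and the way the path is joined are assumptions for illustration, not a description of how GeoCS builds its paths.

```python
from datetime import datetime
from pathlib import Path

save_pattern = "%Y%m%d_%H%M.npz"  # default value from the Dist signature
data_path = Path("dists")  # hypothetical storage directory

date_key = datetime(2016, 5, 2, 6, 0)

# Date -> file name, then file name -> date.
mat_path = data_path / date_key.strftime(save_pattern)
parsed = datetime.strptime(mat_path.name, save_pattern)
```

Here `mat_path.name` is "20160502_0600.npz" and `parsed` equals `date_key`. A pattern without seconds cannot distinguish timesteps that differ only in seconds.
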

property save_pattern: str#
property traj_data: Traj | None#
class GeoCS.data_lib.Traj(data_path, start_date)#

Bases: Data

A class for handling trajectory data.

_data_path#

The path where the data is stored or to be stored.

Type:

str

_start_date#

The start date of the trajectories.

Type:

datetime

_extent#

The axes extent for plotting. Defaults to the entire globe ([-180, 180, -90, 90]).

Type:

Optional[List[float]]

_projection#

Map projection used for plotting. Defaults to Mercator projection.

Type:

cartopy.crs.Projection

_trajs#

The trajectory data as a NumPy array.

Type:

Optional[np.ndarray]

_k#

Empirical scaling parameter.

Type:

Optional[float]

property extent#
property k#

Scaling parameter. Assumes U and V are in m/s and Omega is in Pa/s.

Returns:

Scaling parameter in km/hPa.

Return type:

float

load()#

Load trajectory data (npy) from file specified in data_path.

Return type:

None

plot(**kwargs)#

Plot default Trajectory plot. Invokes plot_2d.

Parameters:

**kwargs – Additional keyword arguments.

Return type:

Tuple[Figure, GeoAxes]

Returns:

  • fig (matplotlib figure)

  • ax (matplotlib ax)

plot_2d(**kwargs)#

Plot simple 2D trajectory plot.

Parameters:

**kwargs – Additional keyword arguments.

Return type:

Tuple[Figure, GeoAxes]

Returns:

  • fig (matplotlib figure)

  • ax (matplotlib ax)

property projection#
save()#

Save trajectory data (npy) to file specified in data_path.

Return type:

None

property trajs#

GeoCS.plot_lib module#

Created on Wed Mar 27 15:09:03 2024

@author: schoelleh96

class GeoCS.plot_lib.BoundVisualizer(x, y, z, get_bound, convex, initial_time_index=0, alpha=None)#

Bases: PointCloudVisualizer

A visualizer for displaying point clouds with their boundaries.

convex#

Indicates whether to use a convex hull or an alpha shape.

Type:

bool

alpha#

The alpha parameter for the alpha shape. If None, an optimal alpha will be calculated.

Type:

Optional[float]

get_bound#

A function that, given the alpha parameter and a boolean indicating whether to use a convex hull, returns the boundary flags and hulls for each timestep.

Type:

Callable[[float, bool], Tuple[Dict[datetime, np.ndarray], Dict[datetime, Trimesh]]]

is_bound#

A dictionary mapping timesteps to arrays indicating whether each point is within the boundary.

Type:

Dict[datetime, np.ndarray]

hulls#

A dictionary mapping timesteps to Trimesh objects representing the hulls.

Type:

Dict[datetime, Trimesh]

class GeoCS.plot_lib.CSVisualizer(x, y, z, is_bound, eps, N_cs, get_E, get_clust, initial_time_index=0)#

Bases: PointCloudVisualizer

A visualizer for displaying coherent sets.

N_cs#

The number of coherent sets (clusters) to identify.

Type:

int

eps#

The epsilon parameter used in the diffusion map calculation.

Type:

float

get_E#

A function to compute or retrieve the eigenvalues and eigenvectors.

Type:

Callable[[float], Tuple[np.typing.NDArray[float], np.typing.NDArray[float]]]

get_clust#

A function to perform clustering on the eigenvectors.

Type:

Callable[[int], np.typing.NDArray[int]]

cluster_labels#

The labels of each point indicating its cluster assignment.

Type:

np.typing.NDArray[int]

class GeoCS.plot_lib.PointCloudVisualizer(x, y, z, initial_time_index=0)#

Bases: ABC

Abstract base class for interactive point cloud visualizers.

x#

The x-coordinates of the points in the point cloud.

Type:

np.ndarray

y#

The y-coordinates of the points in the point cloud.

Type:

np.ndarray

z#

The z-coordinates of the points in the point cloud.

Type:

np.ndarray

t_i#

The initial time index for the visualization.

Type:

int

fig#

The figure object for the plot.

Type:

plt.Figure

ax#

The axes object for the plot.

Type:

plt.Axes

t_slider#

An interactive slider widget to control the time dimension.

Type:

Slider

Parameters:
  • x (ndarray) –

  • y (ndarray) –

  • z (ndarray) –

  • initial_time_index (int) –

GeoCS.plot_lib.plot_dist_hist(hist_counts, bin_edges, **kwargs)#

Plot a heatmap of histogram counts over timesteps.

Parameters:
  • hist_counts (dict) – A dictionary with timesteps as keys and histogram counts as values.

  • bin_edges (np.ndarray) – The edges of the bins used for the histograms.

  • **kwargs –

    Additional keyword arguments to customize the plot:

    • cmap (str): Colormap for the heatmap. Default is “viridis”.

    • figsize (Tuple[int, int]): Figure size. Default is (10, 6).

Return type:

Tuple[Figure, Axes]

Returns:

Tuple[matplotlib.figure.Figure, matplotlib.axes._axes.Axes]:

The figure and axes objects of the plot.
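
The expected shape of hist_counts and bin_edges can be illustrated with a small plain-Python sketch: every timestep is binned against the same edges, so the counts line up as rows of a heatmap. The `histogram` helper and the sample distances are hypothetical. Like numpy.histogram, it treats the last bin as closed on the right.

```python
def histogram(values, bin_edges):
    """Count values per bin; bins are half-open except the last, which is closed."""
    counts = [0] * (len(bin_edges) - 1)
    for value in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            in_bin = bin_edges[i] <= value < bin_edges[i + 1]
            if in_bin or (is_last and value == bin_edges[-1]):
                counts[i] += 1
                break
    return counts


bin_edges = [0.0, 1.0, 2.0, 3.0]

# Hypothetical distances per timestep.
distances = {"t0": [0.5, 1.5, 1.7, 3.0], "t1": [2.2]}

# Timesteps as keys, histogram counts as values.
hist_counts = {t: histogram(d, bin_edges) for t, d in distances.items()}
```

Each value of `hist_counts` has one entry fewer than `bin_edges`, which is the relationship a heatmap over timesteps and bins relies on.
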

GeoCS.plot_lib.plot_traj_2d(trajs, projection, extent, **kwargs)#

Plot a 2D trajectory map with specified projection and extent.

Parameters:
  • trajs (numpy.ndarray) – A structured array containing ‘lon’, ‘lat’, and ‘p’ fields.

  • projection (cartopy.crs.Projection) – The cartopy coordinate reference system to use for the plot.

  • extent (List[float]) – A list of floats specifying the extent of the plot as [longitude_min, longitude_max, latitude_min, latitude_max].

  • **kwargs (dict, optional) –

    Additional keyword arguments:

    • cmap (matplotlib.colors.Colormap): The colormap for the line plot. Default is a custom cmap.

    • norm (matplotlib.colors.Normalize): The normalization for the line plot.

    • figsize (tuple): Figure size as (width, height). Default is (3.5, 2).

    • every_n (int): Frequency of trajectories to plot. Default is 50.

    • linewidth (float): Width of the trajectory lines. Default is 0.4.

    • points (list): Indices of points to select for the scatter plot. Default is [0, -1] for the first and last points.

    • s (float): Size of the scatter plot markers. Default is 0.4.

Return type:

Tuple[Figure, GeoAxes]

Returns:

  • fig (matplotlib.figure.Figure) – The matplotlib figure object.

  • ax (cartopy.mpl.geoaxes.GeoAxes) – The cartopy GeoAxes object.

Module contents#