Coverage for C: \ Users \ peaco \ OneDrive \ Documents \ GitHub \ mt_metadata \ mt_metadata \ features \ coherence.py: 91%
64 statements
« prev ^ index » next coverage.py v7.13.1, created at 2026-01-10 00:11 -0800
« prev ^ index » next coverage.py v7.13.1, created at 2026-01-10 00:11 -0800
1"""This module contains the simplest coherence feature.
2The feature is computed with scipy.signal coherence.
4Note that this coherence is one number for the entire time-series (per frequency), i.e.
6The Window object is used to taper the time series before FFT.
8Development Notes:
9Coherence extends the Feature class. This means that it should have
10all the attrs that a Feature instance does, as well as its own unique ones.
11When setting up the attr_dict, one is confronted with the question of adding
12BaseFeatures attrs one of two ways:
13- To add the features directly, use:
14attr_dict.add_dict(get_schema("base_feature", SCHEMA_FN_PATHS))
15{
16 "coherence": {
17 "ch1": "ex",
18 "ch2": "hy",
19 "description": "Simple coherence between two channels derived directly from scipy.signal.coherence applied to time domain data",
20 "domain": "frequency",
21 "name": "coherence",
22 "window.clock_zero_type": "ignore",
23 "window.normalized": true,
24 "window.num_samples": 512,
25 "window.overlap": 128,
26 "window.type": "hamming"
27 }
28}
29- To nest the features use:
30attr_dict.add_dict(BaseFeature()._attr_dict, "base_feature")
31{
32 "coherence": {
33 "base_feature.description": null,
34 "base_feature.domain": null,
35 "base_feature.name": null,
36 "ch1": "ex",
37 "ch2": "hy",
38 "window.clock_zero_type": "ignore",
39 "window.normalized": true,
40 "window.num_samples": 512,
41 "window.overlap": 128,
42 "window.type": "hamming"
43 }
44}
46Devlopment Notes:
47 To specify a channel in the context of tf processing we need station and channel names.
48 I have been fighting the use of `rx` and `ry` for several reasons, including that the [ex, ey, hx, hy, hz, rx, ry]
49 convention forces the assumption that remote channels are remote magnetics, and are overly specific to the remote
50 reference processing convention.
51 Hoever, for a feature like this, it could seem to be a hassle to update the processing config with the station name all over the
52 feature definations. So, it seems that we should have a station field, and a channel field.
53 If the user wishes to specify station and channel, fine. If the user prefers the more general,
54 but less well defined [ex, ey, hx, hy, hz, rx, ry] nomenclature, then we can ddeduce this for them.
56Development Note (2025-05-24):
57 Note that the simple coherence as computed here, just returns one number per frequency.
58 It is the average coherence over the entire run, and is not innately a "per-time-window feature".
60 To make it a per-time-window feature, we need to apply the transform on individual windows (not the whole run).
61 i.e. chunk a run into sub-windows, and then compute coherence on each of those individually. To accomplish this
62 we must shorten the window.num_samples to be smaller than the sub-window size, otherwise, coherence
63 degenerates to 1 everywhere. (Recall coherenc is th average cross-power over average sqrt auto-powers, and having
64 only one spectral estimate means there is no averaging).
65 Selection of an appropriate "window-within-the-sub-window" for spectral esimation comes with some caveats;
67 The length of the window-within-the-window must be small enough to get at least a few)
68 spectral estimates, meaning that the frequency content will not mirror that of the FFT
69 That said, we can know our lowest frequency of TF estimation (usually no fewer than 5 cycles),
70 so we could set the window-within-window width to be, say 1/5 the FFT window length, and then we'll get
71 something we can use, although it will be somwhat unrealiable at long period (but so is everything else:/).
72 Note that when we are using long FFT windows (such as for HF data processing) this is not such a concern
74 Way Forward: A "StridingWindowCoherence" (effectively a spectrogram of coherence) can be an extension of the
75 Cohernece feature. It will have the same properties, but will also have a "SubWindow". The SubWindow will be
76 another window function object, but it can be parameterized, for example, as a fraction of the
77 "Spectrogram Sliding Window".
79 The compute function could possibly be done by computing Coherence on each sub-window (kinda elegant
80 but may wind up being a bit slow with all the for-looping)
81"""
83# =====================================================
84# Imports
85# =====================================================
86from typing import Annotated, Optional, Tuple
88import numpy as np
89import scipy.signal as ssig
90from loguru import logger
91from pydantic import computed_field, Field, model_validator
93from mt_metadata.common.enumerations import StrEnumerationBase
94from mt_metadata.features.feature import Feature
95from mt_metadata.processing.window import Window
98# =====================================================
99class DetrendEnum(StrEnumerationBase):
100 linear = "linear"
101 constant = "constant"
104class Coherence(Feature):
105 channel_1: Annotated[
106 str,
107 Field(
108 default="",
109 description="The first channel of two channels in the coherence calculation.",
110 alias=None,
111 json_schema_extra={
112 "units": None,
113 "required": True,
114 "examples": ["ex"],
115 },
116 ),
117 ]
119 channel_2: Annotated[
120 str,
121 Field(
122 default="",
123 description="The second channel of two channels in the coherence calculation.",
124 alias=None,
125 json_schema_extra={
126 "units": None,
127 "required": True,
128 "examples": ["hy"],
129 },
130 ),
131 ]
133 detrend: Annotated[
134 DetrendEnum,
135 Field(
136 default=DetrendEnum.linear,
137 description="How to detrend the data segments before fft.",
138 alias=None,
139 json_schema_extra={
140 "units": None,
141 "required": True,
142 "examples": ["constant"],
143 },
144 ),
145 ]
147 station_1: Annotated[
148 str | None,
149 Field(
150 default=None,
151 description="The station associated with the first channel in the coherence calculation.",
152 alias=None,
153 json_schema_extra={
154 "units": None,
155 "required": True,
156 "examples": ["PKD"],
157 },
158 ),
159 ]
161 station_2: Annotated[
162 str | None,
163 Field(
164 default=None,
165 description="The station associated with the second channel in the coherence calculation.",
166 alias=None,
167 json_schema_extra={
168 "units": None,
169 "required": True,
170 "examples": ["SAO"],
171 },
172 ),
173 ]
175 window: Annotated[
176 Window,
177 Field(
178 default=Window(num_samples=256, overlap=128, type="hamming"), # type: ignore
179 description="The window function to apply to the data segments before fft.",
180 alias=None,
181 json_schema_extra={
182 "units": None,
183 "required": True,
184 "examples": [{"type": "hamming", "num_samples": 256, "overlap": 128}],
185 },
186 ),
187 ]
189 @model_validator(mode="before")
190 @classmethod
191 def set_defaults(cls, data: dict) -> dict:
192 data["name"] = "coherence"
193 data["domain"] = "frequency"
194 data["description"] = (
195 "Simple coherence between two channels derived "
196 "directly from scipy.signal.coherence applied to "
197 "time domain data"
198 )
199 return data
201 @computed_field
202 @property
203 def channel_pair_str(self) -> str:
204 return f"{self.channel_1}, {self.channel_2}"
206 def validate_station_ids(
207 self, local_station_id: str, remote_station_id: Optional[str] = None
208 ) -> None:
209 """
210 Make sure that ch1, ch2 are unambiguous.
212 Ideally the station for each channel is specified, but if not,
213 try deducing the channel.
215 Parameters
216 ----------
217 local_station_id: str
218 The name of the local station for a TF calculation
219 remote_station_id: Optional[str]
220 The name of the remote station for a TF calculation
222 """
224 # validate the station names:
225 active_stations = [local_station_id]
226 if remote_station_id:
227 active_stations.append(remote_station_id)
229 # if the feature has a station_1, check that it is in the list of active stations
230 if self.station_1: # not "" or None
231 if self.station_1 not in active_stations:
232 msg = "station_1 not in expected stations -- setting to None"
233 logger.warning(msg)
234 self.station_1 = None
236 if self.station_2: # not "" or None
237 if self.station_2 not in active_stations:
238 msg = "station_2 not in expected stations -- setting to None"
239 logger.warning(msg)
240 self.station_2 = None
242 if not self.station_1:
243 if self.channel_1.lower().startswith("r"):
244 self.station_1 = remote_station_id
245 else:
246 self.station_1 = local_station_id
248 if not self.station_2:
249 if self.channel_2.lower().startswith("r"):
250 self.station_2 = remote_station_id
251 else:
252 self.station_2 = local_station_id
254 # by this time, all stations should be set. Confirm that we do not have a station that is None
255 # TODO Consider returning False if exception encountered here.
256 try:
257 assert self.station_1 is not None
258 except Exception as e:
259 msg = "station_1 is not set -- perhaps it was set to a remote that does not exist?"
260 logger.error(msg)
261 try:
262 assert self.station_2 is not None
263 except Exception as e:
264 msg = "station_2 is not set -- perhaps it was set to a remote that does not exist?"
265 logger.error(msg)
267 def compute(
268 self, ts_1: np.ndarray, ts_2: np.ndarray
269 ) -> Tuple[np.ndarray, np.ndarray]:
270 """
271 Calls scipy's coherence function.
272 TODO: Consider making this return an xarray indexed by frequency.
274 Parameters
275 ----------
276 ts_1
277 ts_2
279 Returns
280 -------
282 """
283 frequencies, coh_squared = ssig.coherence(
284 ts_1,
285 ts_2,
286 window=self.window.type,
287 nperseg=self.window.num_samples,
288 noverlap=self.window.overlap,
289 detrend=self.detrend,
290 )
291 return frequencies, coh_squared