Coverage for C: \ Users \ peaco \ OneDrive \ Documents \ GitHub \ mt_metadata \ mt_metadata \ features \ coherence.py: 91%

64 statements  

« prev     ^ index     » next       coverage.py v7.13.1, created at 2026-01-10 00:11 -0800

1"""This module contains the simplest coherence feature. 

2The feature is computed with scipy.signal coherence. 

3 

4Note that this coherence is one number for the entire time-series (per frequency), i.e. 

5 

6The Window object is used to taper the time series before FFT. 

7 

8Development Notes: 

9Coherence extends the Feature class. This means that it should have 

10all the attrs that a Feature instance does, as well as its own unique ones. 

11When setting up the attr_dict, one is confronted with the question of adding 

12BaseFeatures attrs one of two ways: 

13- To add the features directly, use: 

14attr_dict.add_dict(get_schema("base_feature", SCHEMA_FN_PATHS)) 

15{ 

16 "coherence": { 

17 "ch1": "ex", 

18 "ch2": "hy", 

19 "description": "Simple coherence between two channels derived directly from scipy.signal.coherence applied to time domain data", 

20 "domain": "frequency", 

21 "name": "coherence", 

22 "window.clock_zero_type": "ignore", 

23 "window.normalized": true, 

24 "window.num_samples": 512, 

25 "window.overlap": 128, 

26 "window.type": "hamming" 

27 } 

28} 

29- To nest the features use: 

30attr_dict.add_dict(BaseFeature()._attr_dict, "base_feature") 

31{ 

32 "coherence": { 

33 "base_feature.description": null, 

34 "base_feature.domain": null, 

35 "base_feature.name": null, 

36 "ch1": "ex", 

37 "ch2": "hy", 

38 "window.clock_zero_type": "ignore", 

39 "window.normalized": true, 

40 "window.num_samples": 512, 

41 "window.overlap": 128, 

42 "window.type": "hamming" 

43 } 

44} 

45 

46Devlopment Notes: 

47 To specify a channel in the context of tf processing we need station and channel names. 

48 I have been fighting the use of `rx` and `ry` for several reasons, including that the [ex, ey, hx, hy, hz, rx, ry] 

49 convention forces the assumption that remote channels are remote magnetics, and are overly specific to the remote 

50 reference processing convention. 

51 Hoever, for a feature like this, it could seem to be a hassle to update the processing config with the station name all over the 

52 feature definations. So, it seems that we should have a station field, and a channel field. 

53 If the user wishes to specify station and channel, fine. If the user prefers the more general, 

54 but less well defined [ex, ey, hx, hy, hz, rx, ry] nomenclature, then we can ddeduce this for them. 

55 

56Development Note (2025-05-24): 

57 Note that the simple coherence as computed here, just returns one number per frequency. 

58 It is the average coherence over the entire run, and is not innately a "per-time-window feature". 

59 

60 To make it a per-time-window feature, we need to apply the transform on individual windows (not the whole run). 

61 i.e. chunk a run into sub-windows, and then compute coherence on each of those individually. To accomplish this 

62 we must shorten the window.num_samples to be smaller than the sub-window size, otherwise, coherence 

63 degenerates to 1 everywhere. (Recall coherenc is th average cross-power over average sqrt auto-powers, and having 

64 only one spectral estimate means there is no averaging). 

65 Selection of an appropriate "window-within-the-sub-window" for spectral esimation comes with some caveats; 

66 

67 The length of the window-within-the-window must be small enough to get at least a few) 

68 spectral estimates, meaning that the frequency content will not mirror that of the FFT 

69 That said, we can know our lowest frequency of TF estimation (usually no fewer than 5 cycles), 

70 so we could set the window-within-window width to be, say 1/5 the FFT window length, and then we'll get 

71 something we can use, although it will be somwhat unrealiable at long period (but so is everything else:/). 

72 Note that when we are using long FFT windows (such as for HF data processing) this is not such a concern 

73 

74 Way Forward: A "StridingWindowCoherence" (effectively a spectrogram of coherence) can be an extension of the 

75 Cohernece feature. It will have the same properties, but will also have a "SubWindow". The SubWindow will be 

76 another window function object, but it can be parameterized, for example, as a fraction of the 

77 "Spectrogram Sliding Window". 

78 

79 The compute function could possibly be done by computing Coherence on each sub-window (kinda elegant 

80 but may wind up being a bit slow with all the for-looping) 

81""" 

82 

83# ===================================================== 

84# Imports 

85# ===================================================== 

86from typing import Annotated, Optional, Tuple 

87 

88import numpy as np 

89import scipy.signal as ssig 

90from loguru import logger 

91from pydantic import computed_field, Field, model_validator 

92 

93from mt_metadata.common.enumerations import StrEnumerationBase 

94from mt_metadata.features.feature import Feature 

95from mt_metadata.processing.window import Window 

96 

97 

98# ===================================================== 

99class DetrendEnum(StrEnumerationBase): 

100 linear = "linear" 

101 constant = "constant" 

102 

103 

104class Coherence(Feature): 

105 channel_1: Annotated[ 

106 str, 

107 Field( 

108 default="", 

109 description="The first channel of two channels in the coherence calculation.", 

110 alias=None, 

111 json_schema_extra={ 

112 "units": None, 

113 "required": True, 

114 "examples": ["ex"], 

115 }, 

116 ), 

117 ] 

118 

119 channel_2: Annotated[ 

120 str, 

121 Field( 

122 default="", 

123 description="The second channel of two channels in the coherence calculation.", 

124 alias=None, 

125 json_schema_extra={ 

126 "units": None, 

127 "required": True, 

128 "examples": ["hy"], 

129 }, 

130 ), 

131 ] 

132 

133 detrend: Annotated[ 

134 DetrendEnum, 

135 Field( 

136 default=DetrendEnum.linear, 

137 description="How to detrend the data segments before fft.", 

138 alias=None, 

139 json_schema_extra={ 

140 "units": None, 

141 "required": True, 

142 "examples": ["constant"], 

143 }, 

144 ), 

145 ] 

146 

147 station_1: Annotated[ 

148 str | None, 

149 Field( 

150 default=None, 

151 description="The station associated with the first channel in the coherence calculation.", 

152 alias=None, 

153 json_schema_extra={ 

154 "units": None, 

155 "required": True, 

156 "examples": ["PKD"], 

157 }, 

158 ), 

159 ] 

160 

161 station_2: Annotated[ 

162 str | None, 

163 Field( 

164 default=None, 

165 description="The station associated with the second channel in the coherence calculation.", 

166 alias=None, 

167 json_schema_extra={ 

168 "units": None, 

169 "required": True, 

170 "examples": ["SAO"], 

171 }, 

172 ), 

173 ] 

174 

175 window: Annotated[ 

176 Window, 

177 Field( 

178 default=Window(num_samples=256, overlap=128, type="hamming"), # type: ignore 

179 description="The window function to apply to the data segments before fft.", 

180 alias=None, 

181 json_schema_extra={ 

182 "units": None, 

183 "required": True, 

184 "examples": [{"type": "hamming", "num_samples": 256, "overlap": 128}], 

185 }, 

186 ), 

187 ] 

188 

189 @model_validator(mode="before") 

190 @classmethod 

191 def set_defaults(cls, data: dict) -> dict: 

192 data["name"] = "coherence" 

193 data["domain"] = "frequency" 

194 data["description"] = ( 

195 "Simple coherence between two channels derived " 

196 "directly from scipy.signal.coherence applied to " 

197 "time domain data" 

198 ) 

199 return data 

200 

201 @computed_field 

202 @property 

203 def channel_pair_str(self) -> str: 

204 return f"{self.channel_1}, {self.channel_2}" 

205 

206 def validate_station_ids( 

207 self, local_station_id: str, remote_station_id: Optional[str] = None 

208 ) -> None: 

209 """ 

210 Make sure that ch1, ch2 are unambiguous. 

211 

212 Ideally the station for each channel is specified, but if not, 

213 try deducing the channel. 

214 

215 Parameters 

216 ---------- 

217 local_station_id: str 

218 The name of the local station for a TF calculation 

219 remote_station_id: Optional[str] 

220 The name of the remote station for a TF calculation 

221 

222 """ 

223 

224 # validate the station names: 

225 active_stations = [local_station_id] 

226 if remote_station_id: 

227 active_stations.append(remote_station_id) 

228 

229 # if the feature has a station_1, check that it is in the list of active stations 

230 if self.station_1: # not "" or None 

231 if self.station_1 not in active_stations: 

232 msg = "station_1 not in expected stations -- setting to None" 

233 logger.warning(msg) 

234 self.station_1 = None 

235 

236 if self.station_2: # not "" or None 

237 if self.station_2 not in active_stations: 

238 msg = "station_2 not in expected stations -- setting to None" 

239 logger.warning(msg) 

240 self.station_2 = None 

241 

242 if not self.station_1: 

243 if self.channel_1.lower().startswith("r"): 

244 self.station_1 = remote_station_id 

245 else: 

246 self.station_1 = local_station_id 

247 

248 if not self.station_2: 

249 if self.channel_2.lower().startswith("r"): 

250 self.station_2 = remote_station_id 

251 else: 

252 self.station_2 = local_station_id 

253 

254 # by this time, all stations should be set. Confirm that we do not have a station that is None 

255 # TODO Consider returning False if exception encountered here. 

256 try: 

257 assert self.station_1 is not None 

258 except Exception as e: 

259 msg = "station_1 is not set -- perhaps it was set to a remote that does not exist?" 

260 logger.error(msg) 

261 try: 

262 assert self.station_2 is not None 

263 except Exception as e: 

264 msg = "station_2 is not set -- perhaps it was set to a remote that does not exist?" 

265 logger.error(msg) 

266 

267 def compute( 

268 self, ts_1: np.ndarray, ts_2: np.ndarray 

269 ) -> Tuple[np.ndarray, np.ndarray]: 

270 """ 

271 Calls scipy's coherence function. 

272 TODO: Consider making this return an xarray indexed by frequency. 

273 

274 Parameters 

275 ---------- 

276 ts_1 

277 ts_2 

278 

279 Returns 

280 ------- 

281 

282 """ 

283 frequencies, coh_squared = ssig.coherence( 

284 ts_1, 

285 ts_2, 

286 window=self.window.type, 

287 nperseg=self.window.num_samples, 

288 noverlap=self.window.overlap, 

289 detrend=self.detrend, 

290 ) 

291 return frequencies, coh_squared