Metadata-Version: 2.4
Name: cssd
Version: 1.0.1
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Rust
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Dist: numpy>=1.23
Requires-Dist: scipy>=1.10
Requires-Dist: matplotlib>=3.6 ; extra == 'demos'
Requires-Dist: pandas>=1.5 ; extra == 'demos'
Requires-Dist: maturin>=1.7 ; extra == 'dev'
Requires-Dist: pytest>=7 ; extra == 'dev'
Requires-Dist: matplotlib>=3.6 ; extra == 'dev'
Requires-Dist: pytest>=7 ; extra == 'test'
Requires-Dist: scipy>=1.10 ; extra == 'test'
Provides-Extra: demos
Provides-Extra: dev
Provides-Extra: test
License-File: LICENSE
Summary: Cubic smoothing splines for discontinuous signals (CSSD)
Keywords: smoothing-spline,discontinuities,signal-processing,edge-preserving,variational-methods,piecewise
Author-email: Martin Storath <martin.storath@thws.de>, Andreas Weinmann <andreas.weinmann@thws.de>
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Bug Tracker, https://github.com/mstorath/CSSD/issues
Project-URL: Changelog, https://github.com/mstorath/CSSD/blob/main/CHANGELOG.md
Project-URL: Homepage, https://github.com/mstorath/CSSD
Project-URL: Repository, https://github.com/mstorath/CSSD

# CSSD - Cubic smoothing splines for discontinuous signals

This is a reference implementation in Matlab for the algorithms described in the paper

**M. Storath, A. Weinmann, "Smoothing splines for discontinuous signals", Journal of Computational and Graphical Statistics, 2023, [[Preprint](
https://doi.org/10.48550/arXiv.2211.12785)]**

## Overview of main functionalities
1. **cssd.m** computes a cubic smoothing spline with discontinuities (CSSD) for data (x,y). It is a solution of the following model of a smoothing spline $f$ with a-priori unknown discontinuities $J$

  $$\min_{f, J} p \sum_{i=1}^N \left(\frac{y_i - f(x_i)}{\delta_i}\right)^2   +  (1-p) \int_{[x_1, x_N] \setminus J}   (f''(t))^2 dt	 + \gamma |J|.$$

  where
  *  $y_i = g(x_i) + \epsilon_i$ are samples of piecewise smooth function $g$ at data sites $x_1, \ldots, x_N$, and an estimate $\delta_i$  of the standard deviation of the errors $\epsilon_i$
  * the minimum is taken over all possible sets of discontinuities between two data sites $J \subset [x_1, x_N]\setminus \{x_1, \ldots, x_N\}$
   and all functions $f$ that are twice continuously differentiable away from the discontinuities.
  * The model parameter $p \in (0, 1)$ controls the relative weight of the smoothness term (second term) and the data fidelity term.
  * The last term is a penalty for the number of discontinuities $|J|$ weighted by a parameter $\gamma > 0.$ 

2. **cssd_cv.m** automatically determines values for the model parameters $p$ and $\gamma$ based on K-fold cross validation.

## Quickstart

### Python (Rust core)

```bash
pip install cssd
```

```python
import numpy as np
from cssd import cssd, cssd_cv

x = np.linspace(0, 1, 100)
y = np.sin(4 * np.pi * x) - np.sign(x - 0.3) - np.sign(0.72 - x)

out = cssd(x, y, p=0.999, gamma=8.0)
out.discont      # detected jump locations
out.pp(x)        # evaluate the piecewise spline

cv = cssd_cv(x, y, cv_type="random", cv_arg=5)
cv.p, cv.gamma, cv.fit.discont
```

The Python package wraps a Rust extension built with [PyO3](https://pyo3.rs)
and [maturin](https://maturin.rs); see `crates/cssd-core` for the algorithm
crate and `crates/cssd-py` for the bindings.

### MATLAB (reference implementation, unchanged)
1. Execute "install_cssd.m" which adds the folder and all subfolders to the Matlab path.
2. Execute any m-file from the demos folder

## Examples

### Synthetic data
<figure>
  <img src="images/Ex_Synthetic.png" alt="Synthetic signal" width="600"/>
  <figcaption>A synthetic signal is sampled at $N = 100$ random data sites $x_i$ 
	and corrupted by zero mean Gaussian noise with standard deviation 
$0.1.$
	The results of the discussed model are shown for $p=0.999$ and different parameters of $\gamma,$ where $\gamma=\infty$  corresponds to classical smoothing splines.
	The thick lines 
represent the results of the shown sample realization.	 The shaded areas depict the $2.5 \%$ to $97.5 \%$  (pointwise) quantiles of $1000$ realizations. The histograms under the plots show the frequency of the detected discontinuity locations over all realizations.
</figcaption>
</figure>

### Stock data
<figure>
  <img src="images/Ex_Stock_CV.png" alt="Stock" width="600"/>
  <figcaption>The dots represent the logarithm of the closing prices of the Meta stock from May 18, 2012, 
	to May 19, 2022. The curve represents the CSSD with parameters determined by K-fold CV ($p = 0.4702$,  $\gamma = 0.0069$). The dashed vertical lines indicate the discontinuities of the CSSD, and the ticks correspond to the date before the discontinuity.
</figcaption>
</figure>

### Geyser data
<figure>
  <img src="images/Ex_Geyser_CV.png" alt="Geyser" width="600"/>
  <figcaption>Fitting a CSSD to the Old Faithful data (circles):
	If the parameter is selected based on K-fold CV
	we obtain a result without discontinuities which coincides with a classical smoothing spline (solid curve).
	Keeping the selected $p$-parameter and lowering the $\gamma$ parameter sufficiently gives a two-phase regression curve (dashed curves) with a breakpoint near $x = 3$ (dashed vertical line), and the two curve segments are nearly linear. 
	Both of the above parameter sets yield better CV-scores  than a linear model (dotted line).
</figcaption>
</figure>

## Reference

M. Storath, A. Weinmann, "Smoothing splines for discontinuous signals", Journal of Computational and Graphical Statistics, 2023

## See also 

- [Higher order Mumford-Shah models (discrete splines with discontinuities)](https://github.com/lu-kie/HOMS_SignalProcessing)
- [Degrees of freedom penalized regression](https://github.com/SV-97/pcw-regrs)
- [Pottslab](https://github.com/mstorath/Pottslab)
- [L1TV regularization](https://github.com/mstorath/L1TV)

