silly-kicks
Copyright (c) 2019 KU Leuven Machine Learning Research Group (Tom Decroos, Pieter Robberechts)
Copyright (c) 2026 Karsten S. Nielsen

This product is a maintained fork of socceraction
(https://github.com/ML-KULeuven/socceraction). Major architectural changes
since 1.0.0 are documented in CHANGELOG.md.

Third-Party Libraries
---------------------

kloppy --- standardizing soccer tracking/event data (BSD-3-Clause License).
Copyright (c) kloppy contributors.
See: https://github.com/PySport/kloppy

pandas, numpy, scikit-learn --- core dependencies (each BSD-3-Clause License).

Optional gradient-boosting backends (xgboost, lightgbm, catboost) --- listed
in pyproject.toml; each retains its upstream license.

Mathematical / Methodological References
----------------------------------------

The SPADL action representation (silly_kicks/spadl/) implements the framework
described in: Decroos, T., Van Haaren, J., & Davis, J. (2018). "SPADL: A
Common Framework for Action Description in Soccer." Workshop on Machine
Learning and Data Mining for Sports Analytics (ECML-PKDD).

The VAEP action valuation framework (silly_kicks/vaep/) implements:
Decroos, T., Bransen, L., Van Haaren, J., & Davis, J. (2019). "Actions
Speak Louder Than Goals: Valuing Player Actions in Soccer." Proc. KDD '19.
The HybridVAEP variant (silly_kicks/vaep/hybrid.py) is a result-leakage-
removal variant introduced in this fork; no separate academic citation.

The Atomic-SPADL representation and Atomic-VAEP framework
(silly_kicks/atomic/) implement: Decroos, T., Robberechts, P., & Davis, J.
(2020). "Introducing Atomic-SPADL: A New Way to Represent Event Stream Data."
DTAI Sports Analytics Blog.

The Expected Threat (xT) grid (silly_kicks/xthreat.py) seeds from:
Singh, K. (2018). "Introducing Expected Threat (xT)." karun.in/blog/expected-threat
The grid is recomputable from event data; the seed values are reference-only.

The tracking namespace primitive layer (silly_kicks/tracking/, PR-S19,
ADR-004) implements ingestion and linkage primitives for PFF, Sportec,
Metrica, and SkillCorner data. It introduces no academic methodology beyond
what ADR-004 documents.

The four tracking-aware action-context features in
silly_kicks/tracking/features.py and silly_kicks/atomic/tracking/features.py
(PR-S20, ADR-005) implement methodologies described in:

- Lucey, P., Bialkowski, A., Monfort, M., Carr, P., & Matthews, I. (2014).
  "Quality vs Quantity: Improved Shot Prediction in Soccer using Strategic
  Features from Spatiotemporal Data." MIT Sloan Sports Analytics Conference.
  (canonical "defenders in shot triangle" feature; nearest-defender-distance
  for shots)

- Anzer, G., & Bauer, P. (2021). "A goal scoring probability model for shots
  based on synchronized positional and event data in football and futsal."
  Frontiers in Sports and Active Living, 3, 624475.
  (player_speed, distance-to-defender, and defending-GK-position as xG features)

- Spearman, W. (2018). "Beyond Expected Goals." MIT Sloan Sports Analytics
  Conference.
  (zone-based defender intensity in pitch-control framework)

- Spearman, W., Basye, A., Dick, G., Hotovy, R., & Pop, P. (2017).
  "Physics-Based Modeling of Pass Probabilities in Soccer." MIT Sloan Sports
  Analytics Conference.
  (kinematic time-to-intercept pitch control model; acceleration-based TTI +
  logistic influence + ratio aggregation. Implemented in
  silly_kicks/tracking/pitch_control/_spearman.py (PR-S31, TF-7).)

- Fernandez, J., & Bornn, L. (2018). "Wide Open Spaces: A statistical technique
  for measuring space generation in professional soccer." MIT Sloan Sports
  Analytics Conference.
  (bivariate-normal pitch control model; velocity-scaled anisotropic Gaussian
  influence fields + sigmoid team aggregation. Implemented in
  silly_kicks/tracking/pitch_control/_fernandez_bornn.py (PR-S31, TF-7).)

- Shaw, L., & Sudarshan, M. (2020). "A Framework for Tactical Analysis and
  Individual Offensive Production Assessment in Soccer Using Markov Models."
  (xT and pitch control integration; ball-travel-time filter concept.
  Informs the ball_position conditioning in compute_pitch_control dispatch.)

- Power, P., Ruiz, H., Wei, X., & Lucey, P. (2017). "Not all passes are
  created equal: Objectively measuring the risk and reward of passes in
  soccer from tracking data." KDD '17.
  (receiver-zone risk/reward modelling)

- Pollard, R., & Reep, C. (1997). "Measuring the effectiveness of playing
  strategies at soccer." Journal of the Royal Statistical Society Series D,
  46(4), 541-550.
  (early shot-quality / pressure-from-defenders concept)

- Savitzky, A., & Golay, M. J. E. (1964). "Smoothing and Differentiation of
  Data by Simplified Least Squares Procedures." Analytical Chemistry, 36(8),
  1627-1639.
  (Savitzky-Golay polynomial smoothing + analytical derivative -- used for
  position smoothing and velocity derivation in
  silly_kicks.tracking.preprocess (PR-S24, ADR-004 invariants 6/7).
  PR-S24 also uses Anzer & Bauer (2021) above for the
  pre_shot_gk_angle_to_shot_trajectory and pre_shot_gk_angle_off_goal_line
  features in silly_kicks/tracking/features.py and the atomic mirror.)

- Andrienko, G., Andrienko, N., Budziak, G., Dykes, J., Fuchs, G.,
  von Landesberger, T., & Weber, H. (2017). "Visual analysis of pressure in
  football." Data Mining and Knowledge Discovery, 31, 1793-1839.
  Used by: silly_kicks.tracking.features.pressure_on_actor (method="andrienko_oval").
  Numerical defaults from section 3.1: D_front=9 m, D_back=3 m, q=1.75.

- Link, D., Lang, S., & Seidenschwarz, P. (2016). "Real Time Quantification of
  Dangerousity in Football Using Spatiotemporal Tracking Data." PLOS ONE,
  11(12): e0168768.
  Used by: silly_kicks.tracking.features.pressure_on_actor (method="link_zones").
  Zone radii (HOZ=4, LZ=3, HZ=2 m) and angular boundaries (45 deg, 90 deg) from
  Figure 2. The paper additionally labels a 1 m "High Pressure Zone (HPZ)"
  inner arc with prose-described "constant high pressure", but Eq (3) of
  the paper does not special-case it -- silly-kicks honors Eq (3) as the
  formal specification (Plan A: equation-faithful, no discontinuity).
  The saturation constant k3 is not published in the paper; the silly-kicks
  default k3=1.0 is an engineering choice exposed as a kwarg. Calibration is
  deferred post-release to an Optuna sweep (silly-kicks TODO TF-24).

- Bekkers, J. (2025). "Pressing Intensity: An Intuitive Measure for Pressing
  in Soccer." arXiv:2501.04712.
  Used by: silly_kicks.tracking.features.pressure_on_actor (method="bekkers_pi").
  The time-to-intercept formula extends Spearman 2017 / Shaw / Pleuler with a
  velocity-direction penalty. Defaults are taken from the paper and the
  canonical implementation (UnravelSports/unravelsports, BSD-3-Clause).

- Herold, M., Goes, F., Nopp, S., Bauer, P., Thompson, C., & Meyer, T.
  (2022). "Machine learning-based analysis of match performance indicators
  for classifying match outcomes in professional football." arXiv:2511.06191.
  Used by: silly_kicks.tracking._defensive_line.compute_defensive_line (TF-14).
  Defensive-line height, compactness, and lateral spread as match-outcome
  discriminators. Default N=4.

- Forcher, L., Forcher, L., Altmann, S., Stein, T., Biermann, H., Dutt, M.,
  & Memmert, D. (2022). "How to defend against the pass into the box? Using
  explainable machine learning to identify defensive strategies." KDD Workshop
  on Data Science for Sports Analytics. arXiv:2511.00121.
  Used by: silly_kicks.tracking._defensive_line (TF-14).
  Back-line shape as defensive feature for pass-into-box models.

- FIFA (2022). "Enhanced Football Intelligence: Physical Report --
  FIFA World Cup Qatar 2022."
  Used by: silly_kicks.tracking._defensive_line (TF-14).
  Practitioner documentation of 4-back defensive-line metrics at tournament
  level. Validates the N=4 default and line-height / compactness operationalisation.
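
As an illustration of the Andrienko et al. (2017) entry above, the following
is a minimal sketch of an oval pressure zone using the listed defaults
(D_front=9 m, D_back=3 m, q=1.75). The formula shape (distance limit L and
falloff (1 - d/L)^q) is a reading of the paper, the angle convention is an
assumption, and the function name and signature are hypothetical. It is not
the silly-kicks implementation of pressure_on_actor.

```python
import math


def andrienko_oval_pressure(target_xy, presser_xy, threat_xy,
                            d_front=9.0, d_back=3.0, q=1.75):
    """Pressure (0-100) that one presser exerts on a target player.

    Hypothetical helper for illustration only. The angle is measured at the
    target, between the direction to the presser and the direction to the
    threat point (e.g. the goal centre); this convention is an assumption.
    """
    dx = presser_xy[0] - target_xy[0]
    dy = presser_xy[1] - target_xy[1]
    d = math.hypot(dx, dy)
    if d == 0.0:
        return 100.0  # presser on top of the target: maximum pressure

    tx = threat_xy[0] - target_xy[0]
    ty = threat_xy[1] - target_xy[1]
    t_norm = math.hypot(tx, ty)
    if t_norm == 0.0:
        raise ValueError("threat point coincides with the target position")

    # z = 1 when the presser stands between target and threat, 0 when behind.
    cos_theta = (dx * tx + dy * ty) / (d * t_norm)
    z = (1.0 + cos_theta) / 2.0

    # Distance limit of the oval in the presser's direction.
    limit = d_back + (d_front - d_back) * (z ** 3 + 0.3 * z) / 1.3
    if d >= limit:
        return 0.0
    return (1.0 - d / limit) ** q * 100.0
```

With these defaults the zone reaches 9 m in front of the target and 3 m
behind it, so a presser directly behind the target must be much closer to
register any pressure.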

The off-ball-runs and line-break detection features in
silly_kicks/tracking/_off_ball_runs.py (PR-S30, TF-4) are novel
implementations inspired by:

- Spearman, W. (2018). "Beyond Expected Goals." MIT Sloan Sports Analytics
  Conference.
  (OBSO framework -- Off-Ball Scoring Opportunity; off-ball-runs and
  line-break concepts.)

- Power, P., Ruiz, H., Wei, X., & Lucey, P. (2017). "Not all passes are
  created equal: Objectively measuring the risk and reward of passes in
  soccer from tracking data." KDD '17.
  (Contextual passing risk/reward; qualitatively mentions line-breaking
  passes inside formation clustering.)

- Bauer, P., & Anzer, G. (2021). "Data-driven detection of counterpressing in
  professional football." Data Mining and Knowledge Discovery, 35(5), 2009-2049.
  (Section 3 describes a velocity-toward-ball heuristic for carrier
  identification, used as input to their counterpressing classifier.
  Adapted for infer_ball_carrier primitive in silly_kicks.tracking.)

- Vidal-Codina, F., Evans, N., El Fakir, B., & Billingham, J. (2022).
  "Automatic Event Detection in Football Using Tracking Data."
  Sports Engineering, 25, 18.
  (Inertia/hysteresis recommendation for ball-possession algorithms;
  motivates the gamma hysteresis parameter in infer_ball_carrier.)
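
To illustrate the inertia/hysteresis idea in the Vidal-Codina et al. entry
above, here is a minimal sketch of one plausible hysteresis rule. How gamma
is actually applied inside infer_ball_carrier is not stated in this file; the
rule, the function name, the input layout, and the default gamma=0.8 are all
assumptions made for illustration.

```python
import math


def infer_carrier_with_hysteresis(frames, gamma=0.8):
    """Return one carrier id per frame.

    Hypothetical helper. `frames` is a list of (ball_xy, players) tuples,
    where `players` is a non-empty dict mapping player id -> (x, y).
    The current carrier is kept unless another player is closer to the ball
    than gamma times the carrier's own distance (assumed rule).
    """
    carrier = None
    carriers = []
    for ball_xy, players in frames:
        dists = {pid: math.dist(ball_xy, xy) for pid, xy in players.items()}
        nearest = min(dists, key=dists.get)
        if carrier is None or carrier not in dists:
            # No previous carrier to defend possession: take the nearest.
            carrier = nearest
        elif dists[nearest] < gamma * dists[carrier]:
            # Challenger is clearly closer: possession switches.
            carrier = nearest
        carriers.append(carrier)
    return carriers
```

With gamma below 1, a challenger who is only marginally closer to the ball
does not flip possession, which suppresses frame-to-frame flicker.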

Except for the code covered under "Third-Party Code Attribution" below, the
implementations are independent Python translations of the published
methodologies and are not derived from any source code. They are licensed
under the same terms as silly-kicks (MIT License).

Third-Party Code Attribution
----------------------------

The Bekkers Pressing Intensity time-to-intercept formula
(silly_kicks.tracking._kernels._bekkers_tti) is a re-implementation of
the canonical Python source published under the BSD 3-Clause License by
Joris Bekkers / UnravelSports:

    https://github.com/UnravelSports/unravelsports
    unravel/soccer/models/utils.py -- time_to_intercept()

Required attribution per BSD-3-Clause:

    Copyright (c) 2025 UnravelSports
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:
    (1) Redistributions of source code must retain the above copyright
        notice, this list of conditions and the following disclaimer.
    (2) Redistributions in binary form must reproduce the above copyright
        notice, this list of conditions and the following disclaimer in
        the documentation and/or other materials provided with the
        distribution.
    (3) Neither the name of the copyright holder nor the names of its
        contributors may be used to endorse or promote products derived
        from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
    FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
    COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
    INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
    BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
    LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
    CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
    LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
    ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
    POSSIBILITY OF SUCH DAMAGE.

silly-kicks re-implements the algorithm with attribution; any modifications
to numerical constants, parameter handling, or aggregation logic are
documented in the source module docstring of silly_kicks.tracking._kernels.

A 30-line excerpt of the canonical source (``time_to_intercept`` +
``probability_to_intercept``) is also vendored at
``tests/_vendored/unravelsports_tti.py`` with the BSD-3-Clause license header
preserved verbatim. This excerpt is used solely by the golden-master parity
test (``tests/tracking/test_pressure_bekkers_golden_master.py``) on Python
versions where the live ``unravelsports`` package cannot be installed
(``unravelsports>=1.2`` requires Python 3.11+; silly-kicks targets >=3.10).
The vendored copy is test-only -- it is NOT exposed in the silly-kicks
runtime distribution and is not consumed by any silly-kicks public API.


Test Data Sources
-----------------

Test fixtures under tests/datasets/ are excluded from the published
silly-kicks wheel via the [tool.hatch.build.targets.wheel] packages
config in pyproject.toml. They exist solely to exercise the converter
pipelines in CI.

The IDSSE per-period orientation fixture
(tests/datasets/idsse/per_period_match.parquet, PR-S23 / silly-kicks
3.0.1) is derived from the DFL Bundesliga Match Tracking Data -- Open
Match Data 2022/23, published by Bassek et al. (2025) under CC-BY 4.0.
Match identifier: idsse_J03WMX (public DFL competition identifier --
no PII). Citation: Bassek, M., Skinner, J., Niemann, J., et al. (2025).
"An Open Bundesliga Match Tracking Dataset." DFL DataHub.

The Metrica per-period orientation fixture
(tests/datasets/metrica/per_period_match.parquet, PR-S23 / silly-kicks
3.0.1) is Sample Game 1 from
https://github.com/metrica-sports/sample-data, published under
CC-BY-NC-4.0 (same license as Sample Game 2 used by sample_match.parquet).
Coordinates were rescaled at extraction time from Metrica's native 0-1
normalised frame to the silly-kicks-input 0-105 / 0-68 frame.
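
The rescaling described above amounts to a per-axis multiplication. A minimal
sketch follows, assuming a plain linear scale with no axis flip; the function
and constant names are hypothetical, and the actual extraction script may
apply additional orientation handling.

```python
PITCH_LENGTH = 105.0  # upper bound of the x range in the target frame
PITCH_WIDTH = 68.0    # upper bound of the y range in the target frame


def rescale_normalised(x_norm, y_norm,
                       length=PITCH_LENGTH, width=PITCH_WIDTH):
    """Map a point from the 0-1 normalised frame to the 0-105 / 0-68 frame.

    Hypothetical helper: assumes both frames share origin and axis direction.
    """
    return x_norm * length, y_norm * width
```

For example, the normalised centre point (0.5, 0.5) maps to (52.5, 34.0).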

The existing IDSSE contract-test fixture
(tests/datasets/idsse/sample_match.parquet) is from DFL DataHub
free-sample data (non-commercial redistribution permitted). The
existing Metrica contract-test fixture
(tests/datasets/metrica/sample_match.parquet) is Sample Game 2 from
metrica-sports/sample-data (CC-BY-NC-4.0).

The kloppy test fixtures (tests/datasets/kloppy/*.xml + .json) are
vendored from kloppy under BSD-3-Clause. The metrica_events.json
within is originally from metrica-sports/sample-data Sample Game 2
(CC-BY-NC-4.0).
