Metadata-Version: 2.4
Name: bullshit-detector
Version: 0.1.0
Summary: Statistical detection tools for screening published research
Author-email: Matteo Niccoli <matteo@mycarta.ca>
License: Apache-2.0
Project-URL: Homepage, https://github.com/mycarta/bullshit-detector
Project-URL: Documentation, https://github.com/mycarta/bullshit-detector/blob/main/README.md
Project-URL: Repository, https://github.com/mycarta/bullshit-detector
Project-URL: Issues, https://github.com/mycarta/bullshit-detector/issues
Keywords: statistics,research,reproducibility,p-hacking,spurious-correlation,GRIM,GRIMMER,peer-review
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: scipy>=1.10
Requires-Dist: numpy>=1.24
Requires-Dist: grim>=0.1.8
Requires-Dist: dcor>=0.6
Requires-Dist: requests>=2.28
Provides-Extra: full
Requires-Dist: statsmodels>=0.14; extra == "full"
Requires-Dist: pandas>=2.0; extra == "full"
Requires-Dist: seaborn>=0.12; extra == "full"
Requires-Dist: scikit-learn>=1.3; extra == "full"
Provides-Extra: batch
Requires-Dist: statcheck>=0.1; extra == "batch"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Dynamic: license-file

# bullshit-detector

Statistical detection tools for screening published research.

> "Bullshit is unavoidable whenever circumstances require someone to talk without knowing what they are talking about." — Harry Frankfurt, *On Bullshit* (2005)

## What this is

A Python toolkit for systematically screening research papers for statistical red flags. It is organized in four tiers, from quick API lookups to in-depth data analysis. It grew out of petroleum geoscience applications but applies to any field that reports correlations, p-values, and sample sizes.

| Tier | What it checks | What you need | Time |
|------|---------------|---------------|------|
| **0 — Paper screening** | Journal legitimacy, retractions, author credentials | DOI, journal name | Minutes |
| **1 — Arithmetic** | p-value consistency, GRIM/GRIMMER tests | Reported statistics | Minutes |
| **2 — Plausibility** | Spurious correlations, critical r, confidence intervals | Summary stats (r, n, k) | Minutes |
| **3 — Data analysis** | Outlier leverage, distance correlation, reproducibility | Raw/digitized data | Hours |

## Installation

```bash
pip install bullshit-detector          # Core (Tiers 0–2)
pip install bullshit-detector[full]    # + Tier 3 tools (statsmodels, pandas, seaborn, scikit-learn)
pip install bullshit-detector[batch]   # + statcheck for PDF batch scanning (GPL-3.0)
pip install bullshit-detector[dev]     # + pytest for development
```

## Quick Start

### Is the reported p-value correct?
```python
from bullshit_detector.p_checker import check_p_value

check_p_value("t", 2.20, 28, reported_p=0.04)
# {'computed_p': 0.0362254847788378, 'reported_p': 0.04,
#  'consistent': True, 'decision_error': False, ...}
```

Reported p=0.04 is consistent with computed p=0.036 — within rounding tolerance.
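
For readers who want to see what such a check involves, here is a minimal standard-library sketch. It is not the package's implementation: `t_two_tailed_p` and `check_reported_p` are hypothetical names, the p-value is assumed to be two-tailed, "consistent" is assumed to mean that the recomputed value rounds to the reported one at the reported precision, and a decision error is assumed to mean an inconsistency that flips significance at `alpha`. With SciPy installed, `2 * scipy.stats.t.sf(abs(t), df)` would replace the numerical integration.

```python
import math

def t_two_tailed_p(t_stat: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p-value for Student's t, by Simpson's rule on the density."""
    # Normalising constant of the t density with df degrees of freedom
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    c = math.exp(log_c)

    def pdf(x: float) -> float:
        return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

    # Integrate the density from 0 to |t|; steps must be even for Simpson's rule
    b = abs(t_stat)
    h = b / steps
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 1.0 - 2.0 * (total * h / 3.0)

def check_reported_p(t_stat, df, reported_p, decimals=2, alpha=0.05):
    """Compare a reported p-value with the one recomputed from t and df."""
    computed = t_two_tailed_p(t_stat, df)
    # Assumption: consistent if the computed value rounds to the reported one
    consistent = abs(computed - reported_p) <= 0.5 * 10 ** -decimals + 1e-12
    # Assumption: decision error = inconsistent AND on opposite sides of alpha
    flips = (computed < alpha) != (reported_p < alpha)
    return {"computed_p": computed, "consistent": consistent,
            "decision_error": (not consistent) and flips}

result = check_reported_p(2.20, 28, reported_p=0.04)
print(result)
```

The `decimals` argument matters: a p-value reported as 0.04 allows any recomputed value that rounds to 0.04, whereas one reported as 0.040 would not be consistent with 0.036.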

### Could this correlation be spurious?
```python
from bullshit_detector.spurious import P_spurious

P_spurious(0.60, 5, 10)
# 0.9649622440458044
```

With 5 wells and 10 candidate attributes, there is a 96.5% chance that at least one attribute shows a correlation of magnitude 0.60 or larger purely by chance.
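
The number above can be reproduced with a short standard-library sketch. It assumes the probability takes the form 1 - (1 - p)^k, where p is the two-tailed probability of a sample correlation at least as large as |r| when the true correlation is zero, and k is the number of independent attributes tested. The function names are hypothetical, and the package's own `P_spurious` may be implemented differently.

```python
import math

def p_two_tailed_r(r: float, n: int, steps: int = 2000) -> float:
    """P(|R| >= |r|) for n samples when the true correlation is zero."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and |r| < 1")
    # Null density of the sample correlation:
    #   f(x) = (1 - x^2)^((n - 4) / 2) / B(1/2, (n - 2) / 2)
    log_beta = (math.lgamma(0.5) + math.lgamma((n - 2) / 2)
                - math.lgamma((n - 1) / 2))
    norm = math.exp(-log_beta)

    def pdf(x: float) -> float:
        return norm * (1.0 - x * x) ** ((n - 4) / 2)

    # Simpson's rule on [0, |r|], where the density is smooth (steps even)
    b = abs(r)
    h = b / steps
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 1.0 - 2.0 * (total * h / 3.0)

def p_spurious_sketch(r: float, n: int, k: int) -> float:
    """Chance that at least one of k unrelated attributes reaches |r|."""
    p = p_two_tailed_r(r, n)
    return 1.0 - (1.0 - p) ** k

print(round(p_spurious_sketch(0.60, 5, 10), 4))
```

Under this assumed form, a single attribute already has roughly a 28% chance of reaching |r| = 0.60 with only 5 samples, and testing 10 attributes compounds that risk.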

### Has this paper been retracted?
```python
from bullshit_detector.paper_screening import check_retraction

check_retraction("10.2147/DMSO.S27665")
# {'retracted': True, 'corrections': [], 'pubpeer_comments': 0,
#  'pubpeer_url': 'https://pubpeer.com/publications/10.2147-DMSO.S27665'}
```

The green coffee extract paper (Vinson et al. 2012) was retracted in 2014.

## Intellectual foundations

This project stands on the shoulders of:

- **Carl T. Bergstrom & Jevin D. West** — *Calling Bullshit: The Art of Skepticism in a Data-Driven World* (Random House, 2020) and their [University of Washington course](https://callingbullshit.org/). The paper-level screening module (Tier 0) directly implements their legitimacy framework.
- **Harry Frankfurt** — *On Bullshit* (Princeton University Press, 2005). Established the philosophical foundation.
- **C.T. Kalkomey** — "Potential risks when using seismic attributes as predictors of reservoir properties" (*The Leading Edge*, 1997). The spurious correlation probability formula.
- **N.J.L. Brown & J.A.J. Heathers** — "The GRIM Test" (*SPPS*, 2017). Arithmetic consistency checking for means.
- **Aurélien Allard** — "Analytic-GRIMMER" (2018). Extended GRIM to standard deviations. The Python port in this package is the first on PyPI.
- **Kristin Sainani** — "How to Be a Statistical Detective" (*PM&R*, 2020). Pedagogical framework tying these tools together.
- **Thomas Speidel** — GeoConvention 2018 R notebook. Variable selection methods (redundancy analysis, LASSO, sparse PCA, power analysis) on the Hunt dataset, complementing Niccoli's Python implementations. Inspired the redundancy and power modules.
- **Michèle Nuijten et al.** — statcheck (*Behavior Research Methods*, 2016). P-value recomputation methodology.
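
The GRIM idea credited above can be sketched in a few lines. This is a standalone illustration, not the API of this package or of its `grim` dependency; `grim_consistent` and its parameters are hypothetical names. It assumes the underlying data are integers (for example Likert responses) and that the mean is reported to `decimals` places.

```python
import math

def grim_consistent(mean: float, n: int, decimals: int = 2) -> bool:
    """Return True if some integer total over n items rounds to the reported mean."""
    # Largest rounding error a correctly reported mean can carry
    half_unit = 0.5 * 10 ** -decimals
    approx_total = mean * n
    # Only the integers on either side of mean * n can be closest to it
    for total in (math.floor(approx_total), math.ceil(approx_total)):
        if abs(total / n - mean) <= half_unit + 1e-9:
            return True
    return False

print(grim_consistent(5.18, 28))  # 145 / 28 = 5.1786, which rounds to 5.18
print(grim_consistent(5.19, 28))  # neither 145 / 28 nor 146 / 28 rounds to 5.19
```

The check is only informative when `n` is small relative to the reporting precision. With two decimals and more than 100 integer observations, every reported mean is attainable.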

## For AI assistants

The `skills/` directory contains detection heuristics and decision trees for each module. If you're a coding assistant (Copilot, Claude Code, etc.), read `skills/OVERVIEW.md` first.

## Acknowledgments

**Kristin Sainani** — her paper "How to Be a Statistical Detective" (*PM&R*, 2020, 12(2):211–215, DOI: [10.1002/pmrj.12305](https://doi.org/10.1002/pmrj.12305)) inspired the Tier 1 arithmetic consistency approach and the overall "statistical detective" framing of this package. The `p_checker` module's pedagogical structure follows her framework of treating statistical anomalies as clues that warrant further investigation. More broadly, her detective framing runs through the entire package lineage: it directly inspired Matteo Niccoli's *"Be a geoscience and data science detective"* project ([MyCarta blog](https://mycartablog.com), [GitHub repo](https://github.com/mycarta/Be-a-geoscience-detective), TRANSFORM 2021 lightning talk), which is the primary source for the Tier 3 modules (`leverage.py`, `reproducibility.py`).

**Thomas Speidel** — his GeoConvention 2018 R notebook, *Data Science Tools for Petroleum Exploration and Production*, provided the methodology for the power analysis and redundancy modules (Tier 2). The original GeoConvention 2018 presentation was a collaboration between Matteo Niccoli and Thomas Speidel; Speidel's R implementations of power analysis and variable redundancy (`Hmisc::redun`), applied to the Hunt (2013) 21-well dataset, were translated into the Python `power` and `redundancy` modules in this package.

## License

Apache-2.0. The optional `statcheck` dependency is GPL-3.0.
