Metadata-Version: 2.4
Name: wssweep
Version: 0.1.0
Summary: Zero-config whitespace doctor: find (and --fix) trailing whitespace, mixed line endings, missing/extra final newlines, BOMs and mixed indentation. Zero dependencies.
Author: yyfjj
License: MIT
Project-URL: Homepage, https://github.com/jjdoor/wssweep-py
Project-URL: Repository, https://github.com/jjdoor/wssweep-py
Project-URL: Issues, https://github.com/jjdoor/wssweep-py/issues
Keywords: whitespace,trailing-whitespace,line-endings,crlf,eol,lint,formatter,cli,ci
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Utilities
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: license-file

# wssweep

**A zero-config whitespace doctor.** Run it on any repo and it instantly finds —
and with `--fix`, cleans — the whitespace problems that pollute diffs and break
across platforms: trailing whitespace, mixed CRLF/LF line endings, a missing
final newline, extra trailing blank lines, a UTF-8 BOM, lone CRs, and tabs mixed
with spaces in indentation.

```bash
pipx run wssweep
#   config.yml  (1)
#        -  mixed-eol  mixed line endings (CRLF×3, LF×1)
#
#   src/app.js  (2)
#        -  missing-final-newline  no newline at end of file
#       14: trailing-whitespace  trailing whitespace
#
#   ✖ 3 whitespace issues in 2 files  (missing-final-newline=1, mixed-eol=1, trailing-whitespace=1)

pipx run wssweep --fix   # clean them in place
```

No config file, no framework. Exits non-zero when it finds issues, so it drops
straight into CI. Pure standard library. Also on npm (`npx wssweep`) — the two
builds produce **byte-for-byte identical** output *and* identical fixes.

## Why another whitespace tool?

Because today this takes three or four tools wired together:

- **editorconfig-checker** reports, but needs you to author an `.editorconfig`
  first, and it can't fix anything.
- **pre-commit**'s `trailing-whitespace` / `end-of-file-fixer` /
  `mixed-line-ending` hooks *do* fix — but only inside the pre-commit framework,
  and they're three separate hooks. Nobody runs them ad-hoc on a fresh checkout.
- **prettier** fixes whitespace only as a side effect of reformatting all your
  code, is language-aware, and won't touch files it can't parse.
- **dos2unix** only does line endings.

`wssweep` is the one command — `pip`/`npx`, zero config — that reports *all*
seven whitespace smells at once with line numbers and optionally fixes them in
place, with a clean CI exit code, identical on Python and Node.

## What it checks

| check | what | `--fix` |
|---|---|---|
| `trailing-whitespace` | space/tab at end of a line | trims it |
| `mixed-eol` | a file containing **both** CRLF and LF | normalizes to LF |
| `lone-cr` | a bare CR (old-Mac line ending) | normalizes |
| `missing-final-newline` | non-empty file not ending in a newline | appends one |
| `trailing-blank-lines` | extra blank line(s) at end of file | collapses to one |
| `utf8-bom` | a leading UTF-8 BOM | strips it |
| `mixed-indentation` | tabs **and** spaces in one indent | report-only (needs your tab width) |
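
As a rough illustration of the `mixed-indentation` row, the sketch below flags
lines whose leading whitespace contains both a tab and a space. The function
name `mixed_indent_lines` is hypothetical and this is not wssweep's actual
code, only one plausible reading of "tabs and spaces in one indent".

```python
import re

# Split on CRLF first so a CRLF pair is not treated as two terminators.
_EOL = re.compile(rb"\r\n|\r|\n")

def mixed_indent_lines(data: bytes) -> list:
    """Return 1-based numbers of lines whose indent mixes tabs and spaces."""
    hits = []
    for number, line in enumerate(_EOL.split(data), start=1):
        # The indent is the leading run of space/tab bytes only.
        indent = line[: len(line) - len(line.lstrip(b" \t"))]
        if b" " in indent and b"\t" in indent:
            hits.append(number)
    return hits

print(mixed_indent_lines(b"\tok\n \tbad\n    fine\n\t bad2"))  # [2, 4]
```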

Opinionated, zero-config defaults: a consistently-CRLF file is **fine** (only
*mixed* endings are flagged), `.bat`/`.cmd` keep CRLF when fixed, and Markdown's
two-trailing-spaces hard line break is preserved (trailing-whitespace is skipped
in `.md`).
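
The "only *mixed* endings are flagged" rule can be sketched as follows. This is
a minimal illustration that counts terminators directly in the raw bytes; the
helper names `count_eols` and `is_mixed_eol` are assumptions, not part of
wssweep's interface.

```python
import re

# CRLF is listed first so its CR and LF are not counted separately.
_EOL = re.compile(rb"\r\n|\r|\n")

def count_eols(data: bytes) -> dict:
    """Count CRLF, LF and lone-CR terminators in raw bytes."""
    counts = {"crlf": 0, "lf": 0, "cr": 0}
    for match in _EOL.finditer(data):
        eol = match.group()
        if eol == b"\r\n":
            counts["crlf"] += 1
        elif eol == b"\n":
            counts["lf"] += 1
        else:
            counts["cr"] += 1
    return counts

def is_mixed_eol(data: bytes) -> bool:
    """True only when the same file contains both CRLF and LF."""
    counts = count_eols(data)
    return counts["crlf"] > 0 and counts["lf"] > 0

sample = b"a\r\nb\r\nc\r\nd\n"
print(count_eols(sample))            # {'crlf': 3, 'lf': 1, 'cr': 0}
print(is_mixed_eol(sample))          # True
print(is_mixed_eol(b"a\r\nb\r\n"))   # False: consistently CRLF is fine
```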

## Usage

```bash
wssweep                       # scan the current directory
wssweep src/ docs/            # scan specific paths
wssweep --fix                 # fix in place (atomic; only files that change)
wssweep --crlf --fix          # normalize endings to CRLF instead of LF
wssweep --skip=mixed-indentation   # turn off a check
wssweep --exclude='*.min.js'  # skip paths by glob (repeatable)
wssweep --json                # machine output (byte-identical both builds)
```

`.git`, `node_modules`, `dist`, `build`, `vendor`, `.venv` and friends are
skipped by default, as are binary files (detected by extension + a NUL-byte /
non-UTF-8 content check) and files over 5 MB. `--all` overrides those skips.
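
The content half of that binary check might look roughly like this. It is a
sketch under stated assumptions: `looks_binary` is a hypothetical name, it
inspects the whole buffer rather than a sample, and the extension list and the
5 MB limit are left out.

```python
def looks_binary(data: bytes) -> bool:
    """Heuristic: a NUL byte or invalid UTF-8 marks the content as binary."""
    if b"\x00" in data:
        return True
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False

print(looks_binary(b"plain text\n"))   # False
print(looks_binary(b"\x89PNG\r\n"))    # True: 0x89 is not valid UTF-8 here
```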

Exit codes: `0` clean · `1` issues found · `2` error. (With `--fix`, the exit
code is `0` once every reported issue has been fixed; a leftover
`mixed-indentation`, which is report-only, keeps it `1`.)
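
For pipelines that drive checks from Python, the exit codes above can be mapped
to messages as in this sketch. It assumes `wssweep` is on `PATH`; the helper
names `describe_exit` and `run_wssweep` are hypothetical.

```python
import subprocess

def describe_exit(code: int) -> str:
    """Translate a wssweep exit code into the meaning documented above."""
    meanings = {0: "clean", 1: "issues found", 2: "error"}
    return meanings.get(code, f"unexpected exit code {code}")

def run_wssweep(paths):
    """Run wssweep on the given paths and return its exit code."""
    # Assumes the wssweep executable is installed and on PATH.
    return subprocess.run(["wssweep", *paths]).returncode

# Example use in a CI script:
#     code = run_wssweep(["src/", "docs/"])
#     print(describe_exit(code))
#     raise SystemExit(code)
```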

## How it works

It reads every file as **raw bytes** and scans a byte-faithful (latin-1) view, so
it never mangles encodings and the Python and Node builds agree to the byte.
Line endings are classified from the bytes, never with `splitlines`, which in
Python also splits on vertical tab, form feed, `\x1c` to `\x1e` and `\x85`.
"Whitespace" means exactly space and tab, never `\s`, whose definition differs
across languages. `--fix` writes raw bytes atomically (temp file + rename),
touching only files that actually change and preserving file modes. Fixing is
idempotent: run it twice and the second run does nothing.
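
The approach described above can be sketched for a single check, trimming
trailing whitespace. This is an illustration under assumptions, not the
project's implementation: `trim_trailing`, `write_atomic` and `fix_file` are
hypothetical names, and only one of the seven checks is shown.

```python
import os
import re
import stat
import tempfile

# Exactly space and tab before a terminator or end of data; deliberately not \s.
_TRAILING = re.compile(r"[ \t]+(?=\r\n|\r|\n|\Z)")

def trim_trailing(data: bytes) -> bytes:
    """Trim trailing space/tab via a lossless latin-1 round trip."""
    text = data.decode("latin-1")   # every byte maps to exactly one code point
    return _TRAILING.sub("", text).encode("latin-1")

def write_atomic(path: str, data: bytes) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.chmod(tmp, mode)         # keep the original permission bits
        os.replace(tmp, path)       # atomic when source and target share a filesystem
    except BaseException:
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise

def fix_file(path: str) -> bool:
    """Return True if the file changed; unchanged files are never rewritten."""
    with open(path, "rb") as handle:
        original = handle.read()
    fixed = trim_trailing(original)
    if fixed == original:
        return False
    write_atomic(path, fixed)
    return True
```

Calling `fix_file` twice on the same path returns `True` and then `False`,
which is the idempotence property in miniature. Bytes that are not valid UTF-8
pass through untouched because latin-1 decoding never fails.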

## Install

```bash
pip install wssweep  # or pipx run wssweep
npm i -g wssweep     # Node build, identical behaviour
```

Python ≥ 3.8 or Node ≥ 18. No dependencies.

## License

MIT
