Metadata-Version: 2.4
Name: csvsight
Version: 0.1.0
Summary: Eyeball a CSV in seconds: rows × columns, per-column type, null %, ranges, and top values. Zero dependencies, no pandas.
Author: yyfjj
License: MIT
Project-URL: Homepage, https://github.com/jjdoor/csvsight-py
Project-URL: Repository, https://github.com/jjdoor/csvsight-py
Project-URL: Issues, https://github.com/jjdoor/csvsight-py/issues
Keywords: csv,csv-stats,data,profiling,data-quality,summary,inspect,eda,cli,devtools
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Utilities
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: license-file

# csvsight

**Eyeball a CSV in seconds.** Someone drops a CSV export on you and you just
want to know: how many rows, what columns, which ones are mostly empty, what's
the range of that amount field? `csvsight` prints exactly that. **Zero
dependencies — no pandas, no REPL.**

```bash
csvsight data.csv
```

```
data.csv — 10,432 rows × 5 columns  (comma, utf-8)

#  column   type    nulls         unique  detail
─  ───────  ──────  ────────────  ──────  ─────────────────────────────────────────
1  id       int     0 (0.0%)      10,432  min 1 · max 10432 · mean 5216.5
2  email    string  12 (0.1%)     10,411  e.g. "ada@example.com" · len 9–48
3  amount   float   34 (0.3%)     2,015   min 0.01 · max 9999 · mean 42.3
4  status   string  0 (0.0%)      3       active (61%) · churned (28%) · trial (11%)
5  country  string  120 (1.1%)    47      US (40%) · GB (12%) · DE (7%)
```

## Why

You don't need a DataFrame to answer "what's in this file?" — but the usual
tools make you spin one up anyway:

- **pandas** means `pip install pandas`, a Python session, and remembering the
  API for `.describe()` / `.isna().sum()` / `.nunique()`.
- **csvkit** is lovely but pulls in a handful of dependencies.
- **Excel** chokes on big files and isn't in your terminal.

`csvsight` is one command on a CSV. It auto-detects the delimiter, infers each
column's type, counts **10+ spellings of "missing"** (`NULL`, `N/A`, `nan`,
`-`, `none`, empty, …), and shows ranges for numbers and value distributions
for categorical columns.
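As a rough illustration of the two ideas above, the following standard-library sketch guesses a delimiter and counts "missing" cells. It is not csvsight's actual code: the `MISSING` set only contains the spellings quoted in this README, and `is_missing` and `sniff_delimiter` are hypothetical helper names.

```python
import csv
import io

# Assumed subset of "missing" spellings, taken from the examples in this README;
# the real list inside csvsight may be longer or differ.
MISSING = {"", "null", "n/a", "nan", "-", "none"}


def is_missing(value: str) -> bool:
    """Treat a cell as missing if it matches a known spelling, ignoring case and padding."""
    return value.strip().lower() in MISSING


def sniff_delimiter(sample: str, candidates: str = ",\t;|") -> str:
    """Guess the delimiter with the standard library's csv.Sniffer; fall back to a comma."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        return ","


text = "id;status\n1;active\n2;N/A\n3;\n4;NULL\n"
delimiter = sniff_delimiter(text)
rows = list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
null_count = sum(is_missing(row["status"]) for row in rows)
print(delimiter, null_count, len(rows))  # ; 3 4
```

Normalizing with `strip().lower()` before the lookup is what lets one small set cover variants such as `NULL`, `Null`, and ` nan `.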

## Usage

```bash
csvsight data.csv              # profile a file
cat data.csv | csvsight        # or read from stdin
csvsight data.tsv              # delimiter auto-detected (comma, tab, semicolon, pipe)
```

| Option | Description |
|---|---|
| `--delimiter <c>` | force the field delimiter |
| `--no-header` | treat the first row as data (columns named `col1`, `col2`, …) |
| `--top <n>` | top N values for categorical columns (default 3) |
| `--json` | emit the analysis as JSON instead of the table |
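If you want the analysis inside a Python script, one option is to shell out to the CLI with `--json` and parse the result. The sketch below uses only the options from the table above; `csvsight_command` and `profile` are hypothetical helper names, and it assumes the options can be combined and placed after the file argument. The JSON structure is not documented here, so the parsed value is returned as-is.

```python
import json
import subprocess
from typing import List, Optional


def csvsight_command(path: str, top: Optional[int] = None,
                     delimiter: Optional[str] = None,
                     no_header: bool = False) -> List[str]:
    """Build an argv list for csvsight using only the options from the table above."""
    cmd = ["csvsight", path, "--json"]
    if top is not None:
        cmd += ["--top", str(top)]
    if delimiter is not None:
        cmd += ["--delimiter", delimiter]
    if no_header:
        cmd.append("--no-header")
    return cmd


def profile(path: str, **options) -> object:
    """Run csvsight and parse its JSON output; requires csvsight on PATH."""
    result = subprocess.run(csvsight_command(path, **options),
                            capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

For example, `profile("data.csv", top=5)` would run `csvsight data.csv --json --top 5`, provided the command is installed.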

### What it reports per column

- **type** — `int` / `float` / `string` (inferred from the non-null values)
- **nulls** — count and percentage, recognizing many "missing" spellings
- **unique** — distinct non-null values
- **detail** — numbers get `min · max · mean`; low-cardinality columns get their
  value distribution; free-text columns get an example and length range
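The per-column logic above can be approximated in a few lines of plain Python. This is a simplified sketch rather than csvsight's implementation: `infer_type` and `detail` are hypothetical names, every non-numeric column is treated as categorical (no example or length range for free text), and Python's own `int()` / `float()` parsing rules stand in for whatever rules the tool applies.

```python
from collections import Counter
from typing import List


def infer_type(values: List[str]) -> str:
    """Return "int", "float", or "string" for a column's non-null values."""
    def all_parse(cast) -> bool:
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False

    if not values:
        return "string"  # assumption: an all-null column is reported as string
    if all_parse(int):
        return "int"
    if all_parse(float):
        return "float"
    return "string"


def detail(values: List[str], top: int = 3) -> str:
    """Summarize a column: min/max/mean for numbers, top values otherwise."""
    kind = infer_type(values)
    if kind in ("int", "float"):
        nums = [float(v) for v in values]
        return f"min {min(nums):g} · max {max(nums):g} · mean {sum(nums) / len(nums):g}"
    counts = Counter(values).most_common(top)
    return " · ".join(f"{v} ({100 * n / len(values):.0f}%)" for v, n in counts)


print(infer_type(["1", "2.5"]))               # float
print(detail(["1", "2", "3"]))                # min 1 · max 3 · mean 2
print(detail(["a", "a", "b", "c"], top=2))    # a (50%) · b (25%)
```

Trying `int` before `float` means a column is only promoted to `float` when at least one value needs it, and a single non-numeric value demotes the whole column to `string`.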

## Install

```bash
pip install csvsight        # Python >= 3.8
npx csvsight data.csv       # Node >= 18 (byte-for-byte port)
```

- PyPI: https://pypi.org/project/csvsight/
- npm: https://www.npmjs.com/package/csvsight
- GitHub (Python): https://github.com/jjdoor/csvsight-py
- GitHub (Node): https://github.com/jjdoor/csvsight

## License

MIT
