Metadata-Version: 2.4
Name: tab-cli
Version: 0.1.9
Summary: A CLI tool for tabular data
Author-email: Tongfei Chen <tongfei@pm.me>
License-File: LICENSE
Requires-Python: >=3.10
Requires-Dist: blobfile>=3.0
Requires-Dist: fsspec>=2025.1.0
Requires-Dist: jmespath>=1.0
Requires-Dist: loguru>=0.7.3
Requires-Dist: polars-fastavro>=0.5.1
Requires-Dist: polars>=1.0
Requires-Dist: pyarrow>=15.0
Requires-Dist: rich>=13.0
Requires-Dist: typer>=0.21.1
Provides-Extra: az
Requires-Dist: adlfs>=2025.1.0; extra == 'az'
Requires-Dist: azure-identity>=1.10.0; extra == 'az'
Provides-Extra: gs
Requires-Dist: gcsfs>=2025.1.0; extra == 'gs'
Requires-Dist: google-auth>=2.27.0; extra == 'gs'
Provides-Extra: s3
Requires-Dist: boto3>=1.28.0; extra == 's3'
Requires-Dist: s3fs>=2025.1.0; extra == 's3'
Description-Content-Type: text/markdown

# tab

[![pypi](https://img.shields.io/pypi/v/tab-cli)](https://pypi.org/project/tab-cli/)

A CLI tool for viewing, querying, and converting tabular data files.
Reads CSV, TSV, JSON Lines, Parquet, and Avro -- locally or from S3, GCS, and Azure Blob Storage.

```sh
pip install tab-cli
```

**Documentation**: [tongfei.me/tab](https://tongfei.me/tab)

---

## Quick look

### View any tabular file

```bash
tab view data.csv
```

<p align="center">
  <img src="https://raw.githubusercontent.com/ctongfei/tab/refs/heads/gh-pages/assets/test.svg" alt="tab view" width="680">
</p>

### Query with SQL

Inside the query, the input file is always available as a table named `t`:

```bash
tab view --sql 'SELECT * FROM t WHERE Metric_A_Value > 80' data.csv
```

<p align="center">
  <img src="https://raw.githubusercontent.com/ctongfei/tab/refs/heads/gh-pages/assets/test-where.svg" alt="tab view --sql" width="680">
</p>
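To make the query semantics concrete, here is a minimal sketch that runs the same `SELECT` against a table named `t`. It uses Python's standard-library `sqlite3` as a stand-in, not `tab`'s own SQL engine, which may differ in dialect details. The sample rows and the `Name` column are hypothetical; only `Metric_A_Value` appears in the example above.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; only the Metric_A_Value column name comes from the README.
SAMPLE_CSV = """Name,Metric_A_Value
alpha,72
beta,91
gamma,85
"""


def query_csv(csv_text, sql):
    """Load CSV text into an in-memory table named `t` and run `sql` against it."""
    rows = [
        (r["Name"], int(r["Metric_A_Value"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (Name TEXT, Metric_A_Value INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


# Same query text as in the `tab view --sql` example.
matches = query_csv(SAMPLE_CSV, "SELECT * FROM t WHERE Metric_A_Value > 80")
print(matches)
```

Only the rows whose `Metric_A_Value` exceeds 80 are returned; the row for `alpha` is filtered out.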

### Reshape rows with JMESPath

```bash
tab view --jp '{id: participant.id, city: profile.address.city}' data.parquet
```
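The expression above is a JMESPath multiselect hash: it builds a new object with the keys `id` and `city`, each filled from a nested path. As a rough illustration, the sketch below does the same thing in plain Python, assuming the expression is applied to each row independently. It does not use the `jmespath` library, and the helper names `get_path` and `reshape` as well as the sample rows are hypothetical.

```python
def get_path(record, dotted_path):
    """Follow a dotted path through nested dicts; return None if any step is missing."""
    current = record
    for key in dotted_path.split("."):
        if not isinstance(current, dict):
            return None
        current = current.get(key)
    return current


def reshape(record):
    """Plain-Python equivalent of '{id: participant.id, city: profile.address.city}'."""
    return {
        "id": get_path(record, "participant.id"),
        "city": get_path(record, "profile.address.city"),
    }


# Hypothetical nested rows, e.g. as they might be stored in a Parquet file.
rows = [
    {"participant": {"id": 1}, "profile": {"address": {"city": "Oslo"}}},
    {"participant": {"id": 2}, "profile": {}},
]
reshaped = [reshape(row) for row in rows]
print(reshaped)
```

As in JMESPath, a path that cannot be resolved yields a null value (`None` here) rather than an error, so the second row keeps its `id` and gets an empty `city`.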

### Convert between formats

```bash
tab convert data.csv data.parquet
tab convert data.parquet data.jsonl -o jsonl
tab convert data.csv output_dir/ -o parquet -n 4   # partitioned
```

### Concatenate files

```bash
tab cat part1.csv part2.csv part3.csv -o jsonl > combined.jsonl
```
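Conceptually, this appends the rows of each input in the order given and writes one JSON object per line. The following sketch shows that idea with the Python standard library only. It is not `tab`'s implementation: it keeps every value as a string, whereas a real tool would typically infer column types, and the function name `cat_csv_as_jsonl` and the sample inputs are hypothetical.

```python
import csv
import io
import json


def cat_csv_as_jsonl(csv_texts):
    """Concatenate the rows of several CSV documents into one JSON Lines string."""
    lines = []
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            # One JSON object per input row; values stay strings in this sketch.
            lines.append(json.dumps(row))
    return "\n".join(lines) + "\n"


# Hypothetical stand-ins for part1.csv and part2.csv.
part1 = "id,name\n1,a\n"
part2 = "id,name\n2,b\n3,c\n"

combined = cat_csv_as_jsonl([part1, part2])
print(combined, end="")
```

The header line of each part is consumed as column names, so it is not repeated in the combined output.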

### Inspect schema and summary

```bash
tab schema data.parquet
tab summary data.parquet
```

### Read from stdin

```bash
curl -s https://example.com/data.csv | tab view -i csv -
```

### Read from cloud storage

```bash
tab view s3://bucket/path/data.parquet
tab view gs://bucket/path/data.csv
tab view az://container/path/data.jsonl
```

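Cloud backends are optional. Install the extra for each provider you need:

```sh
pip install 'tab-cli[s3]'    # AWS S3
pip install 'tab-cli[gs]'    # Google Cloud Storage
pip install 'tab-cli[az]'    # Azure Blob Storage
pip install 'tab-cli[s3,gs,az]'    # several extras at once
```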

## Supported formats

- csv
- tsv
- jsonl
- parquet
- avro
