Metadata-Version: 2.4
Name: dataview-tui
Version: 0.2.0
Summary: Keyboard-driven TUI for browsing columnar data files.
Project-URL: Homepage, https://github.com/lyne7-sc/dataview-tui
Project-URL: Issues, https://github.com/lyne7-sc/dataview-tui/issues
Project-URL: Source, https://github.com/lyne7-sc/dataview-tui
Author: lyne7-sc
Maintainer: lyne7-sc
License-Expression: MIT
License-File: LICENSE
Keywords: arrow,data-viewer,duckdb,parquet,textual,tui
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Environment :: Console :: Curses
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.10
Requires-Dist: duckdb>=0.9
Requires-Dist: fastavro>=1.9
Requires-Dist: pyarrow>=12
Requires-Dist: textual>=0.40
Requires-Dist: tree-sitter-sql>=0.3.11
Requires-Dist: tree-sitter>=0.25
Provides-Extra: dev
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: pytest-asyncio; extra == 'dev'
Requires-Dist: ruff; extra == 'dev'
Provides-Extra: lance
Requires-Dist: pylance>=0.20; extra == 'lance'
Description-Content-Type: text/markdown

# dataview-tui

`dataview-tui` is a keyboard-driven terminal UI for inspecting columnar data
files. It installs two commands, `dview` and `dataview`; the examples in this
README use `dview`.

The project supports Parquet, ORC, and Avro files, plus Lance datasets through
the optional `lance` extra. It is built for moving quickly between schema, data
preview, metadata, and SQL query views without leaving the terminal.

## Features

- Textual-based terminal application with a project-specific dark theme.
- Automatic format detection by extension and magic bytes.
- PyArrow-backed Parquet and ORC readers.
- fastavro-backed Avro reader.
- Optional Lance reader via the `lance` extra.
- Schema, Data, Metadata, and SQL sidebar views.
- Schema tree entries include field type and nullable state.
- Metadata view includes file info, row groups/fragments/blocks, and per-column details.
- DuckDB SQL engine exposing the opened file as `data` and `t`.
- CLI entry points: `dview` and `dataview`.
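
The format detection listed above can be pictured with a short sketch. This is an illustration under stated assumptions, not the project's actual reader code: `detect_format`, `MAGIC_PREFIXES`, and `EXTENSIONS` are hypothetical names, and the order of the checks (magic bytes first, extension as fallback) is assumed. The magic bytes themselves come from the Parquet, ORC, and Avro file format specifications. Lance datasets are treated as directories here, so they are matched by name only.

```python
from pathlib import Path

# Leading magic bytes defined by each file format's specification.
MAGIC_PREFIXES = {
    "parquet": b"PAR1",
    "orc": b"ORC",
    "avro": b"Obj\x01",
}

# Extension fallback; this mapping is an assumption made for the sketch.
EXTENSIONS = {
    ".parquet": "parquet",
    ".orc": "orc",
    ".avro": "avro",
    ".lance": "lance",
}


def detect_format(path):
    """Return a format name for `path`, or None if it is not recognised."""
    path = Path(path)
    # A directory has no magic bytes to read, so only its name is checked.
    if path.is_dir():
        return "lance" if path.suffix.lower() == ".lance" else None
    # Prefer file contents over the name: a renamed file keeps its magic bytes.
    with path.open("rb") as handle:
        head = handle.read(4)
    for name, magic in MAGIC_PREFIXES.items():
        if head.startswith(magic):
            return name
    # Fall back to the extension when the header is not recognised.
    return EXTENSIONS.get(path.suffix.lower())
```

Checking the contents first means a Parquet file saved as `export.bin` is still recognised, while the extension fallback covers cases where the header cannot identify the format.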

## Installation

```bash
pipx install dataview-tui
dview --version
```

For Lance datasets:

```bash
pipx install "dataview-tui[lance]"
```

To update an existing install:

```bash
pipx upgrade dataview-tui
```

To install the current main branch from source:

```bash
pipx install git+https://github.com/lyne7-sc/dataview-tui.git
```

## Quick Start

Open supported files or datasets:

```bash
dview path/to/file.parquet
dview path/to/file.orc
dview path/to/file.avro
dview path/to/dataset.lance
```

Choose an initial tab:

```bash
dview path/to/file.parquet --tab data
```

## Keyboard

| Key | Action |
| --- | --- |
| `q` | Quit |
| `1` / `2` / `3` | Show Schema / Data / Metadata tab |
| `]` / `[` | Next page / previous page |
| `Ctrl+F` | Filter columns by regex |
| `/` | Search cell contents |
| `n` / `Shift+N` | Next / previous match |
| `Enter` | Show full current cell value |
| `s` | Toggle SQL sidebar |
| `Ctrl+Enter` / `Ctrl+J` / `F5` | Run SQL |
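
As a rough illustration of what filtering columns by regex means, the sketch below keeps only the column names that match a pattern. It is not taken from the application: `filter_columns` is a hypothetical helper, and both the case-insensitive matching and the fallback for invalid patterns are assumptions.

```python
import re


def filter_columns(columns, pattern):
    """Return the column names that match `pattern`, in their original order."""
    try:
        regex = re.compile(pattern, re.IGNORECASE)
    except re.error:
        # A half-typed pattern such as "(" should not hide every column.
        return list(columns)
    # search() matches anywhere in the name; anchor with ^ or $ to be stricter.
    return [name for name in columns if regex.search(name)]


columns = ["user_id", "user_name", "created_at", "updated_at"]
print(filter_columns(columns, r"^user_"))  # ['user_id', 'user_name']
print(filter_columns(columns, r"_at$"))    # ['created_at', 'updated_at']
```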

## SQL

The SQL engine exposes the opened file as DuckDB views named `data` and `t`:

```sql
SELECT * FROM data LIMIT 20;
SELECT count(*) FROM t;
```

Query results are capped by the engine to keep the terminal responsive.
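
One way to picture a result cap is to fetch at most a fixed number of rows and report whether anything was left behind. The sketch below is an assumption-laden illustration, not the engine's implementation: `run_capped` and `ROW_CAP` are hypothetical, the real limit is not stated in this README, and the standard-library `sqlite3` module stands in for DuckDB so the example runs without extra installs. DuckDB's Python connection offers a comparable `execute` / `fetchmany` pair.

```python
import sqlite3

ROW_CAP = 1000  # hypothetical limit; the real cap is not documented here


def run_capped(connection, sql, cap=ROW_CAP):
    """Run `sql` and return (rows, truncated), keeping at most `cap` rows."""
    cursor = connection.execute(sql)
    # Ask for one extra row so a full page can be told apart from a cut-off one.
    rows = cursor.fetchmany(cap + 1)
    truncated = len(rows) > cap
    return rows[:cap], truncated


# sqlite3 is a stand-in for DuckDB in this sketch.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE source (id INTEGER)")
con.executemany("INSERT INTO source VALUES (?)", [(i,) for i in range(50)])
# Two views over the same rows, mirroring the `data` and `t` names.
con.execute("CREATE VIEW data AS SELECT * FROM source")
con.execute("CREATE VIEW t AS SELECT * FROM source")

rows, truncated = run_capped(con, "SELECT * FROM data", cap=20)
print(len(rows), truncated)                        # 20 True
print(run_capped(con, "SELECT count(*) FROM t"))   # ([(50,)], False)
```

Returning the `truncated` flag alongside the rows lets a UI tell the user that more rows exist instead of silently dropping them.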

## Terminal Setup

The app cannot force a terminal font. For a clearer table and SQL experience,
use a monospaced font with readable punctuation and digits, such as JetBrains
Mono or Maple Mono.

## Development

Install dependencies and run checks:

```bash
uv sync --extra dev
uv run pytest -q
uv run ruff check .
uv build
```

Project layout:

```text
src/dataview/
  app.py
  cli.py
  readers/
  widgets/
tests/
```

## Publishing

Releases are built by GitHub Actions when a GitHub Release is published. PyPI
publishing uses PyPI Trusted Publishing, so the PyPI project must be configured
with this repository and `.github/workflows/release.yml` before publishing a
release.

## Dependencies

Runtime:

- `pyarrow>=12`
- `textual>=0.40`
- `fastavro>=1.9`
- `duckdb>=0.9`
- `tree-sitter>=0.25`
- `tree-sitter-sql>=0.3.11`

Optional:

- `pylance>=0.20` via the `lance` extra

Development:

- `pytest`
- `pytest-asyncio`
- `ruff`

## License

MIT
