Metadata-Version: 2.1
Name: datapasta
Version: 0.1.2
Summary: Paste data as Python DataFrame definitions
Author-Email: Louis Maddox <louismmx@gmail.com>
License: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python
Project-URL: Bug Tracker, https://github.com/lmmx/datapasta/issues
Project-URL: Homepage, https://github.com/lmmx/datapasta
Requires-Python: >=3.10
Requires-Dist: cliptargets>=0.1.2
Requires-Dist: pyperclip
Provides-Extra: dev
Requires-Dist: pdm-bump>=0.9.10; extra == "dev"
Requires-Dist: pdm>=2.22.3; extra == "dev"
Requires-Dist: pre-commit>=4.1.0; extra == "dev"
Requires-Dist: pytest>=8.3.4; extra == "dev"
Provides-Extra: docs
Requires-Dist: livereload>=2.7.1; extra == "docs"
Requires-Dist: mkdocs-extra-sass-mathshim>=0.1.0; extra == "docs"
Requires-Dist: mkdocs-material[imaging,recommended]>=9.5.2; extra == "docs"
Requires-Dist: mkdocs-section-index>=0.3.8; extra == "docs"
Requires-Dist: mkdocs>=1.5.3; extra == "docs"
Requires-Dist: mkdocstrings[python]>=0.24.0; extra == "docs"
Requires-Dist: ruff>=0.9.5; extra == "docs"
Requires-Dist: urllib3<2; extra == "docs"
Provides-Extra: pandas
Requires-Dist: pandas>=2.2.3; extra == "pandas"
Provides-Extra: polars
Requires-Dist: polars>=1.21.0; extra == "polars"
Provides-Extra: polars-lts-cpu
Requires-Dist: polars-lts-cpu>=1.21.0; extra == "polars-lts-cpu"
Description-Content-Type: text/markdown

# Enhanced HTML Table Support

When the `cliptargets` package is installed, datapasta gains the ability to extract tables directly from HTML content in the clipboard. This is especially useful when copying tables from web pages, spreadsheets, or other applications that place HTML content in the clipboard.

## Installation with HTML Table Support

```bash
# cliptargets is a core dependency, so HTML table support comes with the base install
pip install datapasta

# To also install the DataFrame libraries, add the pandas and polars extras
pip install "datapasta[pandas,polars]"
```

## Benefits of HTML Table Support

- **Improved Header Detection**: Automatically detects table headers based on HTML structure (`<thead>` or `<th>` elements)
- **Better Structure Preservation**: Extracts data directly from HTML table structure, preserving rows and columns correctly
- **No Delimiter Issues**: Avoids problems with delimiter guessing and text parsing
- **Works with Web Content**: Ideal for copying tables from websites and web applications
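
To illustrate the delimiter problem, here is a minimal sketch of how naive whitespace splitting miscounts columns when a cell contains a space. The sample text and the splitting strategy are assumptions chosen for illustration; they are not datapasta's actual text parser, which may handle this case differently.

```python
# Plain-text copy of a two-column table whose columns are padded with spaces
# (hypothetical sample; real clipboard text often uses tabs instead).
text = (
    "Name                  Size\n"
    "wheels-linux-aarch64  4.2 MB\n"
    "wheels-linux-armv7    3.78 MB\n"
)

# Naive approach: split every line on runs of whitespace.
rows = [line.split() for line in text.splitlines()]

# The header has 2 fields, but "4.2 MB" is split in two, so data rows have 3.
field_counts = [len(row) for row in rows]
print(field_counts)  # [2, 3, 3]
```

Reading the cells from `<td>` elements sidesteps this, because the cell boundaries are explicit in the markup.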

## Using with HTML Tables

The API remains the same: datapasta automatically detects whether HTML content is available in the clipboard and uses it when appropriate:

```python
import datapasta

# Will automatically use HTML table content if available
code = datapasta.clipboard_with_targets_to_pandas()
print(code)
```

### GitHub Artifacts example

If you go to the GitHub Actions results summary page, you will see an HTML table of artifacts.
Copy it, and datapasta will generate the DataFrame code for you from the clipboard :magic_wand:

```
(datapasta) louis 🚶 ~/dev/datapasta $ datapasta --polars
import polars as pl

df = pl.DataFrame({
    'Name': ['wheels-linux-aarch64', 'wheels-linux-armv7', 'wheels-linux-ppc64le', 'wheels-linux-s390x'],
    'Size': ['4.2 MB', '3.78 MB', '4.63 MB', '5.5 MB'],
})
(datapasta) louis 🚶 ~/dev/datapasta $ python -ic "$(datapasta --polars)"
>>> print(df)
shape: (4, 2)
┌──────────────────────┬─────────┐
│ Name                 ┆ Size    │
│ ---                  ┆ ---     │
│ str                  ┆ str     │
╞══════════════════════╪═════════╡
│ wheels-linux-aarch64 ┆ 4.2 MB  │
│ wheels-linux-armv7   ┆ 3.78 MB │
│ wheels-linux-ppc64le ┆ 4.63 MB │
│ wheels-linux-s390x   ┆ 5.5 MB  │
└──────────────────────┴─────────┘
```

## Command Line Usage

```bash
# Automatically uses HTML table content if available
datapasta

# Force using legacy clipboard access (no HTML support)
datapasta --legacy
```

## How It Works

1. datapasta checks if the `cliptargets` package is available
2. If available, it looks for the `text/html` target in the clipboard
3. If HTML content is found, it extracts tables using a lightweight HTML parser
4. It detects headers based on HTML structure (`<thead>` or `<th>` elements)
5. If no HTML content is found or no tables are present, it falls back to text-based parsing

This feature is particularly useful when copying tables from web applications, where the HTML structure provides more reliable information about the table's layout and headers than plain text.
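
The extraction and header-detection steps above can be sketched with the standard library's `html.parser`. This is a minimal illustration under stated assumptions, not datapasta's own implementation: the names `_TableParser` and `table_from_html` are hypothetical, the HTML is passed in as a string rather than read from the clipboard, and rows from every table in the input are pooled together.

```python
from html.parser import HTMLParser


class _TableParser(HTMLParser):
    """Hypothetical helper: collect table rows and detect a header row."""

    def __init__(self):
        super().__init__()
        self.header = None      # header cells, or None if no header was detected
        self.rows = []          # data rows, each a list of cell strings
        self._in_thead = False  # currently inside <thead>?
        self._cells = None      # cells of the row being read
        self._text = None       # text fragments of the cell being read
        self._all_th = True     # has the current row used only <th> cells?

    def handle_starttag(self, tag, attrs):
        if tag == "thead":
            self._in_thead = True
        elif tag == "tr":
            self._cells, self._text, self._all_th = [], None, True
        elif tag in ("td", "th") and self._cells is not None:
            self._text = []
            if tag == "td":
                self._all_th = False

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "thead":
            self._in_thead = False
        elif tag in ("td", "th"):
            if self._text is not None and self._cells is not None:
                self._cells.append("".join(self._text).strip())
            self._text = None
        elif tag == "tr" and self._cells is not None:
            cells, self._cells = self._cells, None
            if not cells:
                return
            # Treat the first row as the header if it sits in <thead>
            # or consists only of <th> cells.
            first_row = self.header is None and not self.rows
            if first_row and (self._in_thead or self._all_th):
                self.header = cells
            else:
                self.rows.append(cells)


def table_from_html(html):
    """Return (header, rows), or None so the caller can fall back to text parsing."""
    parser = _TableParser()
    parser.feed(html)
    parser.close()
    if parser.header is None and not parser.rows:
        return None
    return parser.header, parser.rows


sample = (
    "<table><thead><tr><th>Name</th><th>Size</th></tr></thead>"
    "<tbody><tr><td>wheels-linux-aarch64</td><td>4.2 MB</td></tr></tbody></table>"
)
result = table_from_html(sample)
print(result)
```

Returning `None` when no table is found mirrors the fallback in step 5: the caller can then hand the plain-text clipboard content to the text-based parser instead.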
