Metadata-Version: 2.4
Name: msxlsx
Version: 0.2.1
Summary: Python tools for SpreadsheetML workbooks, including MS-XLSX extension namespaces
Author: Anthony Shaw
License-Expression: MIT
Keywords: xlsx,excel,ooxml,ms-xlsx,office,spreadsheetml
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Office/Business
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: lxml>=6.1.0
Requires-Dist: pydantic>=2.0
Provides-Extra: dev
Requires-Dist: pytest>=9.0.3; extra == "dev"
Requires-Dist: pytest-cov>=5.0; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"
Requires-Dist: hypothesis>=6.0; extra == "dev"
Dynamic: license-file

# msxlsx

## Overview

`msxlsx` is a Python package for working with SpreadsheetML `.xlsx` files, including the published [MS-XLSX specification](https://learn.microsoft.com/en-us/openspecs/office_standards/ms-xlsx/). It provides extension features that are not part of the base OOXML workbook model, including slicers, timelines, sparklines, dynamic arrays, and advanced conditional formatting.

Install: `pip install -e /path/to/msxlsx` (or `pip install msxlsx` when published)

## Quick Reference

| Task | Approach |
|------|----------|
| Create new workbook | `Workbook.new()` |
| Open existing workbook | `Workbook.open("file.xlsx")` |
| Add worksheet | `wb.add_worksheet("Sheet2")` |
| Get worksheet by name | `wb.get_worksheet("Sheet1")` |
| Get sheet names | `wb.get_sheet_names()` |
| Set cell value | `ws.set_cell(row, col, value)` |
| Get cell value | `ws.get_cell(row, col)` |
| Get row values | `ws.get_row(row)` |
| Iterate rows | `for row in ws.iter_rows():` |
| Set formula | `ws.set_cell(1, 1, "=SUM(A2:A10)")` |
| Style a cell | `ws.set_cell_style(1, 1, bold=True, color="FF0000")` |
| Merge cells | `ws.merge_cells(1, 1, 2, 3)` |
| Freeze panes | `ws.freeze_panes(2, 1)` |
| Add table | `ws.add_table("Table1", "A1:C10", ["Col1", "Col2", "Col3"])` |
| Add comment | `ws.add_comment(1, 1, "Note text")` |
| Get comments | `ws.get_comments()` |
| Add image | `ws.add_image(1, 1, "photo.png", width_inches=3.0, height_inches=2.0)` |
| Save | `wb.save("output.xlsx")` |

## Creating Workbooks

```python
from msxlsx import Workbook

wb = Workbook.new()
ws = wb.worksheets[0]  # Default "Sheet1"

# Set cell values
ws.set_cell(1, 1, "Name")
ws.set_cell(1, 2, "Score")
ws.set_cell(2, 1, "Alice")
ws.set_cell(2, 2, 95)
ws.set_cell(3, 1, "Bob")
ws.set_cell(3, 2, 87)
ws.set_cell(4, 2, "=AVERAGE(B2:B3)")

# Style header row
ws.set_cell_style(1, 1, bold=True, fill="4472C4", color="FFFFFF")
ws.set_cell_style(1, 2, bold=True, fill="4472C4", color="FFFFFF")

# Column widths
ws.set_column_width(1, 20)
ws.set_column_width(2, 15)

wb.save("report.xlsx")
```

## Reading Workbooks

```python
from msxlsx import Workbook

wb = Workbook.open("report.xlsx")
print(wb.get_sheet_names())         # ["Sheet1", "Sheet2"]
ws = wb.get_worksheet("Sheet1")

# Read cell values
name = ws.get_cell(2, 1)            # "Alice"
score = ws.get_cell(2, 2)           # 95
formula = ws.get_cell(4, 2)         # "=AVERAGE(B2:B3)"

# Read a full row
row = ws.get_row(1)                 # ["Name", "Score"]

# Iterate all data rows
for row in ws.iter_rows(min_row=2):
    print(row)

# Get used range boundaries
min_row, min_col, max_row, max_col = ws.get_used_range()
```

## Comments (Notes)

```python
# Add comments to cells
ws.add_comment(1, 1, "This is a header", author="Alice")
ws.add_comment(2, 2, "Check this value")  # default author: "msxlsx"

# Read all comments
for c in ws.get_comments():
    print(f"{c['ref']}: {c['text']} (by {c['author']})")
```

## MS-XLSX Extension Features

### Sparklines (x14 namespace)

```python
from msxlsx.oxml.ms_extensions import make_sparkline_group

sparkline = make_sparkline_group(
    spark_type="line",
    data_range="Sheet1!B2:F2",
    location_range="Sheet1!G2",
    color="4472C4",
)
```

### Slicers (x14 namespace)

```python
from msxlsx.oxml.ms_extensions import make_slicer

slicer = make_slicer(
    name="Region Slicer",
    table_name="SalesTable",
    column_name="Region",
)
```

### Dynamic Arrays (xr namespace)

```python
from msxlsx.oxml.ms_extensions import make_dynamic_array_properties

props = make_dynamic_array_properties("A1")
```

### Extended Data Bars (x14 namespace)

```python
from msxlsx.oxml.ms_extensions import make_data_bar_ext

data_bar = make_data_bar_ext(
    ref="B2:B10",
    min_val="autoMin",
    max_val="autoMax",
    color="638EC6",
)
```

### Timelines (x15 namespace)

```python
from msxlsx.oxml.ms_extensions import make_timeline

timeline = make_timeline(
    name="Date Timeline",
    source_name="OrderDate",
    level=2,  # months
)
```

### Images in Worksheets

```python
wb = Workbook.new()
ws = wb.worksheets[0]
ws.add_image(2, 2, "logo.png", width_inches=4.0, height_inches=3.0)
wb.save("report.xlsx")
```

### Image Effects — Brightness/Contrast (a14 namespace)

```python
from msxlsx.oxml.ms_extensions import add_image_effects

add_image_effects(blip, brightness=20000, contrast=40000)
```

### Decorative Images (adec namespace)

```python
from msxlsx.oxml.ms_extensions import mark_image_decorative

mark_image_decorative(pic_element)
```

## Critical Rules

1. **Shared strings** — String cell values MUST reference the shared string table (type="s"), not inline text.
2. **mc:Ignorable** — Root elements using extension namespaces must declare `mc:Ignorable="x14 x15 x16 xr xr6"`.
3. **Content types** — Only emit content type overrides for parts that actually exist in the package.
4. **Element ordering** — workbook.xml: fileVersion, sheets, calcPr. worksheet: sheetViews, sheetFormatPr, cols, sheetData, mergeCells, drawing, tableParts.
5. **Cell references** — Use A1 notation (column letter + row number). Row/col are 1-indexed.
6. **Column letters** — A=1, B=2, ..., Z=26, AA=27, AB=28, etc.

## Dependencies

- **lxml** >= 5.0 (XML processing)
- Python >= 3.10
