Metadata-Version: 2.4
Name: unxml-rs
Version: 1.2.0
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Rust
Classifier: Topic :: Text Processing :: Markup :: XML
Classifier: Topic :: Text Processing :: Markup :: HTML
Classifier: Topic :: Utilities
Summary: Pretty-print XML and HTML files in a light, YAML-like, readable format
Keywords: xml,html,yaml,pretty-print,cli,parser
Author-email: Ville Vainio <vivainio@gmail.com>
License: MIT
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/vivainio/unxml-rs
Project-URL: Repository, https://github.com/vivainio/unxml-rs

# Unxml

Simplify and "flatten" XML files into a YAML-like readable format.

This is a Rust clone of the original [unxml](https://github.com/vivainio/unxml) F# tool.

**[See it in action →](https://vivainio.github.io/unxml-demos/)** — a gallery of
real-world XML documents, schemas, stylesheets, and Schematron rules rendered
with `unxml`, with original-vs-rendered size comparisons.

## Installation

### Using uv (Easiest)

Install the published wheel from PyPI as a standalone tool:

```bash
uv tool install unxml-rs
```

This puts the `unxml` command on your PATH. To try it without installing anything:

```bash
uvx --from unxml-rs unxml <xml_file>
```

### Pre-built Binaries (Recommended)

Download the latest release for your platform from the [GitHub Releases](https://github.com/yourusername/unxml-rs/releases) page:

- **Linux (x86_64)**: `unxml-linux-x86_64.tar.gz`
- **Windows (x86_64)**: `unxml-windows-x86_64.zip`
- **macOS (Intel)**: `unxml-macos-x86_64.tar.gz`
- **macOS (Apple Silicon)**: `unxml-macos-arm64.tar.gz`

Extract the archive and place the `unxml` binary in your PATH.

### From Source

```bash
git clone https://github.com/yourusername/unxml-rs
cd unxml-rs
cargo install --path .
```

### Using Cargo

```bash
cargo install unxml
```

## Usage

```bash
unxml <xml_file>
```

By default files render as plain XML. Pass `--auto` to pick the processing mode
from each file's extension:

| Extension      | Mode applied   |
| -------------- | -------------- |
| `.xsl` `.xslt` | `--xslt`       |
| `.sch`         | `--schematron` |
| `.xsd`         | `--xsd`        |

An explicit mode flag (`--xslt`, `--schematron`, `--xsd`, `--special`) always
overrides autodetection.

Each mode rewrites its vocabulary into a terser pseudocode. The full set of
transformations, with side-by-side samples, is documented per format:

- [XSLT transformations](docs/xslt.md) — `xsl:*` stylesheets
- [XSD transformations](docs/xsd.md) — `xs:*` / `xsd:*` schemas
- [Schematron transformations](docs/schematron.md) — `.sch` rule schemas

### Syntax-highlighted output (`--bat`)

```bash
unxml --bat some.xsd      # implies --auto (detects --xsd), pipes through `bat -l unxml`
```

`--bat` renders the output through [`bat`](https://github.com/sharkdp/bat) using
the bundled `unxml` grammar (see `editor/`) for paged, colourised display. If
`bat` is not installed it falls back to plain stdout.

### Hiding noisy namespace prefixes (`--hide-ns`)

Vocabularies like UBL bury the signal under repeated prefixes (`cbc:`, `cac:`).
`--hide-ns` drops the named prefixes from element names — and their `xmlns:`
declarations — so the output reads as bare local names:

```bash
unxml --hide-ns cbc,cac invoice.xml   # repeatable and comma-separated
```

Signal-carrying prefixes you don't list (e.g. `ext:`, `bim:`) are kept, so an
extension subtree still stands out.

Under `--auto`/`--bat`, unxml also **sniffs** the document type and hides a
sensible set automatically. Currently it recognises UBL *instance* documents
(an unprefixed root such as `<Invoice>` in a UBL namespace) and hides whichever
prefixes are bound to the Common Basic/Aggregate Components namespaces. A
stylesheet or schema that merely *references* UBL (e.g. an `xsl:stylesheet`
translating to UBL) is left untouched, since there the prefixes are real syntax.

## Introduction

This command line application was developed for comparing XML files (e.g. database/application state dumps). It takes an XML file and converts it to a YAML-like syntax that is easier to read and compare.

### Example

Given this XML input:

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp2.1</TargetFramework>
    <PackAsTool>true</PackAsTool>
    <Description>Unxml 'pretty-prints' xml files in light, yamly, readable format</Description>
    <PackageVersion>1.0.0</PackageVersion>
  </PropertyGroup>
  <ItemGroup>
    <Compile Include="FileSystemHelper.fs"/>
    <Compile Include="MutableCol.fs"/>
    <Compile Include="Program.fs" />
  </ItemGroup>
</Project>
```

The output would be:

```
Project
  [Sdk]: Microsoft.NET.Sdk
  PropertyGroup
    OutputType = Exe
    TargetFramework = netcoreapp2.1
    PackAsTool = true
    Description = Unxml 'pretty-prints' xml files in light, yamly, readable format
    PackageVersion = 1.0.0
  ItemGroup
    Compile
      [Include]: FileSystemHelper.fs
    Compile
      [Include]: MutableCol.fs
    Compile
      [Include]: Program.fs
```

### Key Features

- **Attributes in Square Brackets**: Element attributes are displayed as `[attribute]: value`
- **Text Content with Equals**: Element text content is shown as `ElementName = text content`
- **Hierarchical Indentation**: Nested elements are properly indented
- **Clean Format**: Easy to read and compare, great for diffing

## Technical Details

- Built with Rust for performance and safety
- Uses `quick-xml` for fast XML parsing
- Uses `clap` for command-line argument parsing
- Proper error handling with `anyhow`

## License

MIT License - see LICENSE file for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

### Creating Releases

To create a new release:

1. Update the version in `Cargo.toml` and `src/main.rs`
2. Commit the version changes
3. Create and push a version tag:
   ```bash
   git tag v1.0.1
   git push origin v1.0.1
   ```
4. The GitHub Actions workflow will automatically build binaries for all platforms and create a release

The CI workflow runs on every push to ensure code quality with formatting checks, linting, and tests. 
