Metadata-Version: 2.4
Name: DMeta
Version: 0.4
Summary: Removing microsoft office files' metadata
Home-page: https://github.com/openscilab/dmeta
Download-URL: https://github.com/openscilab/dmeta/tarball/v0.4
Author: DMeta Development Team
Author-email: dmeta@openscilab.com
License: MIT
Project-URL: Source, https://github.com/openscilab/dmeta
Keywords: python3 python metadata remove
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: End Users/Desktop
Classifier: Intended Audience :: Manufacturing
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Security
Classifier: Topic :: Utilities
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: AUTHORS.md
Requires-Dist: art>=1.8
Requires-Dist: defusedxml>=0.7.1
Requires-Dist: lxml>=5.2.2
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: download-url
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: project-url
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary


<div align="center">
    <img src="https://github.com/openscilab/dmeta/raw/main/otherfiles/logo.png" width="280" height="400">
    <br/>
    <br/>
    <a href="https://codecov.io/gh/openscilab/dmeta"><img src="https://codecov.io/gh/openscilab/dmeta/branch/dev/graph/badge.svg" alt="Codecov"></a>
    <a href="https://badge.fury.io/py/dmeta"><img src="https://badge.fury.io/py/dmeta.svg" alt="PyPI version" height="18"></a>
    <a href="https://www.python.org/"><img src="https://img.shields.io/badge/built%20with-Python3-green.svg" alt="built with Python3"></a>
    <a href="https://discord.gg/626twyuPZG"><img src="https://img.shields.io/discord/1064533716615049236.svg" alt="Discord Channel"></a>
</div>

----------

## Overview
<p align="justify">
DMeta is an open-source Python package that removes metadata from Microsoft Office files.
</p>
<table>
    <tr>
        <td align="center">PyPI Counter</td>
        <td align="center">
            <a href="https://pepy.tech/projects/dmeta">
                <img src="https://static.pepy.tech/badge/dmeta" alt="PyPI Downloads">
            </a>
        </td>
    </tr>
    <tr>
        <td align="center">GitHub Stars</td>
        <td align="center">
            <a href="https://github.com/openscilab/dmeta">
                <img src="https://img.shields.io/github/stars/openscilab/dmeta.svg?style=social&label=Stars">
            </a>
        </td>
    </tr>
</table>
<table>
    <tr> 
        <td align="center">Branch</td>
        <td align="center">main</td>
        <td align="center">dev</td>
    </tr>
    <tr>
        <td align="center">CI</td>
        <td align="center">
            <img src="https://github.com/openscilab/dmeta/actions/workflows/test.yml/badge.svg?branch=main">
        </td>
        <td align="center">
            <img src="https://github.com/openscilab/dmeta/actions/workflows/test.yml/badge.svg?branch=dev">
        </td>
    </tr>
</table>


## Installation

### PyPI

- Check [Python Packaging User Guide](https://packaging.python.org/installing/)
- Run `pip install dmeta==0.4`
### Source code
- Download [Version 0.4](https://github.com/openscilab/dmeta/archive/v0.4.zip) or [Latest Source](https://github.com/openscilab/dmeta/archive/dev.zip)
- Run `pip install .`

## Usage
### In Python
⚠️ Use `in_place=True` to apply the changes directly to the original file.

⚠️ The `in_place` flag is `False` by default.

#### Clear metadata for a .docx file in place
```python
import os
from dmeta.functions import clear

DOCX_FILE_PATH = os.path.join(os.getcwd(), "sample.docx")
clear(DOCX_FILE_PATH, in_place=True)
```
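
To check what a file carries before and after clearing, it can help to look inside it directly. The sketch below is not part of DMeta's API: `read_core_properties` is a hypothetical helper that relies only on the fact that Office Open XML files are ZIP archives whose core properties (author, title, and so on) normally live in `docProps/core.xml`. Which fields DMeta actually clears is not specified here, so treat this as an inspection aid only.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_core_properties(path):
    """Return {local_tag_name: text} for each non-empty element in docProps/core.xml.

    Hypothetical helper for illustration; it does not call DMeta.
    """
    with zipfile.ZipFile(path) as archive:
        try:
            raw = archive.read("docProps/core.xml")
        except KeyError:
            # The archive has no core-properties part at all.
            return {}
    root = ET.fromstring(raw)
    properties = {}
    for element in root:
        # Tags look like "{namespace}creator"; keep only the local name.
        name = element.tag.rsplit("}", 1)[-1]
        if element.text and element.text.strip():
            properties[name] = element.text.strip()
    return properties
```

Calling `read_core_properties("sample.docx")` before and after `clear(...)` gives a quick view of the remaining non-empty fields. The sketch ignores other parts such as `docProps/app.xml`, and for untrusted files a hardened parser such as `defusedxml` would be a safer choice than the standard library one.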
#### Clear metadata for all existing Microsoft Office files (.docx|.pptx|.xlsx) in the current directory
```python
from dmeta.functions import clear_all
clear_all()
```
#### Update metadata for a .pptx file in place
```python
import os
from dmeta.functions import update

CONFIG_FILE_PATH = os.path.join(os.getcwd(), "config.json")
PPTX_FILE_PATH = os.path.join(os.getcwd(), "sample.pptx")
update(CONFIG_FILE_PATH, PPTX_FILE_PATH, in_place=True)
```
#### Update metadata for all existing Microsoft Office files (.docx|.pptx|.xlsx) in the current directory
```python
import os
from dmeta.functions import update_all

CONFIG_FILE_PATH = os.path.join(os.getcwd(), "config.json") 
update_all(CONFIG_FILE_PATH)
```

### CLI
⚠️ You can use `dmeta` or `python -m dmeta` to run this program.

⚠️ Use `--inplace` to apply the changes directly to the original file.


#### Clear metadata for a .docx file in place
```console
dmeta --clear "./test_a.docx" --inplace
```
#### Clear metadata for all existing Microsoft Office files (.docx|.pptx|.xlsx) in the current directory
```console
dmeta --clear-all
```
#### Update metadata for an .xlsx file in place
```console
dmeta --update "./test_a.xlsx" --config "./config.json" --inplace
```
#### Update metadata for all existing Microsoft Office files (.docx|.pptx|.xlsx) in the current directory
```console
dmeta --update-all --config "./config.json"
```
#### Version
```console
dmeta -v
dmeta --version
```
#### Info
```console
dmeta --info
```
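
The clear command from this section can also be launched from a Python script, for example in a build step. The following is a minimal sketch, assuming `dmeta` is installed in the same interpreter: `build_clear_command` and `run_clear` are hypothetical names, and only the `--clear` and `--inplace` flags shown above are used. Whether DMeta reports failures through its exit code is an assumption, not something this document confirms.

```python
import subprocess
import sys


def build_clear_command(path, in_place=False):
    """Build the argument list for `python -m dmeta --clear <path>`."""
    command = [sys.executable, "-m", "dmeta", "--clear", path]
    if in_place:
        # Same flag as in the CLI examples: modify the original file.
        command.append("--inplace")
    return command


def run_clear(path, in_place=False):
    """Run the CLI; subprocess raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_clear_command(path, in_place), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.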

### DMeta as a pre-commit hook

To ensure that **no Microsoft Office files ever enter your repo with embedded metadata**, you can use DMeta’s built-in pre-commit hooks.

#### 1. Install the pre-commit framework
If you don’t already have it:
```bash
pip install pre-commit
```

#### 2. Add DMeta to your project’s `.pre-commit-config.yaml`
In your project root, create or update `.pre-commit-config.yaml`:
```yaml
repos:
  - repo: https://github.com/openscilab/dmeta.git
    rev: v0.4 # minimum v0.4 or commit SHA
    hooks:
      - id: clear-metadata
```
* `rev`: must be a tag that supports the pre-commit hooks (`v0.4` or later) or a commit SHA at which the targeted `.pre-commit-hooks.yaml` exists.

#### 3. Install the hook
```bash
pre-commit install # or pre_commit install (on Windows)
```

Now, every time you `git commit`, DMeta will automatically clear metadata from any Microsoft Office files in place.

#### ⚠️ Important: Clean Before You Commit

Do **not** stage or add Microsoft Office files **before** removing their metadata.

If you run `git add` on Office files that still contain embedded metadata, the pre-commit hook will attempt to clean them **in-place**, which modifies the files after they’ve been staged. As a result, **Git will block the commit** because the content has changed mid-process.

#### ✅ Suggested Correct Workflow

1. Let the hook run automatically on earlier commits that didn’t add Office files, or run it manually. To run it manually, use `pre-commit run clear-metadata --all-files`.

2. Then:
   ```bash
   git add <cleaned-files>
   git commit -m "Your message"
   ```
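
As an alternative to running the hook manually in step 1, the files can be cleaned from Python before staging. This is only a sketch under stated assumptions: `find_office_files` and `clean_before_staging` are hypothetical helpers, the search skips anything under `.git`, and the only DMeta call used is `clear(path, in_place=True)` as shown in the Usage section.

```python
from pathlib import Path

# Formats listed under "Supported files".
OFFICE_SUFFIXES = {".docx", ".pptx", ".xlsx"}


def find_office_files(root="."):
    """Return sorted paths of .docx/.pptx/.xlsx files under root, skipping .git."""
    matches = []
    for path in Path(root).rglob("*"):
        if ".git" in path.relative_to(root).parts:
            continue
        if path.is_file() and path.suffix.lower() in OFFICE_SUFFIXES:
            matches.append(path)
    return sorted(matches)


def clean_before_staging(root="."):
    """Clear metadata in place for every Office file found under root."""
    # Imported here so the finder above works without DMeta installed.
    from dmeta.functions import clear

    for path in find_office_files(root):
        clear(str(path), in_place=True)
```

After `clean_before_staging()` finishes, `git add` and `git commit` can proceed as in step 2, and the hook should find nothing left to change.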

## Supported files
| File format | Support |
| ---------------- | ---------------- | 
| Microsoft Word (.docx) | &#x2705; |
| Microsoft PowerPoint (.pptx) | &#x2705; |
| Microsoft Excel (.xlsx) | &#x2705; |


## Issues & bug reports

Just file an issue and describe it. We'll check it ASAP! Alternatively, send an email to [dmeta@openscilab.com](mailto:dmeta@openscilab.com "dmeta@openscilab.com").

- Please complete the issue template

You can also join our Discord server:

<a href="https://discord.gg/626twyuPZG">
  <img src="https://img.shields.io/discord/1064533716615049236.svg?style=for-the-badge" alt="Discord Channel">
</a>

## Acknowledgments

The [Python Software Foundation (PSF)](https://www.python.org/psf/) provided a grant that partially supported DMeta version 0.4.
[PSF](https://www.python.org/psf/) is the organization behind Python. Their mission is to promote, protect, and advance the Python programming language and to support and facilitate the growth of a diverse and international community of Python programmers.

<a href="https://www.python.org/psf/"><img src="https://github.com/openscilab/dmeta/raw/main/otherfiles/psf.png" height="65px" alt="Python Software Foundation"></a>


## Show your support


### Star this repo

Give a ⭐️ if this project helped you!

### Donate to our project
If you like our project, and we hope that you do, please consider supporting us. Our project is not, and never will be, run for profit. We need the money just so we can continue doing what we do ;-)

<a href="https://openscilab.com/#donation" target="_blank"><img src="https://github.com/openscilab/dmeta/raw/main/otherfiles/donation.png" height="90px" width="270px" alt="DMeta Donation"></a>

# Changelog
All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## [0.4] - 2025-06-16
### Added
- `Acknowledgments` in `README.md`
- `.pre-commit-config.yaml`
- `.pre-commit-hooks.yaml`
- DMeta pre-commit hook section in `README.md`
- recursive search in `clear_all` and `update_all`
- `--verbose` flag in CLI
- modern issue template structure
- `--info` flag in CLI
### Changed
- `get_microsoft_format` function in `util.py`
- `overwrite_metadata` function in `functions.py`
- `clear_all` function in `functions.py`
- `clear` function in `functions.py`
- `update_all` function in `functions.py` enhanced
- `update` function in `functions.py` 
### Removed
- Python 3.6 support
- old issue template structure
## [0.3] - 2025-01-13
### Removed
- `extract_namespaces` function in `util.py`
### Added
- `DMetaBaseError` added to `dmeta/__init__.py`
- `overwrite_metadata` function added to `functions.py`
### Changed
- `update` function in `functions.py` refactored
- `clear` function in `functions.py` refactored
- `README.md` updated
- GitHub actions are limited to the `dev` and `main` branches
- `Python 3.13` added to `test.yml`
## [0.2] - 2024-08-14
### Added
- `dmeta/errors.py`
- `pptx` and `xlsx` support
- `get_microsoft_format` function in `util.py`
- `SECURITY.md`
- `inplace` parameter in the `clear` function in `functions.py`
- `inplace` parameter in the `clear_all` function in `functions.py`
- `inplace` parameter in the `update` function in `functions.py`
- `inplace` parameter in the `update_all` function in `functions.py`
- `inplace` parameter in CLI
- `inplace` tests
### Changed
- `run_dmeta` in `functions.py`
- `read_json` in `util.py`
- `get_microsoft_format` in `util.py`
- error messages in `params.py`
- `clear` function in `functions.py`
- `extract` function in `util.py`
- `remove_format` function in `util.py`
- `clear_all` function in `functions.py`
- `update` function in `functions.py`
- `update_all` function in `functions.py`
- `extract_namespaces` function in `util.py`
- `README.md` updated
## [0.1] - 2024-06-19
### Added
- `CLI` handler
- `main` function in `__main__.py`
- `README.md`
- `clear` function in `functions.py`
- `clear_all` function in `functions.py`
- `update` function in `functions.py`
- `update_all` function in `functions.py`
- `run_dmeta` function in `functions.py`
- `dmeta_help` function in `functions.py`
- `extract_namespaces` function in `util.py`
- `remove_format` function in `util.py`
- `extract_docx` function in `util.py`
- `read_json` function in `util.py`

[Unreleased]: https://github.com/openscilab/dmeta/compare/v0.4...dev
[0.4]: https://github.com/openscilab/dmeta/compare/v0.3...v0.4
[0.3]: https://github.com/openscilab/dmeta/compare/v0.2...v0.3
[0.2]: https://github.com/openscilab/dmeta/compare/v0.1...v0.2
[0.1]: https://github.com/openscilab/dmeta/compare/9a4ad10...v0.1
