Metadata-Version: 2.4
Name: fpdf2-textindex
Version: 0.1.0
Summary: Text Index for fpdf2
Keywords: pdf,library,markdown,index,textindex
Author: Corvin Lasogga
Author-email: Corvin Lasogga <c.lasogga@gmail.com>
License-Expression: GPL-3.0-only
License-File: LICENSE
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Operating System :: OS Independent
Classifier: Topic :: Printing
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Markup
Classifier: Topic :: Multimedia :: Graphics
Classifier: Topic :: Multimedia :: Graphics :: Presentation
Classifier: Typing :: Typed
Requires-Dist: fpdf2>=2.8.7
Requires-Dist: typing-extensions>=4.15.0
Requires-Python: >=3.10
Project-URL: Code, https://github.com/CoLa5/fpdf2-textindex
Project-URL: Documentation, https://cola5.github.io/fpdf2-textindex/
Project-URL: Homepage, https://cola5.github.io/fpdf2-textindex/
Project-URL: Issue tracker, https://github.com/CoLa5/fpdf2-textindex/issues
Description-Content-Type: text/markdown

[![Pypi Latest Version](https://img.shields.io/pypi/v/fpdf2_textindex)](https://pypi.org/project/fpdf2_textindex#history)
[![Python Support](https://img.shields.io/pypi/pyversions/fpdf2_textindex)](https://pypi.org/project/fpdf2_textindex/)
[![Python Types](https://img.shields.io/pypi/types/fpdf2_textindex)](https://pypi.org/project/fpdf2_textindex/)
[![Documentation](https://img.shields.io/badge/docs-github.io-blue)](https://cola5.github.io/fpdf2-textindex)
[![License](https://img.shields.io/badge/license-GPLv3-blue.svg?style=flat)](https://www.gnu.org/licenses/gpl-3.0)

[![CI](https://github.com/CoLa5/fpdf2-textindex/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/CoLa5/fpdf2-textindex/actions/workflows/ci.yml)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/CoLa5/fpdf2-textindex/blob/main/.pre-commit-config.yaml)
[![Coverage](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/CoLa5/d548905d5994ebc1c3f15e8cfb9003e2/raw/covbadge.json)](https://cola5.github.io/fpdf2-textindex/coverage/)
[![Pypi Trusted Publisher: enabled](https://img.shields.io/badge/Pypi_Trusted_Publisher-enabled-green.svg)](https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v1.json)](https://github.com/charliermarsh/ruff)
[![Checks: flake-8, isort, mypy](https://img.shields.io/badge/Checks-flake--8,_isort,_mypy-green.svg)](https://github.com/CoLa5/fpdf2-textindex/blob/main/.pyproject.toml)

[![GitHub last commit](https://img.shields.io/github/last-commit/CoLa5/fpdf2-textindex)](https://github.com/CoLa5/fpdf2-textindex/commits/main)

# fpdf2 Text Index

<img src="https://cola5.github.io/fpdf2-textindex/assets/logo.svg" title="fpdf2_textindex logo" width="50%"/>

Adds a **text index** to [fpdf2](https://github.com/py-pdf/fpdf2), based on the
documentation and source code of
[Matt Gemmell's Text Index](https://mattgemmell.scot/textindex/):

```python3
from fpdf2_textindex import FPDF, TextIndexRenderer

pdf = FPDF()
pdf.add_page()
pdf.set_font('helvetica', size=12)
# Add the text index entry "example"
pdf.cell(text="example{^}", markdown=True)
# Add the text index to a page
pdf.add_page()
pdf.insert_index_placeholder(TextIndexRenderer().render_text_index)
# Save as pdf
pdf.output("example.pdf")
```

The text index will have a single entry:

- example, 1

## Adding Text Index Entries

Use the [text index syntax](https://mattgemmell.scot/textindex/) to define index
directives in a text:

> Most mechanical keyboard firmware{^} supports the use of [key
> combinations]{^}.

Print it in the PDF by enabling markdown in `fpdf2.FPDF.cell` or
`fpdf2.FPDF.multi_cell`:

```python3
pdf = FPDF()
pdf.add_page()
pdf.set_font('helvetica', size=12)
pdf.cell(
  text="Most mechanical keyboard firmware{^} supports the use of [key combinations]{^}.",
  markdown=True,
)
...
```

For complete documentation of the supported text index directives, see
[Matt Gemmell's excellent documentation](https://mattgemmell.scot/textindex).

The only difference from that documentation is that emphasis follows the
[markdown style of fpdf2](https://py-pdf.github.io/fpdf2/TextStyling.html#markdowntrue).

So the text:

> This entry will be **\*\*emphasised\*\***{^} in the index.  
> This expanded entry will be **\*\*[not emphasised]{^"\* (nope)"}\*\*** in the
> index but here in the text.

will be printed in the PDF as:

> This entry will be **emphasised** in the index.  
> This expanded entry will be **not emphasised** in the index but here in the
> text.

Similarly, the marks for italics `__`, underline `--` and strikethrough `~~` are
supported.

## Inserting the Text Index

Use the adapted `FPDF` class of this package, which offers the
`fpdf2_textindex.FPDF.insert_index_placeholder` method to define a placeholder
for the **text index**. At least one page break is triggered after inserting
the text index:

```python3
...
pdf.add_page()
pdf.insert_index_placeholder(render_index_function)
```

Parameters:

- `render_index_function`: Function called to render the text index, receiving
  two parameters: `pdf`, an adapted `FPDF` instance, and `entries`, a list of
  `fpdf2_textindex.TextIndexEntry` objects. A reference implementation is
  provided by `fpdf2_textindex.TextIndexRenderer.render_text_index`.
- `pages`: The number of pages that the text index will span, including the
  current one. A page break occurs for each page specified.
- `allow_extra_pages`: If `True`, allows unlimited additional pages to be added
  to the text index as needed. These extra text index pages are initially
  created at the end of the document and then reordered when the final PDF is
  produced.

> [!NOTE]
> Enabling `allow_extra_pages` may affect page numbering for headers or footers.
> Since extra text index pages are added after the document content, they might
> cause page numbers to appear out of sequence. To maintain consistent
> numbering, use **Page Labels** to assign a specific numbering style to the
> index pages. When using Page Labels, any extra text index pages will follow
> the numbering style of the first text index page.
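
The reordering that `allow_extra_pages` relies on can be pictured with a plain
list of pages. The following is a minimal sketch of the idea only, not the
package's implementation; `reorder_index_pages` and the page names are
hypothetical and exist only for illustration.

```python3
def reorder_index_pages(pages, index_start, reserved, extra):
    """Move `extra` overflow pages from the end of the document to the
    position directly after the `reserved` index pages.

    Hypothetical helper: `index_start` is the 0-based position of the first
    reserved index page in `pages`.
    """
    if extra == 0:
        return list(pages)
    body, overflow = pages[:-extra], pages[-extra:]
    insert_at = index_start + reserved
    return body[:insert_at] + overflow + body[insert_at:]


# One index page was reserved, two more were needed and appended at the end.
pages = ["text-1", "text-2", "index-1", "appendix", "index-2", "index-3"]
print(reorder_index_pages(pages, index_start=2, reserved=1, extra=2))
# ['text-1', 'text-2', 'index-1', 'index-2', 'index-3', 'appendix']
```

In this picture, `"appendix"` moves from the fourth to the sixth position,
which is why page numbers printed in headers or footers can end up out of
sequence.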

## Text Index Directive Syntax

```text
__example__{^foo>"\* text"#demo |bar;+baz>fiz [whiz] ~z !}
           1            2     3             4      5  6
```

The index directive in the example will:

1. Create a reference from the `"example text"` subentry within the `"foo"`
   top-level entry that leads to the directive's location in the text on the
   corresponding PDF page. If the entry or subentry does not exist, it will be
   created.
2. Define the alias `"#demo"` for the path to the subentry (`"foo"` >
   `"example text"`).
3. Add cross-references to the entry:
   - A **SEE**-cross reference to the `"bar"` top-level entry,
   - A **SEE ALSO**-cross reference to the `"fiz"` subentry within the `"baz"`
     top-level entry.
4. Apply the `"whiz"` suffix to the directive's reference locator.
5. Sort the entry as if its heading starts with `"z"`.
6. Apply an emphasis (bold) to the mark's reference locator.

The resulting index with page numbers would look like:

- bar, 3
- baz
  - fiz, 5
- foo
  - example text, **6** (_see_ bar). _See also_ baz: fiz

if index directives with page references (locators) for `"bar"` and `"baz"` >
`"fiz"` have also been added.  
In a real index, this directive would cause an error: an entry may combine a
reference locator to a PDF page with a **SEE ALSO**-cross reference, or a
**SEE**- with a **SEE ALSO**-cross reference, but not all three at once.
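
The effect of a sort key such as `~z` (step 5 in the list above) can be
illustrated with a small stand-alone sketch. It assumes that "as if its heading
starts with" means the key is prefixed to the label before comparing; the
`Entry` class, `sorted_entries` and the two sibling subentries are hypothetical
and not part of the package.

```python3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical stand-in for an index entry with an optional sort key."""

    label: str
    sort_key: Optional[str] = None  # value of a "~key" part, if any


def sorted_entries(entries):
    # Assumption: the sort key is prefixed to the label before comparing.
    return sorted(
        entries, key=lambda e: f"{e.sort_key or ''}{e.label}".casefold()
    )


subentries = [
    Entry("example text", sort_key="z"),
    Entry("other text"),
    Entry("further text"),
]
print([e.label for e in sorted_entries(subentries)])
# ['further text', 'other text', 'example text']
```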

## Example

The script
[`example/textindex_figures.py`](https://github.com/CoLa5/fpdf2-textindex/blob/main/example/textindex_figures.py#L130)
produces
[textindex_figures.pdf](https://cola5.github.io/fpdf2-textindex/assets/textindex_figures.pdf),
which contains all the examples from
[Matt Gemmell's website](https://mattgemmell.scot/textindex/).

---

## Internals - Idea

For the curious reader:

This package adds a markdown parser to [fpdf2](https://github.com/py-pdf/fpdf2)
that intercepts markdown-styled strings passed to `fpdf2.FPDF.cell` or
`fpdf2.FPDF.multi_cell` and translates
[Matt Gemmell's Text Index](https://mattgemmell.scot/textindex/) directives into
markdown links whose destination is a not-yet-set internal PDF link, while the
created index entries are stored internally:

`"example{^}"`  
 **=**  
`"[example](#idx0)"`  
 **+**  
`TextIndexEntry(label="example", references=[Reference(start_id=0)])`
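
The translation step can be sketched with a regular expression. This is a
deliberately simplified illustration, not the package's parser: it handles only
the bare forms `word{^}` and `[phrase]{^}`, and `Entry`, `DIRECTIVE` and
`translate` are hypothetical names.

```python3
import re
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical, reduced stand-in for an index entry."""

    label: str
    reference_ids: list[int] = field(default_factory=list)


# Matches "[some phrase]{^}" or "word{^}" (bare directives only).
DIRECTIVE = re.compile(r"(?:\[([^\]]+)\]|(\w+))\{\^\}")


def translate(text):
    """Replace directives by markdown links and collect the index entries."""
    entries = {}
    counter = 0

    def replace(match):
        nonlocal counter
        label = match.group(1) or match.group(2)
        entries.setdefault(label, Entry(label)).reference_ids.append(counter)
        link = f"[{label}](#idx{counter})"
        counter += 1
        return link

    return DIRECTIVE.sub(replace, text), list(entries.values())


markdown, entries = translate(
    "Most mechanical keyboard firmware{^} supports the use of "
    "[key combinations]{^}."
)
print(markdown)
# Most mechanical keyboard [firmware](#idx0) supports the use of [key combinations](#idx1).
```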

When the actual text index is created in the PDF, all unset internal PDF link
annotations that belong to the text index (identified by a unique ID scheme)
are collected, and their page and x/y position on the page are added to the
entry's references:

`{"idx0": LinkLocation(page=3, x=20.0, y=40.0, ...), ...}`  
 **->**  
`TextIndexEntry.references[0].start_location = LinkLocation(page=3, x=20.0, y=40.0, ...)`
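
As a rough sketch, this resolution step amounts to a dictionary lookup per
reference. The classes below are reduced stand-ins that only mirror the names
used above; their real definitions in the package have more fields, and
`resolve_locations` is a hypothetical helper.

```python3
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LinkLocation:
    page: int
    x: float
    y: float


@dataclass
class Reference:
    start_id: int
    start_location: Optional[LinkLocation] = None


@dataclass
class TextIndexEntry:
    label: str
    references: list[Reference] = field(default_factory=list)


def resolve_locations(entries, locations):
    """Attach the collected link location to each reference, matched by id."""
    for entry in entries:
        for reference in entry.references:
            key = f"idx{reference.start_id}"
            reference.start_location = locations.get(key)


entries = [TextIndexEntry("example", [Reference(start_id=0)])]
locations = {"idx0": LinkLocation(page=3, x=20.0, y=40.0)}
resolve_locations(entries, locations)
print(entries[0].references[0].start_location.page)  # 3
```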

Finally, a `render_index_function` similar to the
[official TOC-implementation of fpdf2](https://py-pdf.github.io/fpdf2/DocumentOutlineAndTableOfContents.html#table-of-contents)
is used to render the index. The package provides a reference implementation,
but users can implement their own version if necessary.

The reference `render_index_function` renders each index entry according to
[The Chicago Manual of Style - Indexes](https://www.chicagomanualofstyle.org/book/ed18/part3/ch15/toc.html):

`"example, 3"`

The unset link annotation in the text is then pointed to this entry in the
index and is thereby finally set.

In the reference implementation, inverted links are added as well: to connect
the index entry back to the text page, the printed page number links to the
text page.  
So clicking on `"example"` on the text page leads to the corresponding entry in
the text index. Clicking on the reference (locator) in the text index, page
`"3"`, returns the reader to the text page. Cross-references are connected in
the same way, but within the text index.
