Metadata-Version: 2.4
Name: inbox-report
Version: 1.0.0
Summary: Turn local mailbox exports into job, COOP, and Tamheer application reports.
Author: gqnxx
License-Expression: MIT
Project-URL: Homepage, https://github.com/gqnxx/the-GOAT
Project-URL: Repository, https://github.com/gqnxx/the-GOAT
Project-URL: Issues, https://github.com/gqnxx/the-GOAT/issues
Keywords: email,mbox,eml,coop,tamheer,job-applications,saudi-arabia
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: End Users/Desktop
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Communications :: Email
Classifier: Topic :: Office/Business
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: pdf
Requires-Dist: reportlab>=4.0; extra == "pdf"
Requires-Dist: arabic-reshaper>=3.0; extra == "pdf"
Requires-Dist: python-bidi>=0.6; extra == "pdf"
Provides-Extra: dev
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Dynamic: license-file

# Inbox Application Reporter

Make a messy job-application inbox behave.

It reads an `.mbox` or `.eml` export, finds likely job/COOP application emails, groups them by guessed organization, guesses each one's status, and writes CSV, HTML, and PDF outputs.

It is local-first on purpose: no email password, no mailbox login, no cloud upload, no mysterious "please grant full inbox access" jump scare.

## Origin Story

This started because someone had more than 200 applications scattered across email and wanted a PDF before the inbox became a second job. The first version came from the most honest product-management format known to humanity: a panicked chat message and a screenshot.

Shoutout to [@justAbdulaziz10](https://github.com/justAbdulaziz10), the original chaos coordinator. Without that inbox situation, this tool would still be a napkin idea pretending to be a roadmap.

## Features

- Reads Gmail/Google Takeout-style `.mbox`, Apple Mail `.mbox` packages, and Outlook-friendly `.eml` folders.
- Detects likely job, COOP/cooperative training, Tamheer, internship, trainee, and application emails.
- Supports English and Arabic matching terms.
- Handles Saudi phrases like `التدريب التعاوني`, `برنامج التدريب التعاوني`, `التدريب على رأس العمل`, and `تمهير`.
- Groups emails by guessed organization.
- Classifies application type as `coop`, `tamheer`, `internship`, `graduate_program`, `job`, `training`, `career_portal`, or `unknown_application`.
- Infers rough status: `submitted_or_received`, `under_review`, `action_required`, `interview`, `offer_or_accepted`, `start_or_onboarding`, `ineligible`, `closed_or_full`, `rejected`, or `possible_application`.
- Adds `confidence` and `review_bucket` fields; only high-confidence rows with a clear type and status are marked `auto_classified`.
- Extracts links, sender details, subjects, snippets, and dates.
- Writes editable CSV files plus organized HTML/PDF reports.
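
The type detection listed above can be pictured as a keyword lookup over the subject and body. This is a minimal sketch and not the tool's actual classifier: `TYPE_TERMS`, `normalize`, `contains_term`, and `guess_type` are hypothetical names, and the term lists are a small illustrative subset.

```python
import re

# Hypothetical, deliberately tiny term table; the real matching terms may differ.
TYPE_TERMS = {
    "tamheer": ["tamheer", "تمهير"],
    "coop": ["coop", "co-op", "cooperative training", "التدريب التعاوني"],
    "internship": ["internship", "intern"],
}


def normalize(text):
    """Lowercase and collapse whitespace so matching is less brittle."""
    return re.sub(r"\s+", " ", text).strip().lower()


def contains_term(haystack, term):
    """Match a term only when it is not glued to other word characters."""
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, haystack) is not None


def guess_type(subject, body=""):
    """Return the first application type whose terms appear in the text."""
    haystack = normalize(subject + " " + body)
    for app_type, terms in TYPE_TERMS.items():
        if any(contains_term(haystack, term) for term in terms):
            return app_type
    return "unknown_application"
```

The boundary check keeps `intern` from firing on `international`, but it also misses Arabic words with an attached prefix, so real matching would likely need more normalization than this.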

## Fast Path

```bash
git clone https://github.com/gqnxx/the-GOAT.git
cd the-GOAT
make demo
```

That creates a fake mailbox and writes demo outputs under `.demo/`.

If you are using a real inbox, read the [Export Guide](docs/export-guide.md) first.

Useful commands:

```bash
make help
make check
make test
make demo
make report INPUT=/path/to/Mail.mbox
make audit INPUT=/path/to/Mail.mbox
make path-smoke
make agent-check
```

## Install

Python 3.9+ is supported.

```bash
git clone https://github.com/gqnxx/the-GOAT.git
cd the-GOAT
python3 -m pip install -r requirements.txt
```

Local package install:

```bash
python3 -m pip install -e ".[pdf]"
inbox-report --version
```

After the PyPI release, install it with:

```bash
python3 -m pip install "inbox-report[pdf]"
inbox-report --version
```

The core CSV/HTML flow uses Python's standard library. PDF output uses `reportlab`; Arabic shaping in the PDF uses `arabic-reshaper` and `python-bidi`.
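
To see whether the optional PDF dependencies are importable before running a report, a check along these lines may help. It is a sketch: `missing_pdf_modules` is a hypothetical helper, not part of the package, and it assumes the usual import names `reportlab`, `arabic_reshaper`, and `bidi`.

```python
from importlib.util import find_spec

# Import names assumed for reportlab, arabic-reshaper, and python-bidi.
PDF_MODULES = ("reportlab", "arabic_reshaper", "bidi")


def missing_pdf_modules(modules=PDF_MODULES):
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_pdf_modules()
    if missing:
        print("PDF extras not installed:", ", ".join(missing))
    else:
        print("PDF extras look available.")
```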

## Usage

1. Export your Gmail mailbox with [Google Takeout](https://takeout.google.com/), selecting Mail only, or export an `.eml` folder from a desktop mail client and skip to step 5.
2. Wait for the export email. Gmail Takeout can take minutes, hours, or longer for large mailboxes.
3. Download and unzip the Takeout archive.
4. Find the `.mbox` file inside the Mail folder.
5. Run:

Gmail / MBOX:

```bash
inbox-report /path/to/Mail.mbox
```

Same thing through Make:

```bash
make report INPUT=/path/to/Mail.mbox
```
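
If you want to peek inside an export before running the reporter, Python's standard `mailbox` module can read a flat `.mbox` file. The sketch below counts messages per sender domain as a rough stand-in for grouping by organization; `decode` and `sender_domains` are hypothetical helpers, not the tool's own code, and directory-style Apple Mail packages are not handled.

```python
import mailbox
from collections import Counter
from email.header import decode_header, make_header
from email.utils import parseaddr


def decode(value):
    """Decode RFC 2047 encoded headers such as Arabic display names."""
    return str(make_header(decode_header(value or "")))


def sender_domains(mbox_path):
    """Count messages per sender domain in a single mbox file."""
    counts = Counter()
    box = mailbox.mbox(mbox_path, create=False)  # do not create a missing file
    try:
        for message in box:
            _, address = parseaddr(decode(message.get("From")))
            domain = address.rpartition("@")[2].lower() if "@" in address else ""
            counts[domain or "unknown"] += 1
    finally:
        box.close()
    return counts
```

Calling `sender_domains("/path/to/Mail.mbox").most_common(10)` would list the ten most frequent sender domains.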

Strict mode is the default. It is intentionally conservative and filters out social digests, store orders, newsletters, and weak keyword matches. If you expected results and got an empty report, use audit mode to inspect noisy candidates:

```bash
inbox-report /path/to/Mail.mbox --include-weak
make audit INPUT=/path/to/Mail.mbox
```

Audit mode is for review, not final proof. Treat `needs_review` rows as "look at this manually", not as confirmed applications.
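
One way to picture the difference between strict and audit mode is a filter over candidate rows. This is an illustration only, not the tool's real logic: the `weak` key and the `select_rows` function are assumptions made for the example.

```python
def select_rows(candidates, include_weak=False):
    """Keep strong matches; keep weak ones only when explicitly requested."""
    selected = []
    for row in candidates:
        if row["weak"] and not include_weak:
            continue  # strict mode: drop weak keyword matches
        # Weak rows are never treated as confirmed, whatever else they claim.
        bucket = "needs_review" if row["weak"] else row.get("review_bucket", "needs_review")
        selected.append({**row, "review_bucket": bucket})
    return selected
```

With `include_weak=True` the weak rows come back, but always tagged `needs_review`, which matches the intent of audit mode as a manual review queue.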

EML folder:

```bash
inbox-report /path/to/exported-emails/
```
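
For a quick sanity check of an `.eml` folder, the standard `email` package can parse each file's headers. This is a sketch with a hypothetical helper name, `read_eml_folder`; it does not reproduce how the reporter itself walks the folder.

```python
from email import policy
from email.parser import BytesParser
from pathlib import Path


def read_eml_folder(folder):
    """Yield (file name, sender, subject, date) for each .eml file under folder."""
    parser = BytesParser(policy=policy.default)
    for path in sorted(Path(folder).rglob("*.eml")):
        with path.open("rb") as handle:
            # Headers are enough for a listing; the body is skipped.
            message = parser.parse(handle, headersonly=True)
        yield (
            path.name,
            str(message.get("From", "")),
            str(message.get("Subject", "")),
            str(message.get("Date", "")),
        )
```

Using `policy.default` means encoded subjects, including Arabic ones, come back as decoded text.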

Direct script usage also works:

```bash
python3 inbox_application_reporter.py /path/to/Mail.mbox
```

## Gmail Export Quick Steps

1. Open [Google Takeout](https://takeout.google.com/).
2. Click **Deselect all**.
3. Enable **Mail** only.
4. Click **Next step**.
5. Choose `.zip` and **Send download link via email**.
6. Click **Create export**.
7. Wait for Google's email.
8. Download, unzip, and find the `.mbox` file inside the Mail folder.

Slow exports are normal for large mailboxes. If Google's email does not arrive immediately, wait; a delay does not mean the export failed.

## More Options

Custom output paths:

```bash
inbox-report /path/to/Mail.mbox \
  --out details.csv \
  --summary-out companies.csv \
  --html-out report.html \
  --pdf-out report.pdf
```

Skip PDF generation:

```bash
inbox-report /path/to/Mail.mbox --no-pdf
```

Version:

```bash
inbox-report --version
```

Need the full export steps or more background? Start here:

- [Export Guide](docs/export-guide.md)
- [How It Works](docs/how-it-works.md)
- [PyPI Publishing](docs/pypi-publishing.md)
- [Roadmap](docs/roadmap.md)
- [LinkedIn Post Draft](docs/linkedin-post.md)

## PyPI

The PyPI package name is `inbox-report`. Publishing is configured through GitHub Actions trusted publishing, so no PyPI token is needed in the repo.

On PyPI's **Trusted Publisher Management** page, add a pending GitHub publisher with:

- PyPI Project Name: `inbox-report`
- Owner: `gqnxx`
- Repository name: `the-GOAT`
- Workflow name: `publish.yml`
- Environment name: `pypi`

Then publish by creating a GitHub Release or manually running the `publish` workflow from GitHub Actions.

## Outputs

- `applications.csv`: every matched email with sender, date, subject, guessed organization, application type, status, confidence, review bucket, links, matched terms, and snippet.
- `applications_summary.csv`: one row per guessed organization with counts, first/last seen dates, status counts, type counts, review counts, domains, and latest subject.
- `applications_report.html`: browser-friendly report grouped by organization.
- `applications_report.pdf`: PDF report when optional PDF dependencies are installed.
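
Because the CSV files are plain text, they are easy to post-process. The sketch below tallies statuses per organization; the column names `organization` and `status` are assumptions, so check the header row of your own `applications.csv` and pass the real names if they differ.

```python
import csv
from collections import Counter, defaultdict


def status_counts(csv_file, org_column="organization", status_column="status"):
    """Return {organization: Counter of statuses} from an open CSV file."""
    counts = defaultdict(Counter)
    for row in csv.DictReader(csv_file):
        counts[row[org_column]][row[status_column]] += 1
    return counts


if __name__ == "__main__":
    with open("applications.csv", newline="", encoding="utf-8") as handle:
        for org, statuses in sorted(status_counts(handle).items()):
            print(org, dict(statuses))
```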

## Privacy Model

This tool reads a local export file. It does not send email, log in to Gmail, modify messages, upload data, or call an external API.

Mailbox exports can contain sensitive personal data. Treat `.mbox`, CSV, HTML, and PDF outputs as private unless you intentionally redact and share them.
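
If you do need to share an output, masking email addresses is a reasonable first pass. This is a minimal sketch, not a feature of the tool: `redact_emails` is a hypothetical helper, and the pattern is intentionally simple, so names, phone numbers, and links still need a manual look.

```python
import re

# Simple pattern for common address shapes; it is not a full RFC 5322 parser.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_emails(text, placeholder="[redacted-email]"):
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub(placeholder, text)
```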

## Accuracy Guardrails

The classifier is rule-based and tested with synthetic fixtures for Gmail-style confirmations, LinkedIn, Workday, Greenhouse, Lever, SmartRecruiters, Saudi COOP, Tamheer, interviews, offers, rejections, onboarding, store orders, newsletters, social digests, and marketing emails.

Strict mode favors precision. Audit mode exists for recall and manual review. Do not present `needs_review` rows as confirmed applications.
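
The bucketing rule described in Features can be written as a small function. This is a sketch under assumptions: the confidence value `"high"` and the function name `review_bucket` are hypothetical, while the type and status labels come from the lists above.

```python
def review_bucket(confidence, app_type, status):
    """Auto-classify only high-confidence rows with a clear type and status."""
    clear_type = app_type != "unknown_application"
    clear_status = status != "possible_application"
    if confidence == "high" and clear_type and clear_status:
        return "auto_classified"
    return "needs_review"
```

Anything that falls through to `needs_review` should be checked by hand before it is counted as an application.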

## Contributing

PRs are welcome. The best PRs make detection better without making the privacy story worse.

Good areas:

- better status detection
- more ATS domains
- safer organization guessing
- tests with synthetic fixtures
- packaging

Please do not include real mailbox exports, personal emails, screenshots of inboxes, or private identifiers in issues or PRs.
