Metadata-Version: 2.4
Name: finforge
Version: 2.0.0
Summary: Synthetic financial transaction data generation with persona-driven behavior simulation.
Author: Shivangi Shukla
Maintainer: FinForge maintainers
License-Expression: MIT
Project-URL: Homepage, https://github.com/shivangis22/finforge
Project-URL: Repository, https://github.com/shivangis22/finforge
Project-URL: Issues, https://github.com/shivangis22/finforge/issues
Project-URL: Documentation, https://github.com/shivangis22/finforge#readme
Project-URL: Changelog, https://github.com/shivangis22/finforge/blob/main/CHANGELOG.md
Keywords: synthetic-data,finance,transactions,simulation,testing
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Operating System :: OS Independent
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: Office/Business :: Financial
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=2.0
Requires-Dist: numpy>=1.24
Requires-Dist: faker>=24.0
Requires-Dist: pydantic>=2.6
Requires-Dist: python-dateutil>=2.8
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Dynamic: license-file

# FinForge v2.0.0

FinForge is a Python library for generating realistic synthetic financial transaction datasets with persistent behavioral identity, temporal balance consistency, and reproducible cashflow simulation.

FinForge v2.0.0 expands the engine beyond student and salaried users into richer financial lives: business owners, freelancers, households, retired users, mixed-population datasets, irregular income, business cashflow, and quarterly tax activity.

## Why FinForge v2.0.0 is different

FinForge is designed to simulate financial lives, not random rows.

- Persistent user identity: users carry stable behavioral traits such as `spending_style`, `merchant_loyalty`, `savings_tendency`, and `night_activity_score`.
- Temporal financial rhythm: balances evolve chronologically across salaries, subscriptions, tax, business income, bills, and discretionary spending.
- Realistic behavioral adaptation: low-balance users suppress discretionary activity, while high-liquidity users spend more freely but stay within plausible limits.
- Mixed real-world personas: v2 includes consumer, household, freelance, retirement, and business cashflow behavior in one framework.
- Reproducible synthetic data: the same seed and config generate the same dataset, which makes FinForge useful for testing, QA, analytics, and benchmarking.

## FinForge v2.0.0: Business & Irregular Income Simulation

New in v2:

- `business_owner` persona with business income, vendor payments, payroll, office rent, professional services, business travel, tax, and personal spending
- `freelancer` persona with irregular client/platform income and software/professional expenses
- `household` persona with family-oriented groceries, healthcare, education, insurance, and utility behavior
- `retired` persona with pension income, healthcare-heavy spending, and low discretionary intensity
- `mixed` persona mode for heterogeneous datasets
- irregular income engine with variable dates, amounts, and sources
- business cashflow engine with seasonal business income and quarterly tax payments
- business vs personal account simulation and flags
- expanded exported metadata for downstream testing and scenario analysis

## Features

- Persona-driven user generation
- Persistent behavioral identity traits
- Deterministic seed reproducibility
- Balance-aware spending suppression
- Session-based transaction bursts
- Stable subscription recurrence
- Explicit overdraft metadata
- Merchant/category consistency
- Mixed persona simulation
- Irregular income generation
- Business cashflow and seasonality
- Quarterly tax events
- CSV export and pandas DataFrame output

## Installation

```bash
pip install finforge
```

For local development:

```bash
pip install -e ".[dev]"
```

## Quickstart

```python
from finforge import DatasetGenerator

df = (
    DatasetGenerator(seed=42)
    .with_users(100)
    .with_persona("salaried")
    .for_months(6)
    .generate()
)

print(df.head())
```

Business owner example:

```python
from finforge import DatasetGenerator

df = (
    DatasetGenerator(seed=42)
    .with_users(5)
    .with_persona("business_owner")
    .for_months(6)
    .generate()
)
```

Mixed persona example:

```python
from finforge import DatasetGenerator

df = (
    DatasetGenerator(seed=42)
    .with_users(100)
    .with_persona("mixed")
    .for_months(6)
    .generate()
)
```

Mixed mode includes all supported v2 personas:

- `student`
- `salaried`
- `freelancer`
- `business_owner`
- `household`
- `retired`

When `user_count` is at least the number of supported personas, FinForge guarantees at least one user per persona. Remaining users are assigned with a deterministic weighted distribution driven by the configured seed.
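
The allocation rule above can be sketched in plain Python. This is a minimal illustration, not FinForge's internal code: the function name `assign_personas` and the weights in `ASSUMED_WEIGHTS` are hypothetical, and the real distribution may differ.

```python
import random

PERSONAS = ["student", "salaried", "freelancer", "business_owner", "household", "retired"]

# Hypothetical weights for illustration only; FinForge's real weights may differ.
ASSUMED_WEIGHTS = [0.20, 0.35, 0.12, 0.08, 0.15, 0.10]


def assign_personas(user_count: int, seed: int) -> list[str]:
    """Assign one persona per user, covering every persona when possible."""
    rng = random.Random(seed)  # local RNG so the result depends only on the seed
    assigned: list[str] = []
    if user_count >= len(PERSONAS):
        # Guarantee one user per persona before sampling the rest.
        assigned.extend(PERSONAS)
    remaining = user_count - len(assigned)
    assigned.extend(rng.choices(PERSONAS, weights=ASSUMED_WEIGHTS, k=remaining))
    rng.shuffle(assigned)  # avoid the guaranteed personas always coming first
    return assigned


first = assign_personas(20, seed=42)
second = assign_personas(20, seed=42)
print(first == second)              # True: same seed, same assignment
print(set(first) == set(PERSONAS))  # True: every persona is represented
```

Using a dedicated `random.Random(seed)` instance rather than the module-level functions keeps the assignment independent of any other random calls in the process.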

Student behavioral example with CSV export:

```python
from finforge import DatasetGenerator

dataset = (
    DatasetGenerator(seed=101)
    .with_users(3)
    .with_persona("student")
    .for_months(2)
    .generate()
)

dataset.to_csv("transactionsBehaviour.csv", index=False)
```

## Supported personas

- `student`
- `salaried`
- `freelancer`
- `business_owner`
- `household`
- `retired`
- `mixed`

## Architecture overview

Core modules:

- `finforge.core`: models, enums, configuration, constants
- `finforge.personas`: persona definitions and recurring behavior
- `finforge.generators`: user generation, scheduling, transaction generation
- `finforge.merchants`: consumer and business merchant catalogs
- `finforge.exporters`: CSV export
- `finforge.dataset`: fluent public API

Behavior modules:

- `identity.py`: long-lived user behavioral traits
- `merchant_affinity.py`: preferred merchants and weighted reuse
- `adaptive_spending.py`: balance-aware and month-phase-aware spending controls
- `budgeting.py`: overspend memory and discretionary budget state
- `subscriptions.py`: dedicated subscription assignment
- `overdraft.py`: explicit negative-balance policy decisions
- `lifecycle.py`: irregular income, household flows, business cashflow, and quarterly tax activity
- `sessions.py`: clustered transaction sessions

LLM-related runtime behavior is intentionally not implemented. Any future AI extension is expected to remain compatible with a local, Ollama-only architecture.

## Behavioral identity engine

Every user exports stable identity metadata:

- `persona`
- `spending_style`
- `savings_tendency`
- `merchant_loyalty`
- `impulse_buying_score`
- `lifestyle_score`
- `night_activity_score`

These fields are not cosmetic. They directly influence:

- transaction frequency
- merchant reuse
- late-night behavior
- session probability
- discretionary suppression
- category mix
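
As a rough illustration of how stable traits could drive behavior, the sketch below derives a user's identity from the seed and a user index, then uses `merchant_loyalty` to bias merchant reuse. The `Identity` class, the helper names, the 0 to 1 trait scale, and the merchant names are assumptions made for this example, not FinForge's actual implementation.

```python
import random
from dataclasses import dataclass

SPENDING_STYLES = ["budget_conscious", "lifestyle_spender", "minimalist", "impulsive_student"]


@dataclass(frozen=True)
class Identity:
    spending_style: str
    merchant_loyalty: float      # assumed to lie in [0, 1]
    night_activity_score: float  # assumed to lie in [0, 1]


def build_identity(seed: int, user_index: int) -> Identity:
    # Seeding with a string derived from (seed, user) keeps each user's traits
    # stable no matter how many other users are generated.
    rng = random.Random(f"{seed}:{user_index}")
    return Identity(
        spending_style=rng.choice(SPENDING_STYLES),
        merchant_loyalty=round(rng.random(), 3),
        night_activity_score=round(rng.random(), 3),
    )


def pick_merchant(identity: Identity, favourites: list[str], catalog: list[str],
                  rng: random.Random) -> str:
    # Higher loyalty means a higher chance of reusing a preferred merchant.
    if favourites and rng.random() < identity.merchant_loyalty:
        return rng.choice(favourites)
    return rng.choice(catalog)


user = build_identity(seed=42, user_index=7)
print(user == build_identity(seed=42, user_index=7))  # True: traits are stable

# Hypothetical merchant names, used only to exercise the helper.
merchant = pick_merchant(user, ["Corner Cafe"], ["Corner Cafe", "City Mart"], random.Random(0))
print(merchant)
```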

## Spending styles

FinForge uses four reusable spending styles across personas:

- `budget_conscious`
- `lifestyle_spender`
- `minimalist`
- `impulsive_student`

Expected effects:

- `minimalist`: fewest transactions, essentials-heavy, lower sessions
- `budget_conscious`: restrained discretionary behavior and strong balance sensitivity
- `lifestyle_spender`: higher food, shopping, entertainment, and weekend intensity
- `impulsive_student`: burstier activity, more late-night behavior, weaker spending discipline

## Subscription engine

Subscriptions are handled by a dedicated recurring system.

- Subscription merchants are separate from discretionary entertainment merchants
- Assigned subscriptions recur exactly once per month
- Subscription merchant and amount remain stable across months
- Subscription rows are marked with `is_subscription=True`
- Subscription rows are never attached to a session or treated as discretionary entertainment spending

Supported subscription merchants include:

- Netflix
- Spotify
- Amazon Prime
- YouTube Premium
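
The recurrence and stability rules can be checked on exported rows with a few lines of Python. The sketch below works on hand-written sample rows. Only `is_subscription` is a documented column: the `user_id`, `merchant`, `month`, and `amount` field names, the negative sign for debits, and the sample amounts are assumptions for illustration.

```python
from collections import defaultdict


def subscription_violations(rows, expected_months):
    """Return (user, merchant) pairs not charged exactly once per expected month
    or charged with more than one distinct amount."""
    charges = defaultdict(list)
    for row in rows:
        if row["is_subscription"]:
            charges[(row["user_id"], row["merchant"])].append((row["month"], row["amount"]))

    violations = []
    for key, items in charges.items():
        months = sorted(month for month, _ in items)
        amounts = {amount for _, amount in items}
        if months != sorted(expected_months) or len(amounts) != 1:
            violations.append(key)
    return violations


# Made-up sample data: Netflix is stable, Spotify changes its amount.
rows = [
    {"user_id": "u1", "merchant": "Netflix", "month": "2024-01", "amount": -10.0, "is_subscription": True},
    {"user_id": "u1", "merchant": "Netflix", "month": "2024-02", "amount": -10.0, "is_subscription": True},
    {"user_id": "u1", "merchant": "Spotify", "month": "2024-01", "amount": -5.0, "is_subscription": True},
    {"user_id": "u1", "merchant": "Spotify", "month": "2024-02", "amount": -6.0, "is_subscription": True},
    {"user_id": "u1", "merchant": "Cinema", "month": "2024-01", "amount": -12.0, "is_subscription": False},
]

print(subscription_violations(rows, ["2024-01", "2024-02"]))  # [('u1', 'Spotify')]
```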

## Balance-aware spending and overdrafts

FinForge does not generate independent random balances.

Each transaction updates the running balance chronologically:

```text
balance_before + amount = balance_after
```
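
A minimal sketch of this invariant, assuming signed amounts (credits positive, debits negative) as the formula implies. The helper names are hypothetical, and `Decimal` is used here only to keep the arithmetic exact; FinForge's own numeric type may differ.

```python
from decimal import Decimal


def apply_transactions(opening_balance, amounts):
    """Walk signed amounts in chronological order and record balances."""
    balance = Decimal(opening_balance)
    ledger = []
    for amount in amounts:
        amount = Decimal(amount)
        before = balance
        balance = before + amount
        ledger.append({"balance_before": before, "amount": amount, "balance_after": balance})
    return ledger


def is_consistent(ledger):
    """Check the per-row invariant and the chaining between consecutive rows."""
    for i, row in enumerate(ledger):
        if row["balance_before"] + row["amount"] != row["balance_after"]:
            return False
        if i > 0 and ledger[i - 1]["balance_after"] != row["balance_before"]:
            return False
    return True


ledger = apply_transactions("1000.00", ["-250.00", "3000.00", "-49.99"])
print(ledger[-1]["balance_after"])  # 3700.01
print(is_consistent(ledger))        # True
```

The chaining check matters as much as the per-row check: a dataset can satisfy the formula on every row and still be inconsistent if one row's `balance_before` does not match the previous row's `balance_after`.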

Behavior adapts to financial condition:

- low balance reduces discretionary probability and ticket sizes
- month-end stress suppresses entertainment and shopping
- overspending creates future pullback pressure
- overdrafts are either prevented or explicitly marked
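
One way to picture the first two rules is a probability multiplier that shrinks under low balance and near month end. The function below is a sketch under stated assumptions: the name `discretionary_probability`, the 20% low-balance threshold, the 5-day month-end window, and the scaling factors are all hypothetical and are not taken from FinForge.

```python
def discretionary_probability(base_probability, balance, monthly_income,
                              day_of_month, days_in_month):
    """Scale a base spend probability down under low balance or month-end stress."""
    # Assumed threshold: a balance below 20% of monthly income counts as "low".
    low_balance_floor = 0.20 * monthly_income
    balance_factor = 1.0
    if low_balance_floor > 0 and balance < low_balance_floor:
        # Linear ramp from 0.2 (nothing left) to 1.0 (at the threshold).
        balance_factor = 0.2 + 0.8 * max(balance, 0.0) / low_balance_floor
    # Assumed month-end window: the last 5 days dampen discretionary activity.
    month_end_factor = 0.7 if day_of_month > days_in_month - 5 else 1.0
    return base_probability * balance_factor * month_end_factor


healthy = discretionary_probability(0.5, balance=5000.0, monthly_income=4000.0,
                                    day_of_month=10, days_in_month=30)
stressed = discretionary_probability(0.5, balance=200.0, monthly_income=4000.0,
                                     day_of_month=28, days_in_month=30)
print(stressed < healthy)  # True: low balance near month end suppresses spending
```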

Important metadata:

- `balance_state`
- `is_overdraft`
- `overdraft_amount`

## Business and irregular income simulation

FinForge v2 introduces non-salaried cashflow:

- freelancer income from clients and platforms such as Upwork, Fiverr, Stripe, and Razorpay
- business owner cashflow from client payments, settlements, vendor payments, inventory, payroll, and tax
- household secondary income or family transfers
- pension plus irregular interest/family support for retired users

Business owners also export:

- `account_type`
- `business_context`
- `is_business_transaction`
- `is_business_expense`
- `is_tax_related`
- `is_vendor_payment`
- `is_payroll`
- `seasonal_factor`
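
These flags make it straightforward to split a business owner's outflows by purpose. The sketch below runs on hand-written rows rather than real FinForge output and assumes that the flags are booleans, that `amount` is signed (negative for outflows), and that tax, payroll, and vendor rows also carry `is_business_transaction=True`. The helper names and sample amounts are made up.

```python
def make_row(amount, **flags):
    """Build a sample row with every flag defaulting to False."""
    row = {
        "amount": amount,
        "is_business_transaction": False,
        "is_tax_related": False,
        "is_payroll": False,
        "is_vendor_payment": False,
    }
    row.update(flags)
    return row


def summarize_outflows(rows):
    """Total outflows per bucket, using the exported boolean flags."""
    totals = {"tax": 0.0, "payroll": 0.0, "vendor": 0.0, "other_business": 0.0, "personal": 0.0}
    for row in rows:
        if row["amount"] >= 0:
            continue  # skip inflows such as client payments
        spent = -row["amount"]
        if not row["is_business_transaction"]:
            bucket = "personal"
        elif row["is_tax_related"]:
            bucket = "tax"
        elif row["is_payroll"]:
            bucket = "payroll"
        elif row["is_vendor_payment"]:
            bucket = "vendor"
        else:
            bucket = "other_business"
        totals[bucket] += spent
    return totals


sample = [
    make_row(9000.0, is_business_transaction=True),                          # client payment
    make_row(-1500.0, is_business_transaction=True, is_tax_related=True),    # quarterly tax
    make_row(-3000.0, is_business_transaction=True, is_payroll=True),        # payroll
    make_row(-800.0, is_business_transaction=True, is_vendor_payment=True),  # vendor payment
    make_row(-1200.0, is_business_transaction=True),                         # office rent
    make_row(-150.0),                                                        # personal groceries
]

print(summarize_outflows(sample))
```

With a pandas DataFrame, the same split could be expressed as boolean masks on the flag columns.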

## Exported metadata columns

v1 metadata is preserved:

- `persona`
- `spending_style`
- `savings_tendency`
- `merchant_loyalty`
- `impulse_buying_score`
- `lifestyle_score`
- `night_activity_score`
- `is_recurring`
- `is_subscription`
- `is_discretionary`
- `recurrence_type`
- `session_id`
- `day_type`
- `balance_state`
- `is_overdraft`
- `overdraft_amount`

v2 metadata adds:

- `income_source`
- `expense_nature`
- `cashflow_type`
- `business_context`
- `account_type`
- `is_business_transaction`
- `is_business_expense`
- `is_tax_related`
- `is_vendor_payment`
- `is_payroll`
- `seasonal_factor`

## Example scripts

Examples are available in [examples](https://github.com/shivangis22/finforge/tree/main/examples):

- [behavioral_generation.py](https://github.com/shivangis22/finforge/blob/main/examples/behavioral_generation.py)
- [business_owner_generation.py](https://github.com/shivangis22/finforge/blob/main/examples/business_owner_generation.py)
- [freelancer_generation.py](https://github.com/shivangis22/finforge/blob/main/examples/freelancer_generation.py)
- [household_generation.py](https://github.com/shivangis22/finforge/blob/main/examples/household_generation.py)
- [retired_generation.py](https://github.com/shivangis22/finforge/blob/main/examples/retired_generation.py)
- [persona_comparison_v2.py](https://github.com/shivangis22/finforge/blob/main/examples/persona_comparison_v2.py)

## Testing guarantees

The test suite covers:

- balance integrity
- chronological ordering
- merchant/category consistency
- seed reproducibility
- subscription recurrence and stability
- low-balance suppression
- session calibration
- business cashflow generation
- freelancer irregular income
- quarterly tax behavior
- mixed persona generation
- backward compatibility for v1 personas

Run the suite with:

```bash
pytest
```

## Roadmap

- richer scenario presets
- more regional merchant catalogs
- card vs bank-transfer vs wallet distinctions
- local Ollama-based narrative/explanation layers without changing the core simulation path

## Contributing

Contributions are welcome, especially around:

- new persona modules
- merchant catalog expansion
- calibration improvements
- documentation
- test coverage

Typical development flow:

1. Create a feature branch.
2. Add or update tests.
3. Run `pytest`.
4. Open a pull request with a clear behavioral rationale.

## License

MIT
