Metadata-Version: 2.4
Name: statskita
Version: 0.2.0
Summary: Python toolkit for Indonesian official microdata (SAKERNAS)
Project-URL: Homepage, https://github.com/okkymabruri/statskita
Project-URL: Documentation, https://statskita.readthedocs.io
Project-URL: Repository, https://github.com/okkymabruri/statskita
Project-URL: Issues, https://github.com/okkymabruri/statskita/issues
Author-email: Okky Mabruri <okkymbrur@gmail.com>
License: MIT
License-File: LICENSE
Keywords: employment,indonesia,labor,microdata,sakernas,statistics,survey
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Sociology
Requires-Python: >=3.10
Requires-Dist: dbfrs>=0.1.5
Requires-Dist: pandas>=2.3.2
Requires-Dist: polars>=0.20.0
Requires-Dist: pyarrow>=10.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pyreadstat>=1.2.0
Requires-Dist: python-dotenv>=1.1.1
Requires-Dist: pyyaml>=6.0.0
Requires-Dist: requests>=2.28.0
Requires-Dist: samplics>=0.4.0
Requires-Dist: typing-extensions>=4.5.0
Requires-Dist: xlsxwriter>=3.2.9
Provides-Extra: cli
Requires-Dist: rich>=13.0.0; extra == 'cli'
Requires-Dist: typer>=0.9.0; extra == 'cli'
Provides-Extra: dev
Requires-Dist: mypy>=1.5.0; extra == 'dev'
Requires-Dist: pre-commit>=3.0.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-material>=9.0.0; extra == 'docs'
Requires-Dist: mkdocs>=1.5.0; extra == 'docs'
Requires-Dist: mkdocstrings[python]>=0.22.0; extra == 'docs'
Provides-Extra: notebook
Requires-Dist: ipykernel>=6.0.0; extra == 'notebook'
Requires-Dist: matplotlib>=3.6.0; extra == 'notebook'
Requires-Dist: seaborn>=0.12.0; extra == 'notebook'
Provides-Extra: viz
Requires-Dist: matplotlib>=3.6.0; extra == 'viz'
Requires-Dist: seaborn>=0.12.0; extra == 'viz'
Description-Content-Type: text/markdown

# StatsKita: Python toolkit for Indonesian official microdata (SAKERNAS)

> **v0.2.0-beta**: Multi-wave analysis. Validated and production-ready for 4 waves: 2023-02, 2023-08, 2024-02, and 2025-02.

## TL;DR

Python toolkit for SAKERNAS labor force survey data with multi-wave analysis.

## Quick Start

```python
import statskita as sk

# load SAKERNAS data (supports .dbf, .dta, .sav, .parquet)
df = sk.load_sakernas("sakernas_2025-02.parquet")

# wrangle and harmonize
clean_df = sk.wrangle(df, harmonize=True, source_wave="2025-02")

# declare survey design
design = sk.declare_survey(clean_df, weight="survey_weight", strata=None, psu="psu")

# calculate indicators
results = sk.calculate_indicators(
    design,
    indicators="all",
    as_table=True,
    include_ci=False
)
```
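
For intuition about what a weighted indicator is, the sketch below computes an unemployment rate and a labor force participation rate from a few made-up records in plain Python. It does not call StatsKita. The field names (`weight`, `status`), the helper `weighted_total`, and the data are hypothetical, and the library's actual definitions and variance handling may differ.

```python
# Hypothetical records: one row per working-age respondent, with a survey
# weight and a labor-force status. All values are made up for illustration.
records = [
    {"weight": 120.0, "status": "employed"},
    {"weight": 80.0, "status": "employed"},
    {"weight": 50.0, "status": "unemployed"},
    {"weight": 150.0, "status": "not_in_labor_force"},
]


def weighted_total(rows, statuses):
    """Sum the weights of rows whose status is in `statuses`."""
    return sum(r["weight"] for r in rows if r["status"] in statuses)


labor_force = weighted_total(records, {"employed", "unemployed"})
unemployed = weighted_total(records, {"unemployed"})
working_age = sum(r["weight"] for r in records)

# Rates are ratios of weighted totals, not of raw row counts.
unemployment_rate = 100 * unemployed / labor_force
lfpr = 100 * labor_force / working_age

print(f"unemployment rate: {unemployment_rate:.1f}%")
print(f"LFPR: {lfpr:.1f}%")
```

Each weight stands for the number of people a respondent represents, so rates are built from weighted totals rather than row counts.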

## Multi-Wave Analysis

```python
# load multiple waves
waves = ["2023-02", "2023-08", "2024-02", "2025-02"]
harmonized = {}

for wave in waves:
    df = sk.load_sakernas(f"sakernas_{wave}.parquet", wave=wave)
    harmonized[wave] = sk.wrangle(df, source_wave=wave, harmonize=True)

# compare across waves
results = sk.calculate_indicators_multi(
    harmonized,
    indicators="all",
    as_wide=True
)

print(results)
```

**Output** (wide-format comparison):
```
┌─────────────────────────────────┬──────┬─────────┬─────────┬─────────┬─────────┐
│ indicator                       ┆ unit ┆ 2023-02 ┆ 2023-08 ┆ 2024-02 ┆ 2025-02 │
├─────────────────────────────────┼──────┼─────────┼─────────┼─────────┼─────────┤
│ labor_force_participation_rate  ┆ %    ┆ ...     ┆ ...     ┆ ...     ┆ ...     │
│ employment_rate                 ┆ %    ┆ ...     ┆ ...     ┆ ...     ┆ ...     │
│ unemployment_rate               ┆ %    ┆ 5.45    ┆ 5.32    ┆ 4.82    ┆ 4.76    │
│ underemployment_rate            ┆ %    ┆ ...     ┆ ...     ┆ ...     ┆ ...     │
│ female_labor_force_participat…  ┆ %    ┆ ...     ┆ ...     ┆ ...     ┆ ...     │
│ average_wage                    ┆ M Rp ┆ ...     ┆ ...     ┆ ...     ┆ ...     │
└─────────────────────────────────┴──────┴─────────┴─────────┴─────────┴─────────┘
```
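
Wide format here means one row per indicator and one column per wave. A minimal sketch of that reshaping in plain Python, independent of how StatsKita does it internally: the `to_wide` helper and the long-format tuples are hypothetical, and the unemployment values are copied from the table above.

```python
def to_wide(long_rows, waves):
    """Pivot (indicator, wave, value) rows into one dict per indicator."""
    wide = {}
    for indicator, wave, value in long_rows:
        # Start each indicator with an empty cell for every wave.
        row = wide.setdefault(
            indicator, {"indicator": indicator, **{w: None for w in waves}}
        )
        row[wave] = value
    return list(wide.values())


waves = ["2023-02", "2023-08", "2024-02", "2025-02"]
long_rows = [
    ("unemployment_rate", "2023-02", 5.45),
    ("unemployment_rate", "2023-08", 5.32),
    ("unemployment_rate", "2024-02", 4.82),
    ("unemployment_rate", "2025-02", 4.76),
]

table = to_wide(long_rows, waves)
print(table[0])
```

A wave with no value for an indicator stays `None`, which keeps the columns aligned across waves.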

## Installation

```bash
pip install statskita
pip install "statskita[cli,viz]"  # optional extras: cli, viz, notebook
```

## Features

- **Multi-wave support**: Compare indicators across 4 validated waves (2023-02, 2023-08, 2024-02, 2025-02)
- **Multiple formats**: Load .dbf, .dta, .sav, .parquet files
- **Config-driven**: Automatic harmonization across waves
- **Survey-aware**: Proper handling of weights, PSU, strata
- **Fast processing**: Polars backend for large datasets
- **Complete indicators**: LFPR, unemployment, underemployment, wages, and more
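
To illustrate the idea behind config-driven harmonization, here is a minimal sketch in plain Python that renames wave-specific columns to common names. It is not StatsKita's implementation: the `WAVE_CONFIG` mapping, the raw column names, and the `harmonize_columns` helper are all hypothetical. Only the target names `survey_weight` and `psu` come from the Quick Start.

```python
# Hypothetical per-wave config: raw column name -> harmonized name.
WAVE_CONFIG = {
    "2023-02": {"WEIGHT": "survey_weight", "PSU_ID": "psu"},
    "2025-02": {"FINAL_WEIGHT": "survey_weight", "PSU": "psu"},
}


def harmonize_columns(row, wave, config=WAVE_CONFIG):
    """Rename a record's keys using its wave's mapping; unmapped keys are kept."""
    mapping = config[wave]
    return {mapping.get(key, key): value for key, value in row.items()}


# Made-up raw records with different column names per wave.
raw_2023 = {"WEIGHT": 101.5, "PSU_ID": 7, "AGE": 34}
raw_2025 = {"FINAL_WEIGHT": 98.2, "PSU": 3, "AGE": 41}

h23 = harmonize_columns(raw_2023, "2023-02")
h25 = harmonize_columns(raw_2025, "2025-02")

# After harmonization both waves expose the same column names.
print(sorted(h23) == sorted(h25))
```

Keeping the mapping in data rather than code means a new wave needs only a new config entry.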

See the `examples/` directory for detailed usage.