Metadata-Version: 2.4
Name: lifts
Version: 0.1.0
Summary: Latent imputation for time series with optional encoder calibration
Author: LIFTS maintainers
License: MIT
Project-URL: Homepage, https://github.com/sanyouwu/EHR
Project-URL: Repository, https://github.com/sanyouwu/EHR
Project-URL: Issues, https://github.com/sanyouwu/EHR/issues
Keywords: time-series,imputation,probabilistic,pytorch,state-space-model
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.24
Requires-Dist: matplotlib>=3.5
Requires-Dist: torch
Provides-Extra: dev
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"

# LIFTS

LIFTS packages the latent factor imputation workflow developed in this repository. The original FBI implementation remains available internally, and the package adds a cleaner estimator-style API on top of it for simulation experiments and downstream packaging.

The main entry point is `LIFTSImputer`, which follows the usual construct, fit, then impute pattern:

```python
model = LIFTSImputer(...)
model.fit(X_train, mask_train)
samples = model.impute_samples(X_test, mask_test)
```

## Install

From this folder:

```bash
pip install -e .
```

For a fresh environment, install the listed runtime dependencies first if needed:

```bash
pip install -r requirements.txt
```

Release instructions for PyPI are in [RELEASE.md](RELEASE.md).

## Quickstart

```python
from lifts import LIFTSImputer, make_case4_dataset

data = make_case4_dataset(
    num_sample_train=200,
    num_sample_test=50,
    p=20,
    d=5,
    T=48,
    obs_rate=0.5,
    version="v1",
    seed=42,
)

model = LIFTSImputer(
    d=5,
    total_loops=20,
    batch_size=64,
    n_samples=20,
    lambda_selfmask=5.0,
    lambda_enc=1.0,
    enc_warmup=5,
    seed=42,
)

model.fit(data["X_train"], data["mask_train"], verbose=True)
samples = model.impute_samples(data["X_test"], data["mask_test"], n_samples=20)
print(samples.shape)  # (N, n_samples, p, T)
```

`p` and `T` can be passed to the constructor; if omitted, they are inferred from `X_train` during `fit`.

## Data Format

LIFTS expects numpy arrays with shape `(N, p, T)`:

- `X`: observed time series values, with missing entries either zero-filled or set to `nan`
- `mask`: binary observation mask with `1` for observed and `0` for missing
- returned imputation samples: `(N, n_samples, p, T)`
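
As a minimal sketch of preparing inputs in this layout, the snippet below derives a mask from a `nan`-coded array using only NumPy. The array contents, the variable names, and the `float32` dtype are illustrative assumptions, not requirements stated by LIFTS.

```python
import numpy as np

# Hypothetical raw data: N=2 series, p=3 variables, T=4 time steps,
# with missing entries encoded as nan.
X_raw = np.arange(24, dtype=float).reshape(2, 3, 4)
X_raw[0, 1, 2] = np.nan
X_raw[1, 0, 0] = np.nan

# 1 where a value was observed, 0 where it is missing.
# (float32 is an assumption; the README only says the mask is binary.)
mask = (~np.isnan(X_raw)).astype(np.float32)

# Zero-filled copy of the observations, the other encoding listed above.
X = np.nan_to_num(X_raw, nan=0.0).astype(np.float32)

assert X.shape == mask.shape == (2, 3, 4)
```

Either `X_raw` (with `nan`) or `X` (zero-filled) matches the description above; the mask is what distinguishes a true zero from a missing value.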

The dataset helpers return both observed and complete arrays:

- `X_train`, `mask_train`, `X_train_full`
- `X_test`, `mask_test`, `X_test_full`
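
Because the complete arrays are returned alongside the observed ones, imputation quality can be scored on the held-out entries. The sketch below shows one way to do that with NumPy alone. The helper `rmse_on_missing` is hypothetical (it is not part of the package), and random arrays stand in for `data["X_test_full"]`, `data["mask_test"]`, and the output of `impute_samples`.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_samples, p, T = 4, 10, 3, 6

# Stand-ins for the real arrays; random values are used only to show shapes.
X_full = rng.normal(size=(N, p, T))                      # like X_test_full
mask = (rng.random(size=(N, p, T)) < 0.5).astype(float)  # like mask_test
samples = X_full[:, None, :, :] + 0.1 * rng.normal(size=(N, n_samples, p, T))


def rmse_on_missing(samples, X_full, mask):
    """RMSE of the sample-mean imputation over entries where mask == 0."""
    point_estimate = samples.mean(axis=1)  # average over draws -> (N, p, T)
    missing = mask == 0
    if not missing.any():
        raise ValueError("mask has no missing entries to score")
    err = point_estimate[missing] - X_full[missing]
    return float(np.sqrt(np.mean(err ** 2)))


rmse = rmse_on_missing(samples, X_full, mask)
print(f"RMSE on missing entries: {rmse:.4f}")
```

Averaging over the `n_samples` axis gives a point estimate; per-entry quantiles over the same axis would give interval estimates instead.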

## Key Parameters

- `lambda_selfmask`: weight for the self-mask imputation loss. Default is `5.0`.
- `lambda_enc`: weight for synthetic encoder calibration. Default is `0.0`, which preserves the original training behavior.
- `enc_warmup`: number of training loops before encoder calibration starts. Default is `50`.
- `enc_syn_batch`: number of synthetic trajectories used by encoder calibration. `None` resolves to `min(4000, N)` during training.
- `enc_syn_mini_batch`: chunk size for encoder calibration synthesis. `None` resolves to `min(batch_size, enc_syn_batch)`.
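
The `None` defaults for the two synthesis sizes can be read as the following resolution rule. This is a sketch of the rule as described in the list above, not the package's internal code; the function name `resolve_enc_syn_sizes` is hypothetical.

```python
def resolve_enc_syn_sizes(N, batch_size, enc_syn_batch=None, enc_syn_mini_batch=None):
    """Apply the documented None-defaults for the encoder calibration sizes."""
    if enc_syn_batch is None:
        # Cap the number of synthetic trajectories at 4000 or the sample count.
        enc_syn_batch = min(4000, N)
    if enc_syn_mini_batch is None:
        # Chunk size never exceeds the number of synthetic trajectories.
        enc_syn_mini_batch = min(batch_size, enc_syn_batch)
    return enc_syn_batch, enc_syn_mini_batch


print(resolve_enc_syn_sizes(N=200, batch_size=64))    # (200, 64)
print(resolve_enc_syn_sizes(N=10000, batch_size=64))  # (4000, 64)
print(resolve_enc_syn_sizes(N=32, batch_size=64))     # (32, 32)
```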

The training objective is:

```text
L_total = L_forecast + lambda_selfmask * L_selfmask + lambda_enc * L_encoder
```

`L_encoder` is only added when `lambda_enc > 0` and the warmup has completed.
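
The gating of the encoder term can be illustrated with plain numbers. The function `total_loss` below is a hypothetical helper, not the package's training code, and it assumes that loops are counted from 0 and that the warmup counts as completed once `loop >= enc_warmup`; the exact boundary used by LIFTS is not stated here.

```python
def total_loss(l_forecast, l_selfmask, l_encoder, *, loop,
               lambda_selfmask=5.0, lambda_enc=0.0, enc_warmup=50):
    """Combine the loss terms following the objective above."""
    total = l_forecast + lambda_selfmask * l_selfmask
    # Assumption: 0-based loop index; calibration is active from loop == enc_warmup.
    if lambda_enc > 0 and loop >= enc_warmup:
        total = total + lambda_enc * l_encoder
    return total


# During warmup the encoder term is skipped.
print(total_loss(1.0, 0.2, 0.5, loop=3, lambda_enc=1.0, enc_warmup=5))   # 2.0
# After warmup it is added with weight lambda_enc.
print(total_loss(1.0, 0.2, 0.5, loop=10, lambda_enc=1.0, enc_warmup=5))  # 2.5
# With the default lambda_enc=0.0 it is never added.
print(total_loss(1.0, 0.2, 0.5, loop=100))                               # 2.0
```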

## Included Simulation Cases

- `make_case1_dataset(...)`: linear factor state-space simulation
- `make_case3_dataset(...)`: nonlinear simulation variant
- `make_case4_dataset(..., version="v1")`: benchmark-compatible Case 4 used by the packaged examples
- `make_case4_dataset(..., version="v2")`: current local Case 4 generator

## Examples

Scripts:

- `examples/scripts/case1_minimal.py`
- `examples/scripts/case3_minimal.py`
- `examples/scripts/case4_minimal.py`

Notebooks:

- `examples/notebooks/case1_lifts.ipynb`
- `examples/notebooks/case4_lifts.ipynb`

The notebooks use the `plot_probabilistic_imputation` helper, which is exported at the package root:

```python
from lifts import plot_probabilistic_imputation
```

## Backward Compatibility

The lower-level `FBIImputer` and `get_fbi_config` APIs remain importable:

```python
from lifts import FBIImputer, get_fbi_config
```

For new code, prefer `LIFTSImputer` because it exposes the common training controls directly in the constructor.
