Metadata-Version: 2.4
Name: trainium
Version: 0.1.0
Summary: A Python library that automates your machine learning workflow from data preparation to model deployment
Author-email: anmatngu <nxan2911@gmail.com>
License-Expression: MIT
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: joblib>=1.5.0
Requires-Dist: matplotlib>=3.10.3
Requires-Dist: numpy>=2.2.5
Requires-Dist: pandas>=2.2.3
Requires-Dist: psutil>=7.0.0
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: scikit-learn>=1.6.1
Requires-Dist: sqlalchemy>=2.0.41
Requires-Dist: tqdm>=4.67.1
Requires-Dist: types-pyyaml>=6.0.12.20250516
Dynamic: license-file

# Trainium – Autopilot Your ML 🚀 (In Development)

[![CI](https://github.com/lunovian/trainium/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/lunovian/trainium/actions/workflows/ci.yml)

**Trainium** is a Python library that aims to automate your machine learning workflow, from data preparation to model deployment, and to deliver strong results with minimal configuration. The project is at an early stage of development, and the features described below are planned rather than stable.

## 🔍 Current Status

Trainium is currently in the early development phase. We are working on:

- Building the core architecture and interfaces
- Implementing base components and abstractions
- Setting up a testing framework

## ✨ Planned Features

### 1. **Intelligent Data Preprocessing**

- Automatic handling of missing values
- Smart feature scaling and normalization
- Categorical encoding optimization
- Feature selection and dimensionality reduction
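
As a rough illustration of the first two items, here is a minimal standard-library sketch of median imputation followed by standardization. The function names `impute_median` and `standardize` are hypothetical and are not part of Trainium's API; the planned implementation may differ.

```python
import math
import statistics


def impute_median(column):
    """Replace None/NaN entries with the median of the observed values."""
    observed = [v for v in column if v is not None and not math.isnan(v)]
    fill = statistics.median(observed)
    return [fill if v is None or math.isnan(v) else v for v in column]


def standardize(column):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        # A constant column carries no scale information.
        return [0.0 for _ in column]
    return [(v - mean) / std for v in column]


# Hypothetical feature column with two missing entries.
ages = [22.0, None, 35.0, float("nan"), 41.0]
filled = impute_median(ages)
scaled = standardize(filled)
```

The median is used here because it is less sensitive to outliers than the mean; a real pipeline would also need to fit these statistics on training data only and reuse them at inference time.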

### 2. **Adaptive Model Selection**

- Automatic problem type detection (classification/regression)
- Intelligent algorithm recommendations based on data characteristics
- Multi-model comparison and ensemble creation
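
One simple way to approach problem type detection is a heuristic on the target column. The sketch below is an assumption about how this could work, not Trainium's actual logic; `detect_problem_type` and the `max_classes` cutoff are hypothetical.

```python
from numbers import Real


def detect_problem_type(target, max_classes=20):
    """Guess 'classification' or 'regression' from the target values.

    Heuristic (an assumption, not Trainium's rule): non-numeric or boolean
    targets are labels; integer-valued targets with few distinct values are
    labels; everything else is treated as a continuous target.
    """
    values = [v for v in target if v is not None]
    if not values:
        raise ValueError("target column has no observed values")

    # Strings, booleans, and other non-real values can only be class labels.
    if any(isinstance(v, bool) or not isinstance(v, Real) for v in values):
        return "classification"

    unique = set(values)
    all_integral = all(float(v).is_integer() for v in unique)
    if all_integral and len(unique) <= max_classes:
        return "classification"
    return "regression"


print(detect_problem_type(["cat", "dog", "cat"]))
print(detect_problem_type([1.5, 2.7, 3.1]))
```

A cutoff like `max_classes` is inherently a guess: an integer target with many distinct values (for example, counts) is treated as regression, so a real implementation would likely let the user override the detected type.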

### 3. **Advanced Training Pipeline**

- Automated hyperparameter optimization with multiple algorithms:
  - Random search
  - Bayesian optimization
  - Evolutionary algorithms
- Customizable early stopping criteria:
  - No-improvement stopping (patience-based)
  - Threshold-based stopping
  - Time limit stopping
  - Iteration limit stopping
- Cross-validation with configurable strategies
- Learning rate scheduling
- Resource-aware computation scaling
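
The four early stopping criteria listed above could be combined in a single helper along these lines. This is a minimal sketch under stated assumptions: `EarlyStopper`, its parameters, and the returned reason strings are hypothetical names, and scores are assumed to be "higher is better".

```python
import time


class EarlyStopper:
    """Tracks scores and reports which stopping criterion fired, if any."""

    def __init__(self, patience=None, threshold=None, time_limit=None,
                 max_iterations=None, min_delta=0.0):
        self.patience = patience              # steps allowed without improvement
        self.threshold = threshold            # stop once best score reaches this
        self.time_limit = time_limit          # wall-clock budget in seconds
        self.max_iterations = max_iterations  # hard cap on recorded steps
        self.min_delta = min_delta            # minimum gain that counts as progress
        self.best = float("-inf")
        self.stale = 0
        self.iterations = 0
        self.start = time.monotonic()

    def should_stop(self, score):
        """Record a score and return a stop reason string, or None to continue."""
        self.iterations += 1
        if score > self.best + self.min_delta:
            self.best = score
            self.stale = 0
        else:
            self.stale += 1

        if self.threshold is not None and self.best >= self.threshold:
            return "threshold"
        if self.patience is not None and self.stale >= self.patience:
            return "patience"
        if self.max_iterations is not None and self.iterations >= self.max_iterations:
            return "iteration_limit"
        if (self.time_limit is not None
                and time.monotonic() - self.start >= self.time_limit):
            return "time_limit"
        return None


# Usage: stop after two consecutive evaluations without improvement.
stopper = EarlyStopper(patience=2)
reason = None
for step, score in enumerate([0.70, 0.75, 0.74, 0.75, 0.73], start=1):
    reason = stopper.should_stop(score)
    if reason:
        break
```

Returning the reason rather than a bare boolean makes it easy to log why a run ended, which matters when several criteria are active at once.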

### 4. **Comprehensive Evaluation**

- Performance metrics tailored to your problem
- Interpretability and explainability tools
- Bias and fairness assessment
- Continuous improvement feedback loop

### 5. **Production-Ready Deployment**

- Model export in multiple formats
- Serialization and version control
- Inference API generation
- Monitoring and retraining capabilities

## 🛠 Installation (Coming Soon)

```bash
# Not yet available on PyPI
pip install trainium
```

## 🚀 Target Usage

```python
import trainium

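# Load your dataset
dataset = trainium.load_data("your_data.csv")

# Train your model with one line
# (planned API; passing `dataset` here is an assumption, the signature is not final)
model = trainium.AutoTrain(dataset, target="target_column")

# Make predictions on unseen rows (placeholder file name)
new_data = trainium.load_data("new_data.csv")
predictions = model.predict(new_data)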

# Export your model
model.export("my_trained_model")
```

## 📚 Documentation

Documentation is under development. Stay tuned for detailed guides, API references, and examples.

## 🤝 Contributing

We welcome contributions! Check out our [contribution guidelines](CONTRIBUTING.md) to get started.

## 📄 License

Trainium is released under the [MIT License](LICENSE).

## 📊 Development Roadmap (May 17th, 2025)

- [x] Core architecture design
- [x] Base class implementations
- [x] Testing infrastructure
- [x] Data preprocessing modules
- [x] Model selection and training pipeline
- [x] Evaluation framework
- [x] Deployment utilities
- [ ] Documentation and examples

## ✅ Continuous Integration & Code Quality

This project uses [uv](https://github.com/astral-sh/uv) for dependency management and a CI pipeline powered by GitHub Actions. Every push and pull request is automatically tested and linted across Python 3.10–3.12, matching the package's `Requires-Python: >=3.10` constraint:

- **Testing:** Runs all unit tests with `pytest` (see `tests/` for coverage of core, evaluation, preprocessing, and pipeline modules).
- **Linting:** Enforces code quality with [ruff](https://github.com/astral-sh/ruff).
- **Formatting:** Checks code style with [black](https://github.com/psf/black).
- **Typing:** Ensures type correctness with [mypy](https://github.com/python/mypy).
