Metadata-Version: 2.4
Name: paddleformers
Version: 0.4.0
Summary: Easy-to-use and powerful NLP library with Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including Neural Search, Question Answering, Information Extraction and Sentiment Analysis end-to-end system.
Home-page: https://github.com/PaddlePaddle/PaddleFormers
Author: PaddleFormers Team
Author-email: paddleformers@baidu.com
License: Apache 2.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: blobfile
Requires-Dist: colorlog
Requires-Dist: decord
Requires-Dist: scikit-learn
Requires-Dist: multiprocess<=0.70.12.2
Requires-Dist: datasets>=3.0.0
Requires-Dist: tqdm
Requires-Dist: sentencepiece
Requires-Dist: huggingface_hub>=0.19.2
Requires-Dist: protobuf>=3.20.2
Requires-Dist: visualdl
Requires-Dist: safetensors
Requires-Dist: fast_dataindex>=0.1.1; platform_system == "Linux"
Requires-Dist: aistudio-sdk>=0.3.0
Requires-Dist: jinja2
Requires-Dist: regex
Requires-Dist: numpy<=1.26.4
Requires-Dist: tiktoken
Requires-Dist: ml_dtypes
Requires-Dist: tokenizers<=0.20.3; python_version <= "3.8"
Requires-Dist: tokenizers<0.22,>=0.21; python_version > "3.8"
Requires-Dist: omegaconf
Requires-Dist: modelscope
Requires-Dist: transformers>=4.55.1
Requires-Dist: triton>=3.1
Requires-Dist: use_triton_in_paddle
Requires-Dist: GPUtil
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

<p align="center">
  <img src="https://github.com/user-attachments/assets/9d1c1937-7fac-48f8-9d61-f7ac67b61b18" align="middle"  width="500" />
</p>

------------------------------------------------------------------------------------------

<p align="center">
    <a href=""><img src="https://img.shields.io/badge/python-3.8+-aff.svg"></a>
    <a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
    <a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
    <a href="https://github.com/PaddlePaddle/PaddleFormers/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleFormers?color=ccf"></a>
</p>

<h4 align="center">
    <a href=#news> News </a> |
    <a href=#highlights> Highlights </a> |
    <a href=#installation> Installation </a> |
    <a href=#quickstart> Quickstart </a> |
    <a href=#community> Community </a>
</h4>

**PaddleFormers** is a Transformer model library built on the [PaddlePaddle](https://www.paddlepaddle.org.cn) deep learning framework, delivering both **ease of use** and **high-performance capabilities**. It provides a unified model definition interface, modular training components, and comprehensive distributed training strategies specifically designed for large language model development pipelines. This enables developers to train large models efficiently with minimal complexity, making it suitable for diverse scenarios ranging from academic research to industrial applications.

## News

[2025/06/28] 🎉  **PaddleFormers 0.1** is officially released! This initial version supports the SFT/DPO training paradigms and configurable distributed training via a unified Trainer API, and it integrates PEFT, MergeKit, and Quantization APIs for diverse LLM applications.

## Highlights

### ⚙️ Simplified Distributed Training
Implements 4D parallel strategies through a unified Trainer API, lowering the barrier to distributed LLM training.

### 🛠 Efficient Post-Training
Integrates Packing dataflow and [FlashMask](https://arxiv.org/abs/2410.01359) operators for SFT/DPO training, eliminating padding waste and boosting throughput.
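
To illustrate why packing reduces padding waste, here is a minimal, library-independent sketch. It is not the PaddleFormers implementation: the greedy first-fit-decreasing strategy and the names `pack_sequences` and `padding_fraction` are assumptions chosen for illustration, and the sequence lengths are made-up example values.

```python
from typing import List


def pack_sequences(lengths: List[int], max_len: int) -> List[List[int]]:
    """Group sequence indices into rows of at most max_len tokens.

    Hypothetical helper: greedy first-fit-decreasing bin packing.
    """
    rows: List[List[int]] = []   # indices of the sequences placed in each row
    free: List[int] = []         # remaining token capacity of each row
    # Place the longest sequences first so short ones can fill the gaps.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    for i in order:
        n = lengths[i]
        if n > max_len:
            raise ValueError(f"sequence {i} has {n} tokens, more than max_len={max_len}")
        for r, capacity in enumerate(free):
            if n <= capacity:
                rows[r].append(i)
                free[r] -= n
                break
        else:
            # No existing row has room, so open a new one.
            rows.append([i])
            free.append(max_len - n)
    return rows


def padding_fraction(num_rows: int, max_len: int, total_tokens: int) -> float:
    """Fraction of the (num_rows x max_len) batch occupied by padding."""
    return 1.0 - total_tokens / (num_rows * max_len)


lengths = [700, 300, 512, 128, 384, 1000]  # example token counts per sample
max_len = 1024
packed = pack_sequences(lengths, max_len)

# One sample per row (no packing) versus several samples per row (packing).
unpacked_waste = padding_fraction(len(lengths), max_len, sum(lengths))
packed_waste = padding_fraction(len(packed), max_len, sum(lengths))
print(f"rows: {len(lengths)} -> {len(packed)}")
print(f"padding: {unpacked_waste:.1%} -> {packed_waste:.1%}")
```

Packed rows still need an attention mask that keeps tokens from attending across sample boundaries, which is where a mask-aware attention operator such as FlashMask comes in.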

### 💾 Industrial Storage Solution
Features **Unified Checkpoint** storage tools for LLMs, enabling training resumption and dynamic resource scaling. It also implements asynchronous storage (up to 95% faster) and Optimizer State Quantization (78% storage reduction), so that industrial training meets both efficiency and stability requirements.

## Installation

Requires Python 3.8+ and [PaddlePaddle](https://www.paddlepaddle.org.cn/install/quick) 3.1+.

```bash
# Install via pip
pip install paddleformers

# Install development version
git clone https://github.com/PaddlePaddle/PaddleFormers.git
cd PaddleFormers
pip install -e .
```
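
Before installing, it can help to confirm that the environment meets the requirements above. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and it assumes PaddlePaddle was installed under the distribution name `paddlepaddle` or `paddlepaddle-gpu`; other builds may use a different name.

```python
import sys
from importlib import metadata
from typing import Iterable, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version string."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. "post1"
        parts.append(int(digits))
    return tuple(parts)


def installed_version(names: Iterable[str]) -> Optional[str]:
    """Return the version of the first installed distribution in names, if any."""
    for name in names:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


def check_environment() -> dict:
    """Compare the current environment with the stated minimum versions."""
    # Assumed distribution names for PaddlePaddle (CPU and GPU builds).
    paddle = installed_version(["paddlepaddle", "paddlepaddle-gpu"])
    return {
        "python_ok": sys.version_info >= (3, 8),
        "paddle_version": paddle,
        "paddle_ok": paddle is not None and parse_version(paddle) >= (3, 1),
    }


print(check_environment())
```

If `paddle_ok` is `False`, install or upgrade PaddlePaddle following the link above before running `pip install paddleformers`.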

## Quickstart

### Text Generation

This example shows how to load a Qwen model for text generation with the PaddleFormers `Auto API`:


```python
from paddleformers.transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model; convert_from_hf=True requests conversion from Hugging Face weights.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B-Base", dtype="bfloat16", convert_from_hf=True).eval()

# Tokenize the prompt into Paddle tensors ("pd"), then generate up to 128 new tokens.
input_features = tokenizer("Give me a short introduction to large language model.", return_tensors="pd")
outputs = model.generate(**input_features, max_new_tokens=128)
print(tokenizer.batch_decode(outputs[0], skip_special_tokens=True))
```


### SFT Training

The following example runs supervised fine-tuning (SFT) with PaddleFormers:
```python
from paddleformers.trl import SFTConfig, SFTTrainer
from datasets import load_dataset
dataset = load_dataset("ZHUI/alpaca_demo", split="train")

training_args = SFTConfig(output_dir="Qwen/Qwen3-0.6B-SFT", device="gpu", model_init_kwargs={"convert_from_hf": True})
trainer = SFTTrainer(
    args=training_args,
    model="Qwen/Qwen3-0.6B-Base",
    train_dataset=dataset,
)
trainer.train()
```

## Community

We welcome all contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## License
This repository's source code is available under the [Apache 2.0 License](LICENSE).
