Metadata-Version: 2.4
Name: dolma-rust-components
Version: 1.3.0.dev0
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Rust
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Text Processing
Classifier: Typing :: Typed
Summary: Rust components for Dolma - Toolkit for pre-processing LLM training data.
Author-email: Allen Institute for Artificial Intelligence <contact@allenai.org>
License: Apache-2.0
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/allenai/dolma

# dolma-rust-components

Rust components for Dolma, a toolkit for pre-processing large language model training data.

This package contains the low-level Rust implementations behind the Dolma toolkit's high-performance data processing.

