Metadata-Version: 2.4
Name: langframe
Version: 0.3.0
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Dist: duckdb>=1.1.3
Requires-Dist: numpy>=2.0.0
Requires-Dist: openai>=1.60.1
Requires-Dist: pandas>=2.2.2
Requires-Dist: polars>=1.20.0
Requires-Dist: tiktoken>=0.9.0
Requires-Dist: pylance>=0.23.2
Requires-Dist: lancedb>=0.22.0
Requires-Dist: pyarrow>=19.0.0,<20.0.0 ; extra == 'cloud'
Requires-Dist: grpcio>=1.60.0 ; extra == 'cloud'
Requires-Dist: pydantic-settings ; extra == 'cloud'
Provides-Extra: cloud
License-File: LICENSE
Summary: langframe is a Python DataFrame library for processing text data with APIs inspired by PySpark. It includes text-specific utilities and special operators called semantic operators, which use LLMs to batch transform data.
Author-email: typedef <hello@typedef.ai>
License: Apache-2.0
Requires-Python: >=3.10, <3.13
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/typedef-ai/langframe
Project-URL: Issues, https://github.com/typedef-ai/langframe/issues

# langframe

**langframe** is a Python DataFrame library for processing **text data**, with APIs inspired by **PySpark**.  
It includes utilities tailored to working with language and special operations called **semantic operators**, which use large language models (LLMs) to batch-transform data.

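To make the idea of a batch transform concrete, here is a conceptual sketch in plain Python. It is not langframe's API: `batched`, `semantic_map`, and `fake_llm_uppercase` are hypothetical names invented for illustration, and the "model" is a stub that upper-cases text so the example runs without network access. It only shows the general shape of the technique: group rows into batches, call a model once per batch, and keep results aligned with the input rows.

```python
from typing import Callable, Iterable, Iterator


def batched(rows: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive lists of at most `size` rows."""
    batch: list[str] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


def semantic_map(
    rows: Iterable[str],
    transform: Callable[[list[str]], list[str]],
    batch_size: int = 2,
) -> list[str]:
    """Apply `transform` one batch at a time, preserving row order.

    Hypothetical helper: `transform` stands in for a call to an LLM.
    """
    out: list[str] = []
    for batch in batched(rows, batch_size):
        results = transform(batch)
        if len(results) != len(batch):
            raise ValueError("transform must return one result per input row")
        out.extend(results)
    return out


def fake_llm_uppercase(batch: list[str]) -> list[str]:
    """Stub model (assumption): upper-cases each row instead of calling an LLM."""
    return [text.upper() for text in batch]


if __name__ == "__main__":
    print(semantic_map(["a", "b", "c"], fake_llm_uppercase, batch_size=2))
```
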
---

## 📁 Directory Structure

The repository is organized as follows:

```text
langframe/
├── src/langframe/            # Core library
│   ├── api/                  # Public API (e.g., DataFrame, Column, functions)
│   ├── _backends/            # Execution backends: local and cloud (Typedef)
│   ├── _logical_plan/        # Logical plan representation for operators
├── rust/                     # Rust crates compiled with Maturin
├── tests/                    # Test suite structured to match the source layout
```

---

## 🛠️ Development Setup

Local development requires `uv` and a Rust toolchain, since the Rust crates under `rust/` are compiled with Maturin.

### First-Time Setup

From the root of the repo:

```bash
uv run maturin develop --uv
```

This will:

- Create a virtual environment
- Build the Rust crate
- Install Python dependencies
- Set up the package in editable mode

> This command also places the built dynamic Rust library inside `src/langframe`.

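After the build finishes, a quick sanity check can confirm that the environment looks usable. The sketch below is not part of the repository; `check_environment` is a hypothetical helper. It uses only the standard library to test that the interpreter falls within the `>=3.10, <3.13` range declared in the package metadata and that a top-level package named `langframe` can be located.

```python
import importlib.util
import sys


def check_environment(package: str = "langframe") -> dict:
    """Report whether the interpreter version and package look usable.

    Hypothetical helper: the version bounds mirror the package's
    Requires-Python field (>=3.10, <3.13).
    """
    version_ok = (3, 10) <= sys.version_info[:2] < (3, 13)
    # find_spec returns None when a top-level package cannot be located.
    spec = importlib.util.find_spec(package)
    return {
        "python_version_ok": version_ok,
        "package_importable": spec is not None,
    }


if __name__ == "__main__":
    print(check_environment())
```

Saved to a file (the name is up to you), it can be run inside the project environment with `uv run python <file>`.
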
### Making Changes

- To apply changes made to Python code:

  ```bash
  uv sync
  ```

- To apply changes made to Rust code:

  ```bash
  uv run maturin develop --uv
  ```

  Add `--release` (or `-r`) to build the Rust crate in release mode, which compiles more slowly but runs faster.

---

## ✅ Running Tests

Run an individual test file:

```bash
uv run pytest tests/path/to/test_foo.py
```

Run all tests for the **local backend**:

```bash
uv run pytest -m "not cloud" tests
```

Run all tests for the **cloud backend**:

```bash
uv sync --extra cloud        # Installs cloud-specific dependencies
uv run pytest -m cloud tests
```

> ⚠️ Note: All tests require a valid OpenAI API key to be set in the environment.

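Because a missing key tends to surface only as failures partway through a run, it can help to check for it up front. The sketch below is a hypothetical helper, not something shipped in the repository. It assumes the key is supplied as `OPENAI_API_KEY`, the variable the `openai` client reads by default; this README does not name the variable, so adjust it if the project expects a different one.

```python
import os
from typing import Mapping, Optional, Sequence


def missing_env_vars(
    required: Sequence[str] = ("OPENAI_API_KEY",),  # assumed variable name
    environ: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Environment looks ready for the test suite.")
```
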
---

## 📓 Running Notebooks (VSCode / Cursor)

To run demo notebooks:

1. Install the **Jupyter** extension.
2. Add the `.venv` path to **Python: Venv Folders** in VSCode settings:
   - Open settings: `Preferences: Open User Settings`
   - Go to Extensions → Python → **Python: Venv Folders**
3. Open a notebook, select the correct kernel from your virtual environment, and run cells.

> Restart the kernel to pick up any changes made to the `langframe` source.

---

Have questions or want to contribute? Let us know!

