Metadata-Version: 2.4
Name: sequelframe
Version: 0.4.0
Summary: SQL interface for pandas DataFrames using SQLite with modern Python support
Author: Sina Mirshahi
Author-email: Sina Mirshahi <sina7th@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/Sinamirshahi/sequelframe
Project-URL: Repository, https://github.com/Sinamirshahi/sequelframe
Project-URL: Issues, https://github.com/Sinamirshahi/sequelframe/issues
Keywords: pandas,sql,sqlite,dataframe,query
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=1.0.0
Provides-Extra: excel
Requires-Dist: openpyxl>=3.0.0; extra == "excel"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Dynamic: author
Dynamic: license-file

# sequelframe

A lightweight, modern Python library to run SQL queries on pandas DataFrames using SQLite. Execute SQL commands on your DataFrames with ease!

[![Python Version](https://img.shields.io/pypi/pyversions/sequelframe)](https://pypi.org/project/sequelframe/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Features

- Execute SQL queries on pandas DataFrames
- Support for multiple file formats (CSV, Excel, JSON, Parquet, Feather)
- Context manager support for automatic cleanup
- Type hints for better IDE support
- Parameterized queries for SQL injection protection
- Modern Python packaging (pyproject.toml)
- Zero configuration required

## Installation

```bash
pip install sequelframe
```

For Excel file support:
```bash
pip install sequelframe[excel]
```

## Quick Start

```python
from sequelframe import SequelFrame
import pandas as pd

# From a DataFrame
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'city': ['New York', 'San Francisco', 'Los Angeles']
})

# Using context manager (recommended - auto cleanup!)
with SequelFrame(df) as db:
    result = db.query('SELECT * FROM data WHERE age > 25')
    print(result)
```

## Usage

### Initialization

```python
from sequelframe import SequelFrame

# From a pandas DataFrame
db = SequelFrame(df)

# From a CSV file
db = SequelFrame('data.csv')

# From an Excel file
db = SequelFrame('data.xlsx')

# From other formats
db = SequelFrame('data.json')
db = SequelFrame('data.parquet')
db = SequelFrame('data.feather')

# With custom table name
db = SequelFrame(df, table_name='customers')

# With verbose output
db = SequelFrame(df, verbose=True)

# With custom temp directory
db = SequelFrame(df, temp_dir='/tmp/mydb')
```

### Running SQL Queries

#### SELECT Queries

```python
# Basic SELECT
result = db.query('SELECT * FROM data WHERE age > 25')

# Using runsql (alternative method)
result = db.runsql('SELECT name, age FROM data ORDER BY age DESC')

# Parameterized queries (prevents SQL injection)
result = db.query('SELECT * FROM data WHERE name = ?', ('Alice',))
result = db.query('SELECT * FROM data WHERE age > ? AND city = ?', (25, 'New York'))
```

#### INSERT, UPDATE, DELETE

```python
# INSERT
db.runsql("INSERT INTO data (name, age, city) VALUES ('David', 28, 'Boston')")

# UPDATE
db.runsql("UPDATE data SET age = 31 WHERE name = 'Bob'")

# DELETE
db.runsql("DELETE FROM data WHERE age < 25")

# Using parameterized queries (safer)
db.runsql("INSERT INTO data (name, age, city) VALUES (?, ?, ?)",
          ('Eve', 27, 'Seattle'))
```

#### ALTER TABLE

```python
# Add a column
db.runsql('ALTER TABLE data ADD COLUMN salary INTEGER')

# Update the new column
db.runsql('UPDATE data SET salary = 50000 WHERE name = "Alice"')
```

### Utility Methods

```python
# Show all data (optionally limit rows)
db.show()
db.show(limit=10)

# Get the current DataFrame
current_df = db.get_dataframe()

# Update the entire DataFrame
new_df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
db.update_dataframe(new_df)

# Get column names
print(db.columns)  # ['name', 'age', 'city']

# Get shape
print(db.shape)  # (3, 3)

# String representation
print(db)  # SequelFrame(table='data', shape=(3, 3), status='open')
```

### Context Manager (Recommended)

The context manager automatically handles cleanup:

```python
# Automatic cleanup when done
with SequelFrame('data.csv') as db:
    result = db.query('SELECT * FROM data WHERE age > 25')
    db.runsql('UPDATE data SET age = age + 1')
    final_data = db.show()
# Database and temp files are automatically cleaned up here
```

### Manual Cleanup

If not using a context manager:

```python
db = SequelFrame('data.csv')
try:
    result = db.query('SELECT * FROM data')
finally:
    db.kill()  # Clean up database and temp files
```

## Advanced Examples

### Joining Data from Multiple DataFrames

```python
# Create two DataFrames
customers = pd.DataFrame({
    'customer_id': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Charlie']
})

orders = pd.DataFrame({
    'order_id': [101, 102, 103],
    'customer_id': [1, 1, 2],
    'amount': [100, 150, 200]
})

# Load both into SQLite
with SequelFrame(customers, table_name='customers') as db:
    # Add the orders table
    orders.to_sql('orders', db._conn, if_exists='replace', index=False)

    # Join them
    result = db.query('''
        SELECT c.name, COUNT(o.order_id) as order_count, SUM(o.amount) as total
        FROM customers c
        LEFT JOIN orders o ON c.customer_id = o.customer_id
        GROUP BY c.customer_id, c.name
    ''')
    print(result)
```

### Complex Aggregations

```python
with SequelFrame('sales.csv') as db:
    monthly_report = db.query('''
        SELECT
            strftime('%Y-%m', date) as month,
            COUNT(*) as num_sales,
            SUM(amount) as total_sales,
            AVG(amount) as avg_sale,
            MAX(amount) as max_sale
        FROM data
        GROUP BY month
        ORDER BY month DESC
    ''')
```

### Window Functions

```python
with SequelFrame(df) as db:
    result = db.query('''
        SELECT
            name,
            age,
            ROW_NUMBER() OVER (ORDER BY age DESC) as rank,
            AVG(age) OVER () as avg_age
        FROM data
    ''')
```

## New in Version 0.4.0

- Modern packaging with `pyproject.toml`
- Full type hint support
- Context manager support (`with` statement)
- Parameterized queries for security
- Support for more file formats (JSON, Parquet, Feather)
- Better error handling and messages
- New utility methods (`get_dataframe()`, `update_dataframe()`)
- Property accessors (`columns`, `shape`)
- Improved cleanup handling
- Backward compatible with old `sequelframe` class name

## Migration from 0.3.x

The library is fully backward compatible. Your old code will continue to work:

```python
# Old style (still works)
from sequelframe import sequelframe
db = sequelframe('data.csv')
db.runsql('SELECT * FROM data')
db.kill()

# New style (recommended)
from sequelframe import SequelFrame
with SequelFrame('data.csv') as db:
    result = db.query('SELECT * FROM data')
```

## Development

Install development dependencies:

```bash
pip install -e ".[dev]"
```

Run tests:

```bash
pytest
```

Format code:

```bash
black sequelframe/
ruff check sequelframe/
```

Type checking:

```bash
mypy sequelframe/
```

## Why sequelframe?

- **Simple**: Just pass a DataFrame or file path and start querying
- **Powerful**: Full SQL support via SQLite
- **Safe**: Parameterized queries prevent SQL injection
- **Modern**: Type hints, context managers, and modern Python practices
- **Lightweight**: Minimal dependencies (just pandas)
- **Flexible**: Supports multiple file formats

## Limitations

- Uses SQLite, so you're limited to SQLite's SQL dialect
- Data is temporarily copied to a SQLite database (memory overhead)
- Not recommended for very large datasets (consider using DuckDB or Polars instead)

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

MIT License - see [LICENSE](LICENSE) file for details.

## Author

**Sina Mirshahi**
- Email: sina7th@gmail.com
- GitHub: [@Sinamirshahi](https://github.com/Sinamirshahi)

## Links

- [PyPI](https://pypi.org/project/sequelframe/)
- [GitHub Repository](https://github.com/Sinamirshahi/sequelframe)
- [Issue Tracker](https://github.com/Sinamirshahi/sequelframe/issues)
