Metadata-Version: 2.4
Name: flint-core
Version: 0.2.1
Summary: A minimalist, agnostic Python framework to standardize data engineering pipelines.
License: MIT
License-File: LICENSE
Keywords: data-engineering,pyspark,pandas,data-catalog
Author: idperez720
Author-email: ivandavidperez4@gmail.com
Requires-Python: >=3.11,<4.0.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Provides-Extra: all
Provides-Extra: aws
Provides-Extra: azure
Provides-Extra: gcp
Provides-Extra: pandas
Provides-Extra: spark
Requires-Dist: adlfs (>=2024.10.0) ; extra == "all"
Requires-Dist: adlfs (>=2024.10.0) ; extra == "azure"
Requires-Dist: click (>=8.4.1,<9.0.0)
Requires-Dist: gcsfs (>=2024.10.0) ; extra == "all"
Requires-Dist: gcsfs (>=2024.10.0) ; extra == "gcp"
Requires-Dist: pandas (>=3.0.3,<4.0.0) ; extra == "all"
Requires-Dist: pandas (>=3.0.3,<4.0.0) ; extra == "pandas"
Requires-Dist: pyspark (>=4.1.2,<5.0.0) ; extra == "all"
Requires-Dist: pyspark (>=4.1.2,<5.0.0) ; extra == "spark"
Requires-Dist: pyyaml (>=6.0.3,<7.0.0)
Requires-Dist: s3fs (>=2024.10.0) ; extra == "all"
Requires-Dist: s3fs (>=2024.10.0) ; extra == "aws"
Project-URL: Repository, https://github.com/idperez720/flint-core
Description-Content-Type: text/markdown

# Data Engineering Experience 🚀

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![Python Version](https://img.shields.io/badge/python-3.11%20%7C%203.12%20%7C%203.13%20%7C%203.14-blue)](https://www.python.org)

**Flint** is a minimalist, agnostic Python framework designed to streamline and standardize data engineering pipelines. By embracing **Convention over Configuration**, `flint` removes environment friction, hardcoded absolute paths, and manual PySpark session management.

---

## ✨ Key Features

* **Zero-Config File Discovery:** Automatic tree-walking directory resolution anchors your data catalog using your local `pyproject.toml` file.
* **Decentralized Catalog:** Declare your metadata layouts inside modular, self-contained mini-YAML files.
* **Elastic Processing Runtimes:** Switch dynamically between **Pandas** and **PySpark** execution engines using exactly the same unified interface.
* **Interactive CLI Scaffolding:** Spin up a new production-ready data directory structure instantly with `flint init`.
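
The tree-walking discovery mentioned in the first feature can be pictured as a walk from the current directory up through its parents until a `pyproject.toml` is found. The following is a minimal sketch of that idea, not Flint's actual implementation; the function name `find_project_root` and its `marker` parameter are hypothetical.

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = "pyproject.toml") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found.

    Hypothetical helper for illustration; Flint's own resolver may differ.
    """
    current = start.resolve()
    # Check the starting directory first, then each parent up to the filesystem root.
    for candidate in (current, *current.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"No {marker} found at or above {current}")
```

Calling `find_project_root(Path.cwd())` from a notebook nested several folders deep would return the same root as calling it from the top-level directory, which is what lets relative catalog paths work from anywhere in the project.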

---

## 📦 Installation

*(Once published to PyPI)*
```bash
pip install flint-core
```
Or install it directly from the source repository using Poetry:
```bash
poetry add git+https://github.com/idperez720/data-engineering-exp.git
```

---

## 🏁 Quick Start

### 1. Initialize your workspace

Navigate to an empty directory and let the interactive wizard scaffold the workspace conventions:

```bash
flint init
```

### 2. Declare a dataset

Add a specification block inside `conf/catalog/sample_dataset.yaml`:

```yaml
customers:
  description: "Main production customer data"
  format: "csv"
  engine: "pandas"
  storage_path: "data/sample_table.csv"
```
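
Once parsed, an entry like the one above is a plain mapping, and its relative `storage_path` has to be anchored somewhere, presumably at the project root. The sketch below shows one way this could work, under stated assumptions: `CatalogEntry`, `parse_entry`, `REQUIRED_KEYS`, and the `/projects/demo` root are all hypothetical, and which keys Flint actually treats as mandatory is not documented here.

```python
from dataclasses import dataclass
from pathlib import Path

# Assumption for this sketch: these three keys are treated as mandatory.
REQUIRED_KEYS = ("format", "engine", "storage_path")


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical in-memory form of one dataset declaration."""

    name: str
    format: str
    engine: str
    storage_path: Path
    description: str = ""


def parse_entry(name: str, spec: dict, project_root: Path) -> CatalogEntry:
    """Validate one parsed YAML entry and resolve its path."""
    missing = [key for key in REQUIRED_KEYS if key not in spec]
    if missing:
        raise KeyError(f"Dataset '{name}' is missing keys: {', '.join(missing)}")
    path = Path(spec["storage_path"])
    # Relative paths are anchored at the project root; absolute paths are kept as-is.
    if not path.is_absolute():
        path = project_root / path
    return CatalogEntry(
        name, spec["format"], spec["engine"], path, spec.get("description", "")
    )


# This mapping mirrors the YAML entry above after parsing.
catalog = {
    "customers": {
        "description": "Main production customer data",
        "format": "csv",
        "engine": "pandas",
        "storage_path": "data/sample_table.csv",
    }
}
entry = parse_entry("customers", catalog["customers"], Path("/projects/demo"))
```

Remote locations such as `s3://` URIs would need different handling than `pathlib.Path`, so this sketch covers local paths only.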

### 3. Load data anywhere

Create a Python script or open a Jupyter Notebook inside `src/notebooks/` and fetch your data instantly:

```python
from flint_core.core.io import DataLoader

# Autodiscovers your project root boundaries and settings
loader = DataLoader()

# Loads the dataset as a Pandas DataFrame, per `engine: "pandas"` in the catalog entry
df = loader.load("customers")
df.head()
```
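
The `engine` key in the catalog entry decides which runtime performs the read. One way such dispatch could be structured is a small registry that maps engine names to reader functions. This is an illustrative sketch only, not `DataLoader`'s internals: `READERS`, `register`, `load`, and the `"stdlib"` engine are hypothetical, and the `"stdlib"` reader exists only so the example runs without the optional extras installed.

```python
import csv
from pathlib import Path
from typing import Callable

# Hypothetical registry: maps an engine name to a function that reads a file.
READERS: dict[str, Callable[[Path], object]] = {}


def register(engine: str):
    """Decorator that records a reader function under an engine name."""
    def decorator(func):
        READERS[engine] = func
        return func
    return decorator


@register("stdlib")
def read_with_stdlib(path: Path) -> list[dict]:
    # Stand-in engine so the sketch runs without pandas or pyspark.
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle))


@register("pandas")
def read_with_pandas(path: Path):
    # Imported lazily so the optional extra is only required when this engine is used.
    import pandas as pd

    return pd.read_csv(path)


def load(path: Path, engine: str):
    """Look up the reader for `engine` and apply it to `path`."""
    try:
        reader = READERS[engine]
    except KeyError:
        raise ValueError(
            f"Unknown engine '{engine}'; known: {sorted(READERS)}"
        ) from None
    return reader(path)
```

With this shape, adding a PySpark reader would mean registering one more function, while calling code keeps using the same `load(path, engine)` signature.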

---

## 📖 Complete Documentation

For comprehensive guides, testing architecture deep-dives, and complete API references, visit our documentation site:
👉 **[http://127.0.0.1:8000/](http://127.0.0.1:8000/)** *(Replace with your deployed docs URL, e.g., GitHub Pages)*

---

## ⚖️ License

Distributed under the **MIT License**. All copies or substantial portions of the software (including forks) must retain the original copyright notice and permission notice, which includes the warranty disclaimer. See `LICENSE` for more information.

