Metadata-Version: 2.4
Name: featurecanvas
Version: 0.1.1
Summary: Python SDK for FeatureCanvas — the no-code feature engineering studio
Author-email: "G. Preetham Saxon" <gpreethamsaxon@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/GPREETHAMSAXON/FeatureCanvas
Project-URL: Documentation, https://github.com/GPREETHAMSAXON/FeatureCanvas#readme
Keywords: feature-engineering,machine-learning,data-science,featurecanvas
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: requests>=2.31.0

<div align="center">

# ⌁ FeatureCanvas

**No-code feature engineering studio with live impact scoring, leakage guardrails, and AI-powered suggestions.**

[![Live Demo](https://img.shields.io/badge/Live%20Demo-feature--canvas.vercel.app-orange?style=for-the-badge)](https://feature-canvas.vercel.app)
[![PyPI](https://img.shields.io/pypi/v/featurecanvas?style=for-the-badge&color=orange)](https://pypi.org/project/featurecanvas/)
[![License](https://img.shields.io/badge/license-MIT-orange?style=for-the-badge)](LICENSE)
[![Python](https://img.shields.io/badge/python-3.9%2B-orange?style=for-the-badge)](https://pypi.org/project/featurecanvas/)

[**Live Demo**](https://feature-canvas.vercel.app) · [**API Docs**](https://featurecanvas.onrender.com/docs) · [**SDK on PyPI**](https://pypi.org/project/featurecanvas/)

</div>

---

## What is FeatureCanvas?

FeatureCanvas is a visual, drag-and-drop feature engineering studio. Think KNIME or RapidMiner, but with several capabilities those tools do not offer out of the box:

| Feature | KNIME / RapidMiner | FeatureCanvas |
|---|---|---|
| Drag-and-drop transform canvas | ✅ | ✅ |
| Live predictive impact score per node | ❌ | ✅ |
| Leakage guardrails (rule-based) | ❌ | ✅ |
| AI Copilot with stat-grounded suggestions | ❌ | ✅ |
| No lock-in — plain pandas/sklearn export | ❌ | ✅ |
| Real DAG branching (siblings isolated) | Partial | ✅ |
| Path-scoped leakage (per branch) | ❌ | ✅ |

---

## Features

### 🎯 Live Predictive Impact Score
Every applied transform node shows mutual information and correlation against your target column in real time — directly on the node face, not buried in a separate view.
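
As a rough illustration of the two numbers shown on a node, the sketch below computes a Pearson correlation and a discrete mutual information estimate between one feature and a binary target, using only the standard library. The function names, the equal-width binning, and the sample data are assumptions for illustration; the backend's scoring may estimate these quantities differently (for example with scikit-learn's estimators).

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def equal_width_bins(xs, n_bins=4):
    """Discretise a numeric column into equal-width bin indices."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [0] * len(xs)
    width = (hi - lo) / n_bins
    return [min(int((x - lo) / width), n_bins - 1) for x in xs]


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Hypothetical sample: income rises with the churn label.
income = [10, 12, 11, 13, 40, 42, 41, 43]
churned = [0, 0, 0, 0, 1, 1, 1, 1]

corr = pearson(income, churned)
mi = mutual_information(equal_width_bins(income), churned)
print(f"correlation={corr:.3f}  mutual_information={mi:.3f}")
```

Binning is needed here because this simple estimator only handles discrete values; the number of bins is a tuning choice and changes the estimate.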

### 🛡️ Leakage Guardrails
Rule-based detection fires on the patterns that silently destroy model performance in production:
- **Fit-before-split** — scaling/encoding fitted on the full dataset before a train/test split
- **Groupby self-inclusion** — target encoding where each row can see its own label
- **Target-touched transforms** — derived features that are direct functions of the target
- **Target binned as feature** — the target column itself discretised into the feature set

All findings are **path-scoped** — a violation on Branch A never appears in Branch B's leakage panel.
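
To make the groupby self-inclusion pattern concrete, the sketch below compares a naive per-group target mean with a leave-one-out variant in plain Python. The helper names and the toy rows are hypothetical, and leave-one-out is one common remedy, not necessarily what FeatureCanvas generates or recommends.

```python
from collections import defaultdict


def naive_target_mean(groups, targets):
    """Per-group target mean; each row's own label leaks into its feature."""
    sums, counts = defaultdict(float), defaultdict(int)
    for g, t in zip(groups, targets):
        sums[g] += t
        counts[g] += 1
    return [sums[g] / counts[g] for g in groups]


def leave_one_out_target_mean(groups, targets, fallback):
    """Per-group target mean that excludes the current row's label."""
    sums, counts = defaultdict(float), defaultdict(int)
    for g, t in zip(groups, targets):
        sums[g] += t
        counts[g] += 1
    encoded = []
    for g, t in zip(groups, targets):
        if counts[g] > 1:
            encoded.append((sums[g] - t) / (counts[g] - 1))
        else:
            encoded.append(fallback)  # singleton group: no other rows to average
    return encoded


city = ["A", "A", "B", "C", "C", "C"]
churned = [1, 0, 1, 0, 0, 1]
global_mean = sum(churned) / len(churned)

naive = naive_target_mean(city, churned)
loo = leave_one_out_target_mean(city, churned, fallback=global_mean)

# City "B" has a single row, so the naive feature equals that row's label.
print(naive[2], churned[2])
# The leave-one-out feature for the same row falls back to the global mean.
print(loo[2])
```

Even with leave-one-out, the group statistics should be computed on training rows only; otherwise the fit-before-split pattern still applies.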

### 🤖 AI Feature Copilot (Claude-powered)
Click **Suggest transforms** and Claude profiles your dataset's actual column statistics, then returns ranked, explainable suggestions constrained to real executable transform keys. There are no hallucinated column names and no free text for the user to translate.
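
One way to enforce the "real executable transform keys" constraint is to validate every model suggestion against a whitelist before it reaches the UI. The sketch below assumes a hypothetical suggestion shape (`transform`, `column`, `reason`) and uses a subset of the transform keys listed in this README; the actual Copilot contract may differ.

```python
# Subset of the transform keys documented in the Transforms section.
KNOWN_TRANSFORMS = {"log", "sqrt", "standard_scale", "onehot", "fillna"}


def filter_suggestions(suggestions, dataset_columns):
    """Keep only suggestions whose transform key and column both exist."""
    columns = set(dataset_columns)
    valid = []
    for s in suggestions:
        if s.get("transform") in KNOWN_TRANSFORMS and s.get("column") in columns:
            valid.append(s)
    return valid


# Hypothetical model output: the second and third entries are not executable.
raw = [
    {"transform": "log", "column": "monthly_income", "reason": "right-skewed"},
    {"transform": "log", "column": "salary", "reason": "column does not exist"},
    {"transform": "magic_boost", "column": "age", "reason": "unknown key"},
]

accepted = filter_suggestions(raw, ["monthly_income", "age", "churned"])
print([(s["transform"], s["column"]) for s in accepted])
```

Filtering after generation complements prompt-side constraints: even if the model drifts, nothing unexecutable is shown to the user.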

### 🔓 No Lock-in Code Export
Every transform has a matching `codegen()` function. Select any node → click Code → get a standalone Python script with plain pandas/numpy/sklearn that runs anywhere with zero FeatureCanvas dependency.
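
The pairing of a transform with its `codegen()` counterpart might look roughly like the sketch below. The `LogTransform` class, its method names, and the choice of `log1p` are assumptions made for illustration; only the idea of emitting plain pandas/numpy source comes from the description above.

```python
import math
from dataclasses import dataclass


@dataclass
class LogTransform:
    """Hypothetical transform: adds a `<column>_log` feature."""

    column: str

    def apply(self, rows):
        """Run the transform in-process on a list of row dicts."""
        out_col = f"{self.column}_log"
        return [{**row, out_col: math.log1p(row[self.column])} for row in rows]

    def codegen(self):
        """Emit equivalent standalone source with no FeatureCanvas import."""
        return (
            "import numpy as np\n"
            f"df['{self.column}_log'] = np.log1p(df['{self.column}'])\n"
        )


node = LogTransform(column="monthly_income")
rows = node.apply([{"monthly_income": 0.0}, {"monthly_income": 99.0}])
print(rows[0]["monthly_income_log"])
print(node.codegen())
```

Keeping `apply` and `codegen` side by side on the same object makes it easier to notice when the executed logic and the exported script drift apart.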

### 🌿 Real DAG Branching
Two branches off the same parent are genuinely independent — their dataframes are resolved by walking the real ancestor chain, not by replaying a global ordered list. Sibling nodes never contaminate each other's column dropdowns, impact scores, or leakage findings.
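
Resolving a node by walking its ancestor chain can be sketched as follows. The `Node` structure, the `resolve` function, and the dict-of-columns stand-in for a dataframe are hypothetical; they illustrate why siblings stay isolated and are not a reproduction of the backend's graph engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Frame = Dict[str, list]  # stand-in for a dataframe: column name -> values


@dataclass
class Node:
    node_id: str
    parent_id: Optional[str]
    fn: Callable[[Frame], Frame]


def resolve(node_id: str, nodes: Dict[str, Node], source: Frame) -> Frame:
    """Rebuild a node's frame by applying only its own ancestors, root first."""
    chain = []
    current: Optional[str] = node_id
    while current is not None:
        chain.append(nodes[current])
        current = nodes[current].parent_id
    frame = dict(source)
    for node in reversed(chain):
        frame = node.fn(frame)
    return frame


def add_column(name: str, src: str, f: Callable[[float], float]):
    """Build a transform that derives `name` from `src` without mutating input."""
    return lambda frame: {**frame, name: [f(v) for v in frame[src]]}


source = {"age": [16.0, 25.0]}
nodes = {
    "double": Node("double", None, add_column("age_x2", "age", lambda v: v * 2)),
    "sqrt": Node("sqrt", None, add_column("age_sqrt", "age", lambda v: v ** 0.5)),
    "plus1": Node("plus1", "double", add_column("age_x2_p1", "age_x2", lambda v: v + 1)),
}

# Each branch sees only the columns produced along its own path.
print(sorted(resolve("plus1", nodes, source)))
print(sorted(resolve("sqrt", nodes, source)))
```

Because `resolve` never consults a global application order, adding or removing the `sqrt` branch cannot change what `plus1` sees.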

### 💾 Session Persistence
Sessions survive backend restarts via **Upstash Redis** (DataFrames stored as Parquet bytes, node graphs as JSON). Canvas layout and sparklines are restored from `localStorage` on page reload.

---

## Tech Stack

| Layer | Technology |
|---|---|
| Frontend | React 18 + Vite + react-flow + Tailwind |
| Backend | FastAPI + Python 3.11 |
| ML Engine | pandas, scikit-learn, scipy |
| AI | Anthropic Claude API |
| Session Store | Upstash Redis (Parquet + JSON) |
| Deploy | Vercel (frontend) + Render (backend) |
| SDK | Pure Python, `pip install featurecanvas` |

---

## Transforms (21)

### Numeric
`log` · `sqrt` · `standard_scale` · `minmax_scale` · `robust_scale` · `power_transform` · `clip_outliers` · `abs_value` · `binning` · `column_ratio` · `column_diff`

### Categorical
`onehot` · `label_encode` · `frequency_encode` · `rare_group`

### Datetime
`datetime_decompose`

### Relational
`groupby_agg`

### Cleaning
`fillna` · `drop_column` · `rename_column`

---

## Python SDK

```bash
pip install featurecanvas
```

```python
from featurecanvas import FeatureCanvas

fc = FeatureCanvas("https://featurecanvas.onrender.com")

# Upload a CSV and set the target
session = fc.upload("train.csv")
session.set_target("churned")

# Build a pipeline
log_node = session.apply("log", column="monthly_income")
scaled = log_node.apply("standard_scale", column="monthly_income_log")

# Branch off an earlier node
sqrt_node = session.apply("sqrt", column="age")  # sibling of log_node

# Inspect
print(scaled.columns())         # ['monthly_income_log_scaled', ...]
print(scaled.leakage())         # leakage findings scoped to this branch
print(scaled.scores())          # MI scores vs target

# Check for leakage before shipping
risky = session.apply("groupby_agg",
    group_column="city", agg_column="churned", agg_func="mean")
if risky.has_leakage("high"):
    print("HIGH leakage detected:", risky.leakage())

# Export clean Python — no FeatureCanvas dependency
print(scaled.code())
```

---

## Local Development

### Prerequisites
- Python 3.11+
- Node.js 18+
- Upstash Redis account (free tier works)
- Anthropic API key (for Copilot)

### Backend

```bash
cd backend
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # Mac/Linux
pip install -r requirements.txt
```

Create `backend/.env`:
```
ANTHROPIC_API_KEY=sk-ant-...
UPSTASH_REDIS_URL=rediss://default:...@....upstash.io:6379
```

```bash
uvicorn app.main:app --reload --port 8000
```

### Frontend

```bash
cd frontend
npm install
npm run dev
```

Open `http://localhost:5173` in your browser.

### Tests

```bash
# Backend (44 tests)
cd backend && python -m pytest tests/ -v

# Frontend integration (requires backend running)
cd frontend && npx vitest run --testTimeout=20000
```

---

## Deployment

| Service | Config file | Notes |
|---|---|---|
| Render (backend) | `backend/render.yaml` | Set `ANTHROPIC_API_KEY`, `UPSTASH_REDIS_URL`, `FRONTEND_URL` |
| Vercel (frontend) | `frontend/vercel.json` | Set `VITE_API_BASE_URL=https://your-render-url/api` |

---

## Project Structure

```
featurecanvas/
├── backend/
│   ├── app/
│   │   ├── engine/          # Core ML engine
│   │   │   ├── transforms.py    # 21 transforms with codegen
│   │   │   ├── leakage.py       # Rule-based leakage detection
│   │   │   ├── impact_scoring.py # MI / correlation scoring
│   │   │   ├── graph.py         # DAG resolution engine
│   │   │   ├── session_store.py # Redis-backed persistence
│   │   │   ├── copilot.py       # Claude AI suggestions
│   │   │   ├── codegen.py       # Python script export
│   │   │   └── profiling.py     # Column statistics
│   │   ├── main.py          # FastAPI endpoints
│   │   └── schemas.py       # Pydantic models
│   ├── tests/               # 44 backend tests
│   ├── requirements.txt
│   └── render.yaml
├── frontend/
│   ├── src/
│   │   ├── App.jsx          # Main canvas + state
│   │   ├── nodes/           # SourceNode, TransformNode, TargetNode
│   │   ├── components/      # Sidebar, panels, LeakagePanel, etc.
│   │   └── api/client.js    # Axios API client
│   └── vercel.json
├── sdk/                     # featurecanvas PyPI package
│   ├── featurecanvas/
│   │   ├── __init__.py
│   │   └── client.py
│   └── pyproject.toml
└── featurecanvas_test_data.csv
```

---

## Built by

**G. Preetham Saxon** — B.Tech CSE @ VIIT Visakhapatnam · IEEE Student Branch Vice Chairperson · AI Product Engineer

[![GitHub](https://img.shields.io/badge/GitHub-GPREETHAMSAXON-black?style=flat-square&logo=github)](https://github.com/GPREETHAMSAXON)
[![LinkedIn](https://img.shields.io/badge/LinkedIn-Connect-blue?style=flat-square&logo=linkedin)](https://linkedin.com/in/gpreethamsaxon)

---

## License

MIT © G. Preetham Saxon
