Metadata-Version: 2.4
Name: ridegenie
Version: 0.1.3
Summary: One-line installer for the India RideGenie Databricks demo (data + Genie + AI/BI dashboard + app).
Author: Anuj Lathi
License-Expression: MIT
Project-URL: Homepage, https://github.com/anuj1303/ridegenie-demo
Keywords: databricks,genie,ai-bi,geospatial,demo,dbdemos
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: databricks-sdk>=0.33.0
Dynamic: license-file

# 🛺 ridegenie — India RideGenie demo, installed in one line

[![PyPI](https://img.shields.io/pypi/v/ridegenie.svg)](https://pypi.org/project/ridegenie/)
[![Python](https://img.shields.io/pypi/pyversions/ridegenie.svg)](https://pypi.org/project/ridegenie/)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)

A [`dbdemos`](https://github.com/databricks-demos/dbdemos)-style installer for the **India RideGenie**
demo: natural-language **geospatial analytics** on synthetic Indian ride-hailing data —
**200K trips · 160+ cities · ₹ fares · UPI payments**.

One call provisions the whole thing into *your* Databricks workspace:

| # | Asset | What it does |
|---|-------|--------------|
| 1 | **Unity Catalog data layer** | `rides_enriched`, `ride_zones`, `pickup_hotspots` (~200K rides, PIN-code geo) |
| 2 | **Genie Space** | Ask in plain English — Genie writes governed SQL (₹, cities, surge, payments) |
| 3 | **AI/BI Dashboard** | INR KPIs, demand-by-hour, surge, vehicle/payment mix + a live pickup-hotspot map |
| 4 | **RideGenie App** | FastAPI + Leaflet — **draw a region** on an India map and ask Genie about it |

---

## Quick start (Databricks notebook)

```python
%pip install ridegenie
dbutils.library.restartPython()   # resets the notebook's Python state

# Run the lines below in a separate cell, after the restart.
import ridegenie
ridegenie.install("india-ridegenie")
```

> Want the bleeding edge (unreleased `main`)? `%pip install git+https://github.com/anuj1303/ridegenie-demo.git`

> **Not on Free Edition?** The default catalog `workspace` only exists on Free Edition. On any other
> workspace, pass your own catalog: `ridegenie.install("india-ridegenie", catalog="your_catalog")`.
> (If you skip this, the installer fails fast and lists the catalogs available to you.)

That's it. The installer prints links to your Genie Space, Dashboard, and App when it finishes.

> **Prerequisite:** a running **SQL Warehouse** (the *Serverless Starter Warehouse* is enough).
> Works on Databricks **Free Edition** and any AWS/Azure/GCP workspace.

Prefer not to write code? Import [`notebooks/00-START-HERE-RideGenie.py`](notebooks/00-START-HERE-RideGenie.py)
into your workspace and **Run All**.
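
The fail-fast catalog check described above can be pictured with a small sketch. This is not the installer's actual code: `check_catalog` is a hypothetical helper, and the list of catalog names is passed in by hand here (in a real workspace it could come from something like `WorkspaceClient().catalogs.list()` in `databricks-sdk`).

```python
def check_catalog(requested: str, available: list[str]) -> str:
    """Return the requested catalog if it is visible, else fail with a helpful list.

    Hypothetical helper for illustration; the real installer may behave differently.
    """
    if requested in available:
        return requested
    listing = ", ".join(sorted(available)) or "(none visible)"
    raise ValueError(
        f"Catalog '{requested}' not found. Available catalogs: {listing}. "
        'Pass one explicitly, e.g. install("india-ridegenie", catalog="your_catalog").'
    )


# Example: the default "workspace" catalog is missing on this (made-up) workspace.
try:
    check_catalog("workspace", ["main", "samples"])
except ValueError as err:
    print(err)
```

Failing before any data is uploaded keeps a wrong default from leaving half-created assets behind.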

---

## What `install()` does (à la dbdemos)

1. **Resolves context** — current user, workspace host, and a SQL warehouse (auto-picks a running one, or starts/creates one).
2. **Builds the data layer** — creates the schema + volume, uploads the bundled (gzipped) CSVs, and runs a templated `setup.sql` to build the tables/views.
3. **Creates the Genie Space** — registers the tables with format-assistance + entity-matching and applies domain instructions (INR, surge, payments).
4. **Creates & publishes the AI/BI dashboard** — one shared dataset drives cross-filtering across KPIs, charts, and the hotspot map.
5. **Deploys the app** — creates the app, uploads the source, attaches `sql-warehouse` + `genie` resources (IDs auto-injected), grants the app's service principal data access, and deploys.

Everything is **workspace-portable**: table names, the warehouse, and the Genie/app resource IDs are all resolved at install time via `{{CATALOG}}` / `{{SCHEMA}}` / `{{VOLUME}}` templating. No hard-coded IDs.
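
To make the templating idea concrete, here is a minimal sketch of `{{NAME}}`-style substitution. It assumes plain string replacement with a strict check for missing values. `render_template` is a hypothetical name, and the real `setup.sql` rendering may work differently.

```python
import re

# Matches placeholders such as {{CATALOG}} or {{SCHEMA}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def render_template(text: str, values: dict[str, str]) -> str:
    """Replace every {{NAME}} in text; raise KeyError if a value is missing."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError("no value supplied for placeholder {{" + name + "}}")
        return values[name]

    return PLACEHOLDER.sub(substitute, text)


template = "SELECT COUNT(*) FROM {{CATALOG}}.{{SCHEMA}}.rides_enriched"
rendered = render_template(
    template,
    {"CATALOG": "workspace", "SCHEMA": "india_rides_workshop"},
)
print(rendered)
# SELECT COUNT(*) FROM workspace.india_rides_workshop.rides_enriched
```

Raising on an unknown placeholder, rather than leaving it in the SQL, surfaces a misconfigured install before any statement reaches the warehouse.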

---

## Options

```python
ridegenie.install(
    "india-ridegenie",
    catalog="workspace",              # target UC catalog
    schema="india_rides_workshop",    # target UC schema
    warehouse_id=None,                # default: first running/available warehouse
    install_app=True,                 # set False to skip the FastAPI app
    profile=None,                     # only when running locally (Databricks CLI profile)
    overwrite=False,
)
```
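
The `warehouse_id=None` default ("first running/available warehouse") can be read as a simple preference order. The sketch below is an assumption about that order, not the installer's code: `pick_warehouse` is hypothetical, and warehouses are modelled as plain dicts with `id` and `state` keys instead of SDK objects.

```python
from typing import Optional


def pick_warehouse(warehouses: list[dict], warehouse_id: Optional[str] = None) -> dict:
    """Pick a warehouse: an explicit ID wins, then the first RUNNING one, then the first listed."""
    if warehouse_id is not None:
        for wh in warehouses:
            if wh["id"] == warehouse_id:
                return wh
        raise LookupError(f"warehouse {warehouse_id!r} not found")
    if not warehouses:
        raise LookupError("no SQL warehouse available")
    running = [wh for wh in warehouses if wh["state"] == "RUNNING"]
    return running[0] if running else warehouses[0]


# Made-up IDs for illustration only.
candidates = [
    {"id": "wh-stopped", "state": "STOPPED"},
    {"id": "wh-running", "state": "RUNNING"},
]
print(pick_warehouse(candidates)["id"])                 # wh-running
print(pick_warehouse(candidates, "wh-stopped")["id"])   # wh-stopped
```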

Run **locally** (outside a notebook) against a CLI profile:

```python
import ridegenie
ridegenie.install("india-ridegenie", profile="FreeEdition")
```

---

## Package layout

```
ridegenie/
├── conf/india-ridegenie.json     # demo manifest (the "bundle_config")
├── installer.py                  # the Installer orchestrator + public install()
├── sql/setup.sql                 # templated data-layer DDL
├── genie/instructions.md         # Genie domain instructions
├── dashboards/india_ridegenie.lvdash.json   # AI/BI dashboard definition
├── data/*.csv.gz                 # pre-built synthetic data (uploaded to a Volume)
└── app/                          # FastAPI + Leaflet app source
notebooks/00-START-HERE-RideGenie.py   # one-click bootstrap notebook
scripts/                          # build_zones.py / generate_rides.py (regenerate the data)
```

## Regenerating the data (optional)

The repo ships pre-built data. To rebuild it from source (GeoNames India PIN codes):

```bash
cd scripts
python3 build_zones.py       # downloads GeoNames, builds ride_zones.csv
python3 generate_rides.py    # builds rides.csv (200K trips)
gzip -c ride_zones.csv > ../ridegenie/data/ride_zones.csv.gz
gzip -c rides.csv      > ../ridegenie/data/rides.csv.gz
```
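
If the `gzip` CLI is not available (for example on Windows), the two compression steps can be done from Python instead. This is a sketch using only the standard library. `gzip_copy` is a hypothetical helper, not part of the repo's `scripts/`, and it assumes you run it from the `scripts` directory as above.

```python
import gzip
import shutil
from pathlib import Path


def gzip_copy(src: Path, dest: Path) -> None:
    """Write a gzip-compressed copy of src to dest, leaving src untouched (like `gzip -c`)."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with open(src, "rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)


if __name__ == "__main__":
    for name in ("ride_zones.csv", "rides.csv"):
        src = Path(name)
        if src.exists():
            gzip_copy(src, Path("..") / "ridegenie" / "data" / f"{name}.gz")
        else:
            print(f"skipping {name}: run the build scripts first")
```

The compressed bytes may differ from the CLI's output (the gzip header stores metadata such as a timestamp), but the files decompress to the same CSV content.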

---

*Built for Databricks Field Engineering. Synthetic data — no PII. PIN codes from [GeoNames](https://www.geonames.org/) (India).*
