Metadata-Version: 2.4
Name: otat
Version: 0.1.1
Summary: Vision-Language Model Interpretability Analysis - One Token at a Time
Author-email: Your Name <your.email@example.com>
License: MIT
Project-URL: Homepage, https://github.com/varungupta31/otat_api
Project-URL: Repository, https://github.com/varungupta31/otat_api
Project-URL: Documentation, https://github.com/varungupta31/otat_api#readme
Keywords: interpretability,vision-language-models,attention,llava,qwen
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: torch>=2.0.0
Requires-Dist: torchvision
Requires-Dist: transformers==4.57.0
Requires-Dist: accelerate
Requires-Dist: pandas
Requires-Dist: numpy
Requires-Dist: pillow
Requires-Dist: pyyaml
Requires-Dist: tqdm
Requires-Dist: qwen_vl_utils
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: isort>=5.12.0; extra == "dev"
Requires-Dist: flake8>=6.0.0; extra == "dev"
Provides-Extra: api
Requires-Dist: fastapi>=0.104.0; extra == "api"
Requires-Dist: uvicorn>=0.24.0; extra == "api"
Dynamic: requires-python

# Interpretability Core

A toolkit for Vision-Language Model interpretability: analyze attention patterns in models such as LLaVA and Qwen-VL, one token at a time.

## Installation

### From PyPI (when published)
```bash
pip install otat
```

### From GitHub
```bash
pip install git+https://github.com/varungupta31/otat_api.git
```

### Local Development
```bash
git clone https://github.com/varungupta31/otat_api.git
cd otat_api
pip install -e .
```

## Quick Start
```python
from interpretability.api.wrapper import InterpretabilityAnalyzer

# Initialize analyzer
analyzer = InterpretabilityAnalyzer(
    model_type="llava_onevision",
    model_id="llava-hf/llava-onevision-qwen2-0.5b-ov-hf"
)

# Run analysis
result = analyzer.analyze(
    image_path="path/to/image.jpg",
    task_text="What is in this image?",
    instruction="Answer briefly.",
    blocking_mode="none",
    num_tokens=20
)

print(result['output_tokens'])
print(result['series'])  # Attention patterns
```
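The `result` dictionary pairs the generated tokens with their attention series. A minimal post-processing sketch (assuming `series` holds one image-attention score per generated token — the exact schema may differ in your version) could rank tokens by how strongly they attend to the image:

```python
import numpy as np

# Hypothetical result dict, shaped like the Quick Start output above;
# the field contents here are made-up illustration values.
result = {
    "output_tokens": ["A", "cat", "on", "a", "mat"],
    "series": [0.12, 0.48, 0.05, 0.03, 0.31],  # assumed: one score per token
}

# Rank generated tokens by attention score, highest first.
scores = np.asarray(result["series"])
order = np.argsort(scores)[::-1]
top = [(result["output_tokens"][i], float(scores[i])) for i in order[:3]]
print(top)
```

This kind of ranking is a quick sanity check that content words (here, nouns) draw more image attention than function words.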

## Features

- 🔍 Attention pattern analysis for VLMs
- 🎯 Support for LLaVA, Qwen-VL, and Qwen2-LLM
- 🚫 Attention blocking experiments
- 📊 Token-level attention aggregation

## License

MIT


## Project Structure

The repository should look like this:


```
otat_api/  (or interpretability-core/)
├── pyproject.toml          # Packaging metadata
├── README.md
├── LICENSE                 # Optional but recommended
├── .gitignore             # Recommended
├── src/
│   └── interpretability/
│       ├── __init__.py    # Required for the package to be importable
│       ├── api/
│       │   ├── __init__.py
│       │   └── wrapper.py
│       ├── models/
│       │   ├── __init__.py
│       │   ├── base_handler.py
│       │   ├── llava_onevision.py
│       │   ├── qwen2_llm.py
│       │   └── qwen_25_vl.py
│       ├── tasks/
│       │   ├── __init__.py
│       │   ├── base_task.py
│       │   ├── web_api_task.py
│       │   └── ...
│       ├── analysis/
│       │   ├── __init__.py
│       │   ├── extractor_aggregated.py
│       │   └── ...
│       ├── utils/
│       │   ├── __init__.py
│       │   └── ...
│       └── prompts/
│           └── ...
└── scripts/
    └── run_analysis.py
```
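For reference, a minimal `pyproject.toml` consistent with the metadata above (src layout, setuptools backend) might look like the following — a sketch, not the project's actual file, with the dependency list abbreviated:

```toml
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "otat"
version = "0.1.1"
requires-python = ">=3.10"
dependencies = [
    "torch>=2.0.0",
    "transformers==4.57.0",
]

[tool.setuptools.packages.find]
where = ["src"]
```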
