Metadata-Version: 2.4
Name: auto-swagger
Version: 0.1.1
Summary: Automatically generate Swagger/OpenAPI documentation for Express.js APIs using NLP and LLMs
Project-URL: Homepage, https://github.com/restartdk/auto_swagger
Project-URL: Repository, https://github.com/restartdk/auto_swagger
Project-URL: Issues, https://github.com/restartdk/auto_swagger/issues
Author: restartdk, paulopasso, albipuliga
License: MIT
License-File: LICENSE
Keywords: api-documentation,automation,documentation-generator,express,expressjs,llm,nlp,openapi,swagger
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Internet :: WWW/HTTP :: HTTP Servers
Classifier: Topic :: Software Development :: Documentation
Requires-Python: >=3.12
Requires-Dist: accelerate==1.6.0
Requires-Dist: annotated-types==0.7.0
Requires-Dist: bitsandbytes==0.42.0
Requires-Dist: blis==1.3.0
Requires-Dist: catalogue==2.0.10
Requires-Dist: certifi==2025.1.31
Requires-Dist: cffi==1.17.1
Requires-Dist: charset-normalizer==3.4.1
Requires-Dist: click==8.1.8
Requires-Dist: cloudpathlib==0.21.0
Requires-Dist: confection==0.1.5
Requires-Dist: cryptography==44.0.2
Requires-Dist: cymem==2.0.11
Requires-Dist: datasets>=2.14.4
Requires-Dist: deprecated==1.2.18
Requires-Dist: filelock==3.18.0
Requires-Dist: fsspec==2025.3.2
Requires-Dist: gitdb==4.0.12
Requires-Dist: gitpython==3.1.44
Requires-Dist: huggingface-hub>=0.34.0
Requires-Dist: idna==3.10
Requires-Dist: jinja2==3.1.6
Requires-Dist: langcodes==3.5.0
Requires-Dist: language-data==1.3.0
Requires-Dist: marisa-trie==1.2.1
Requires-Dist: markdown-it-py==3.0.0
Requires-Dist: markupsafe==3.0.2
Requires-Dist: mdurl==0.1.2
Requires-Dist: mpmath==1.3.0
Requires-Dist: murmurhash==1.0.12
Requires-Dist: networkx==3.4.2
Requires-Dist: numpy==2.2.4
Requires-Dist: packaging==25.0
Requires-Dist: peft>=0.15.2
Requires-Dist: pillow==11.2.1
Requires-Dist: preshed==3.0.9
Requires-Dist: psutil==7.0.0
Requires-Dist: pycparser==2.22
Requires-Dist: pydantic-core==2.33.1
Requires-Dist: pygithub==2.6.1
Requires-Dist: pygments==2.19.1
Requires-Dist: pyjwt==2.10.1
Requires-Dist: pynacl==1.5.0
Requires-Dist: pyyaml==6.0.2
Requires-Dist: regex==2024.11.6
Requires-Dist: requests==2.32.3
Requires-Dist: rich==14.0.0
Requires-Dist: safetensors==0.5.3
Requires-Dist: scipy==1.15.2
Requires-Dist: setuptools==78.1.0
Requires-Dist: shellingham==1.5.4
Requires-Dist: smart-open==7.1.0
Requires-Dist: smmap==5.0.2
Requires-Dist: spacy-legacy==3.0.12
Requires-Dist: spacy-loggers==1.0.5
Requires-Dist: spacy>=3.8.6
Requires-Dist: srsly==2.5.1
Requires-Dist: sympy==1.13.1
Requires-Dist: thinc==8.3.6
Requires-Dist: tokenizers==0.21.1
Requires-Dist: torch==2.6.0
Requires-Dist: torchvision==0.21.0
Requires-Dist: tqdm==4.67.1
Requires-Dist: transformers>=4.55.0
Requires-Dist: typer==0.15.2
Requires-Dist: typing-extensions==4.13.2
Requires-Dist: typing-inspection==0.4.0
Requires-Dist: urllib3==2.4.0
Requires-Dist: wasabi==1.1.3
Requires-Dist: weasel==0.4.1
Requires-Dist: wheel>=0.45.1
Requires-Dist: wrapt==1.17.2
Description-Content-Type: text/markdown

# Auto Swagger Documentation Generator

A tool that automatically generates Swagger/OpenAPI documentation for Express.js APIs by combining NLP preprocessing with LLMs.

## Overview

This project combines Natural Language Processing (NLP) techniques with Large Language Models (LLMs) to automatically generate high-quality API documentation. By preprocessing code with NLP before sending it to LLMs, we achieve:

- Better context understanding
- Reduced token usage
- More specific and higher quality responses
- Fine-grained control over the documentation pipeline
- Automated codebase updates

## Features

- Automatic API route detection
- Intelligent parameter inference
- Response schema generation
- Validation rules detection
- Swagger/OpenAPI compliant output
- Support for Express.js routes
- Automated documentation insertion
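
To make route detection and parameter inference concrete, here is a minimal sketch that finds Express.js route declarations with a regular expression. It is an illustration only, not the project's parser: the pattern, the `find_routes` and `path_params` helpers, and the assumption that routes are declared on an object named `app` or `router` with a string-literal path are all hypothetical.

```python
import re

# Matches calls such as app.get('/users', ...) or router.post("/items/:id", ...).
# Assumption: the path is a string literal and the object is named `app` or `router`.
ROUTE_PATTERN = re.compile(
    r"\b(?:app|router)\.(get|post|put|patch|delete)\(\s*(['\"`])(.+?)\2"
)


def find_routes(source: str) -> list[tuple[str, str]]:
    """Return (HTTP method, path) pairs found in Express.js source text."""
    return [(m.group(1).upper(), m.group(3)) for m in ROUTE_PATTERN.finditer(source)]


def path_params(path: str) -> list[str]:
    """Return the names of Express-style `:param` segments in a route path."""
    return re.findall(r":(\w+)", path)


sample = """
const router = express.Router();
router.get('/users/:id', getUser);
router.post("/users", createUser);
"""

for method, path in find_routes(sample):
    print(method, path, path_params(path))
```

Extracting only the method, path, and parameter names like this is one way a preprocessing step could shrink the text passed to the LLM.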

## Architecture

![Architecture Diagram](attachments/image.png)

## Setup

1. Clone the repository:

```bash
git clone https://github.com/restartdk/auto_swagger.git
cd auto_swagger
```

2. Install [uv](https://astral.sh/uv) if you haven't already:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

3. Create a virtual environment and install dependencies:

```bash
# Create the virtual environment
uv venv

# Install the dependencies declared in pyproject.toml
uv sync
```

4. Run the auto-swagger tool:

```bash
# Run the main documentation generator
uv run auto-swagger --repo-path path/to/express/app

# Run the fine-tuning tool (if needed)
uv run finetune
```

## CLI Usage

### Basic Usage

```bash
# Generate documentation for the current directory
uv run auto-swagger

# Generate documentation for a specific repository
uv run auto-swagger --repo-path /path/to/express/app
```

### Command Line Arguments

- `--repo-path` (optional): Path to the repository root. Defaults to current working directory.

  ```bash
  uv run auto-swagger --repo-path /Users/username/projects/my-api
  ```

- `--branch` (optional): Branch to check for unmerged changes. Defaults to current branch.

  ```bash
  uv run auto-swagger --repo-path /path/to/repo --branch feature/new-endpoints
  ```

- `--model` (optional): Hugging Face model name to use for generation. Overrides the default model in config.

  ```bash
  # Use Google Gemma model
  uv run auto-swagger --repo-path /path/to/repo --model "google/gemma-2-2b-it"
  
  # Use DeepSeek Coder model
  uv run auto-swagger --repo-path /path/to/repo --model "deepseek-ai/deepseek-coder-1.3b-instruct"
  ```

- `--lora-adapter` (optional): LoRA adapter ID from Hugging Face. Pass `none` to disable the LoRA adapter and use the base model only.

  ```bash
  # Use a custom LoRA adapter
  uv run auto-swagger --repo-path /path/to/repo --lora-adapter "username/my-adapter"
  
  # Disable LoRA adapter (use base model only)
  uv run auto-swagger --repo-path /path/to/repo --lora-adapter none
  ```
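
The `none` convention for `--lora-adapter` can be pictured as a small normalization step. The sketch below is an assumption about how such a value might be resolved, not the tool's actual code: `resolve_lora_adapter` is a hypothetical helper, and treating `None`/`NONE` the same as `none` is an added assumption.

```python
from typing import Optional

# Default adapter ID taken from the LLMConfig shown in the Configuration section.
DEFAULT_ADAPTER = "paulopasso/auto-swagger"


def resolve_lora_adapter(cli_value: Optional[str]) -> Optional[str]:
    """Map a --lora-adapter argument to an adapter ID, or None for base model only."""
    if cli_value is None:
        # Flag omitted: keep the configured default adapter.
        return DEFAULT_ADAPTER
    if cli_value.strip().lower() == "none":
        # Explicit opt-out: run the base model without an adapter.
        return None
    # Anything else is treated as a Hugging Face adapter ID.
    return cli_value


print(resolve_lora_adapter(None))
print(resolve_lora_adapter("none"))
print(resolve_lora_adapter("username/my-adapter"))
```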

### Example Commands

```bash
# Full example with all options
uv run auto-swagger \
  --repo-path "/Users/username/projects/my-api" \
  --branch "main" \
  --model "google/gemma-2-2b-it" \
  --lora-adapter none

# Use default model but disable LoRA adapter
uv run auto-swagger \
  --repo-path "/path/to/repo" \
  --lora-adapter none

# Use a different model with custom LoRA adapter
uv run auto-swagger \
  --repo-path "/path/to/repo" \
  --model "google/gemma-2-2b-it" \
  --lora-adapter "username/custom-adapter"
```

## Project Structure

```text
auto_swagger/
├── src/
│   └── auto_swagger/         # Source code
│       ├── config/          # Configuration management
│       ├── finetune/        # Model fine-tuning utilities
│       ├── parser/          # Code parsing and analysis
│       └── swagger_generator/ # Documentation generation
├── data/                    # Project data
│   ├── jsdocs_finetune.jsonl # Fine-tuning dataset
│   └── swagger_docs/        # Generated documentation
├── pyproject.toml          # Project configuration
└── README.md
```

## Configuration

### Default Model Configuration

The default model settings are defined in an `LLMConfig` dataclass:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLMConfig:
    model_name: str = "deepseek-ai/deepseek-coder-1.3b-instruct"
    lora_adapter_id: Optional[str] = "paulopasso/auto-swagger"  # Default LoRA adapter
    max_new_tokens: int = 8192
    temperature: float = 0.2
    top_k: int = 50
    top_p: float = 0.95
    max_retries: int = 3
```
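
As a rough illustration of what these fields control, the sketch below collects the decoding-related ones into a dictionary whose keys match keyword arguments accepted by Hugging Face `generate` (`max_new_tokens`, `temperature`, `top_k`, `top_p`, `do_sample`). The `generation_kwargs` helper is hypothetical, and whether the project passes its settings this way is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class LLMConfig:
    model_name: str = "deepseek-ai/deepseek-coder-1.3b-instruct"
    lora_adapter_id: Optional[str] = "paulopasso/auto-swagger"
    max_new_tokens: int = 8192
    temperature: float = 0.2
    top_k: int = 50
    top_p: float = 0.95
    max_retries: int = 3


# Fields that influence decoding rather than model selection or retries.
SAMPLING_FIELDS = ("max_new_tokens", "temperature", "top_k", "top_p")


def generation_kwargs(config: LLMConfig) -> dict:
    """Hypothetical helper: build decoding keyword arguments from the config."""
    values = asdict(config)
    kwargs = {name: values[name] for name in SAMPLING_FIELDS}
    # temperature, top_k and top_p only take effect when sampling is enabled.
    kwargs["do_sample"] = config.temperature > 0
    return kwargs


print(generation_kwargs(LLMConfig()))
```

A low `temperature` such as 0.2 keeps the output close to the most likely tokens, which tends to suit structured output like Swagger annotations.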

### Overriding Configuration via CLI

You can override the model and LoRA adapter settings using command-line arguments (see [CLI Usage](#cli-usage) above) without modifying the code:

```bash
# Use a different model
uv run auto-swagger --repo-path /path/to/repo --model "google/gemma-2-2b-it"

# Disable LoRA adapter
uv run auto-swagger --repo-path /path/to/repo --lora-adapter none

# Use custom model and adapter
uv run auto-swagger --repo-path /path/to/repo \
  --model "google/gemma-2-2b-it" \
  --lora-adapter "username/my-adapter"
```

### Advanced Configuration

For advanced settings such as `temperature`, `top_k`, and `top_p`, edit `src/auto_swagger/swagger_generator/generator_config.py`.
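
If you would rather experiment without changing the defaults in that file, `dataclasses.replace` from the standard library can derive a modified copy of a dataclass instance. The sketch below uses a trimmed stand-in for the config (`SamplingConfig` is a hypothetical name with only the sampling fields); how the real config object is constructed and passed around in the project is not shown here.

```python
from dataclasses import dataclass, replace


@dataclass
class SamplingConfig:
    # Trimmed stand-in carrying the documented default values.
    temperature: float = 0.2
    top_k: int = 50
    top_p: float = 0.95
    max_retries: int = 3


defaults = SamplingConfig()

# replace() returns a new instance; the original defaults stay unchanged.
exploratory = replace(defaults, temperature=0.7, top_k=100)

print(defaults)
print(exploratory)
```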

## Future Improvements

- Support for additional backend frameworks beyond Express.js
- Local CLI version without GitHub app dependency
- Enhanced pattern recognition
- Additional documentation formats
- Real-time documentation updates
