Metadata-Version: 2.4
Name: vuln-scanner-ai
Version: 2.0.0
Summary: AI-powered vulnerability scanner supporting 19 vulnerability categories across 15+ languages
License: MIT
Project-URL: Homepage, https://github.com/yourusername/vulnerability-scanner
Project-URL: Issues, https://github.com/yourusername/vulnerability-scanner/issues
Keywords: security,vulnerability,scanner,sast,ai,code-review,static-analysis,appsec
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: openai>=1.0.0
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: license-file

# AI-Powered Vulnerability Scanner

An intelligent security analysis tool that combines **pattern-based detection** with **AI-powered deep code review** to identify vulnerabilities across multiple programming languages.

## Features

- **Pattern-Based Detection**: Fast, pre-compiled regex scanning across 19 vulnerability categories
  - SQL Injection
  - Cross-Site Scripting (XSS)
  - Hardcoded Credentials (passwords, API keys, AWS/GitHub tokens, private keys)
  - Command Injection (including `shell=True`, `child_process.exec`)
  - Path Traversal
  - Weak Cryptography (MD5/SHA1, DES/RC4/ECB, disabled TLS)
  - Insecure Random Number Generation
  - Insecure Deserialization (pickle, yaml.load, PHP unserialize, Java ObjectInputStream)
  - Server-Side Request Forgery (SSRF)
  - Open Redirect
  - XML External Entity (XXE)
  - LDAP Injection
  - Template Injection / SSTI (Jinja2, EJS, Pug)
  - Prototype Pollution (JavaScript/TypeScript)
  - Regular Expression Denial of Service (ReDoS)
  - Sensitive Data Exposure
  - Insecure HTTP Headers (wildcard CORS, clickjacking)
  - Race Conditions / TOCTOU
  - Code Quality Issues (bare `except`, `eval()`, debug mode, `assert` for auth, security TODOs)

- **AI-Powered Code Review**: Deep analysis using LLMs for issues beyond pattern matching
  - Authentication & authorization flaws
  - Mass assignment / Insecure Direct Object Reference (IDOR)
  - Business logic vulnerabilities
  - Per-file risk score (`safe` / `risk` / `critical`) and one-line AI summary
  - Actionable remediation recommendations with code examples

- **Multi-Language Support**: Python, JavaScript, TypeScript (JSX/TSX), Java, PHP, Go, Rust, Ruby, C/C++, C#, Kotlin, Swift, Scala, Shell

- **Optimized Engine**:
  - Regex patterns compiled once at startup for fast repeated scanning
  - Comment-line skipping to reduce false positives
  - Built-in deduplication of findings
  - 2 MB file-size guard to skip minified/generated files
  - Configurable minimum-severity filter

- **Flexible Output**: Console, JSON, or save to file
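
The engine behaviors listed above (compile-once patterns, comment-line skipping, deduplication, and the file-size guard) can be pictured with a minimal sketch. Every name here (`PATTERNS`, `scan_text`, `MAX_FILE_BYTES`) and the single regex are illustrative assumptions, not the scanner's actual rules or API.

```python
import re

# Hypothetical rule table; the real scanner ships its own patterns.
# Compiling at import time means each regex is built only once.
PATTERNS = {
    "Hardcoded Credentials": re.compile(
        r"\b(password|api_key|secret)\s*=\s*['\"][^'\"]+['\"]",
        re.IGNORECASE,
    ),
}
MAX_FILE_BYTES = 2 * 1024 * 1024  # mirrors the 2 MB guard described above
COMMENT_PREFIXES = ("#", "//")    # assumed comment markers for this sketch


def scan_text(source: str) -> list:
    """Return sorted, de-duplicated (line_number, rule_name) findings."""
    if len(source.encode("utf-8")) > MAX_FILE_BYTES:
        return []  # too large: likely minified or generated
    findings = set()  # a set silently drops duplicate findings
    for number, line in enumerate(source.splitlines(), start=1):
        if line.lstrip().startswith(COMMENT_PREFIXES):
            continue  # skip comment-only lines to reduce false positives
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.add((number, name))
    return sorted(findings)


sample = '# password = "in-a-comment"\napi_key = "sk-abc123secretvalue"\n'
print(scan_text(sample))
```

Only the second line of `sample` is reported, because the first is a comment line.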

## Installation

### Option 1: PyPI (recommended — plug & play)

```bash
pip install vuln-scanner-ai
```

That's it. The `vulnscan` CLI command is immediately available:

```bash
vulnscan scan ./myproject
```

### Option 2: Clone from source

```bash
git clone https://github.com/yourusername/vulnerability-scanner.git
cd vulnerability-scanner
pip install -e .
```

### Setting Up OpenAI API Key (for AI Features)

AI analysis is **optional**. Pattern-based scanning works with no API key.

**Linux / macOS:**
```bash
export OPENAI_API_KEY=sk-your-actual-api-key-here
```

**Windows (PowerShell):**
```powershell
$env:OPENAI_API_KEY="sk-your-actual-api-key-here"
```

**Windows (Command Prompt):**
```cmd
set OPENAI_API_KEY=sk-your-actual-api-key-here
```

**Or via `.env` file (when working from a source checkout):**
```bash
cp .env.example .env
# Edit .env and set OPENAI_API_KEY=your_key
```
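
When driving the scanner from a script, it can help to confirm that the key is visible to the process before requesting AI analysis. The helper name `ai_enabled` is hypothetical and not part of the package; it only reads the `OPENAI_API_KEY` variable described above.

```python
import os


def ai_enabled(environ=None) -> bool:
    """Return True when a non-empty OPENAI_API_KEY is present."""
    env = os.environ if environ is None else environ
    return bool(env.get("OPENAI_API_KEY", "").strip())


# Falls back to pattern-only scanning when the key is missing.
print("AI analysis available:", ai_enabled())
```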

## GitHub Action Usage

Add a security scan to **any repository** with a single step — no setup required:

```yaml
# .github/workflows/security.yml
name: Security Scan

on: [push, pull_request]

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run Vulnerability Scanner
        uses: yourusername/vulnerability-scanner@v2
        with:
          path: '.'
          min-severity: 'MEDIUM'       # only report MEDIUM and above
          fail-on-severity: 'HIGH'     # fail the build on HIGH/CRITICAL findings
```

### With AI-Powered Analysis

```yaml
      - name: Run AI Vulnerability Scanner
        uses: yourusername/vulnerability-scanner@v2
        with:
          path: '.'
          use-ai: 'true'
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          ai-model: 'gpt-4'
          min-severity: 'LOW'
          output-format: 'json'
          output-file: 'security-report.json'
          fail-on-severity: 'CRITICAL'
```

### Action Inputs

| Input | Default | Description |
|-------|---------|-------------|
| `path` | `.` | File or directory to scan |
| `min-severity` | `LOW` | Minimum severity to report: `LOW` \| `MEDIUM` \| `HIGH` \| `CRITICAL` |
| `use-ai` | `false` | Enable AI-powered deep analysis |
| `openai-api-key` | — | OpenAI API key (use GitHub Secrets) |
| `ai-model` | `gpt-4` | Model to use for AI analysis |
| `output-format` | `text` | `text` or `json` |
| `output-file` | — | Save report to this path (also uploaded as artifact) |
| `fail-on-severity` | `HIGH` | Fail the step if this severity is found (`none` to disable) |
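
The `min-severity` and `fail-on-severity` inputs both rely on an ordering of the four severity levels. The sketch below shows one way such thresholds could be evaluated, assuming "fail on `HIGH`" means `HIGH` or anything more severe, as the workflow comment above suggests. The names `SEVERITY_ORDER`, `at_or_above`, and `should_fail` are hypothetical, not the action's internals.

```python
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def at_or_above(severity: str, threshold: str) -> bool:
    """True when `severity` is at least as severe as `threshold`."""
    rank = SEVERITY_ORDER.index
    return rank(severity.upper()) >= rank(threshold.upper())


def should_fail(found_severities, fail_on: str) -> bool:
    """Decide whether a CI step should fail for the given findings."""
    if fail_on.lower() == "none":
        return False  # `none` disables the failure gate
    return any(at_or_above(s, fail_on) for s in found_severities)


findings = ["LOW", "MEDIUM", "CRITICAL"]
reported = [s for s in findings if at_or_above(s, "MEDIUM")]
print(reported, should_fail(findings, "HIGH"))
```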

### Action Outputs

| Output | Description |
|--------|-------------|
| `total-vulnerabilities` | Total number of findings |
| `report-path` | Path of the saved report file |

## Usage

After `pip install vuln-scanner-ai`, the `vulnscan` command is available globally.
When running from source, replace `vulnscan` with `python main.py`.

### Basic Pattern-Based Scan

```bash
vulnscan scan ./myproject
vulnscan scan ./myproject/app.py       # single file
```

### AI-Powered Deep Scan

```bash
vulnscan scan ./myproject --ai
vulnscan scan ./myproject/app.py --ai
```

### Advanced Options

```bash
# Filter output — only MEDIUM and above
vulnscan scan ./myproject --min-severity MEDIUM

# Specific AI model
vulnscan scan ./myproject --ai --model gpt-3.5-turbo

# JSON output
vulnscan scan ./myproject --ai --json

# Save report to file
vulnscan scan ./myproject --ai --output report.txt

# Full combination
vulnscan scan ./myproject --ai --model gpt-4 --min-severity MEDIUM --json --output report.json
```

### All CLI Flags

| Flag | Default | Description |
|------|---------|-------------|
| `path` | *(required)* | File or directory to scan |
| `--ai` | off | Enable AI-powered analysis |
| `--model` | `gpt-4` | OpenAI model to use |
| `--min-severity` | `LOW` | Filter: `LOW` \| `MEDIUM` \| `HIGH` \| `CRITICAL` |
| `--json` | off | Output as JSON instead of text |
| `--output` / `-o` | *(stdout)* | Save report to this file |
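
To call the CLI from a Python script (for example in a custom CI job), the documented flags can be assembled into an argument list. This is a sketch: `build_command` and `run_scan` are hypothetical helper names, and only the flags from the table above are used. The CLI's exit codes are not documented here, so the sketch does not interpret them.

```python
import subprocess


def build_command(path, ai=False, model=None, min_severity=None,
                  json_output=False, output=None):
    """Assemble a `vulnscan scan` invocation from the documented flags."""
    cmd = ["vulnscan", "scan", path]
    if ai:
        cmd.append("--ai")
    if model:
        cmd += ["--model", model]
    if min_severity:
        cmd += ["--min-severity", min_severity]
    if json_output:
        cmd.append("--json")
    if output:
        cmd += ["--output", output]
    return cmd


def run_scan(path, **options):
    """Run the scan; an argument list (no shell) avoids command injection."""
    return subprocess.run(build_command(path, **options),
                          capture_output=True, text=True)


print(" ".join(build_command("./myproject", ai=True, model="gpt-4",
                             min_severity="MEDIUM", json_output=True,
                             output="report.json")))
```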

## Output Examples

### Pattern-Based Scan Output
```
================================================================================
VULNERABILITY SCAN REPORT
================================================================================
Total vulnerabilities found: 5

Summary by Severity:
  CRITICAL: 2
  HIGH: 2
  MEDIUM: 1

Summary by Vulnerability Type:
  Hardcoded Credentials: 2
  SQL Injection: 1
  Command Injection: 1
  Weak Cryptography: 1

File: ./app.py
--------------------------------------------------------------------------------
  Line 12: [CRITICAL] Hardcoded Credentials
  Description: Hardcoded credentials or secret detected
  Code: api_key = "sk-abc123secretvalue"
  Recommendation: Store secrets in environment variables or a secrets manager.

  Line 45: [HIGH] SQL Injection
  Description: Potential SQL injection via string concatenation
  Code: cursor.execute("SELECT * FROM users WHERE id = " + user_id)
  Recommendation: Use parameterized queries: cursor.execute("SELECT ... WHERE id=%s", (user_id,))
```
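
The parameterized-query recommendation in the report above can be tried with the standard-library `sqlite3` module. This is a minimal sketch: the table and function names are made up for the demo, and `sqlite3` uses `?` placeholders where the report's example shows the `%s` style used by drivers such as psycopg2.

```python
import sqlite3

# In-memory demo database with two users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                 [(1, "alice"), (2, "bob")])


def find_user_unsafe(user_id: str):
    # Vulnerable: the input becomes part of the SQL text itself.
    return conn.execute(
        "SELECT name FROM users WHERE id = " + user_id
    ).fetchall()


def find_user_safe(user_id: str):
    # Bound parameter: the value is never parsed as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchall()


payload = "1 OR 1=1"
print(len(find_user_unsafe(payload)))  # injected condition matches every row
print(len(find_user_safe(payload)))    # payload is treated as a plain value
```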

### AI-Powered Scan Output
```
================================================================================
AI-POWERED VULNERABILITY SCAN REPORT
================================================================================

Scan Summary:
  Total files scanned: 15
  Files with vulnerabilities: 4
  Total vulnerabilities found: 9

Severity Breakdown:
  CRITICAL: 3
  HIGH: 4
  MEDIUM: 2

================================================================================
DETAILED FINDINGS
================================================================================

[FILE] [CRITICAL] ./app.py
  AI Assessment: File contains multiple critical injection vulnerabilities requiring immediate remediation.
--------------------------------------------------------------------------------

  [CRITICAL] SQL Injection
  Line(s): [45]
  Description: User input directly concatenated into SQL query
  Recommendation: Use parameterized queries: cursor.execute("SELECT * FROM t WHERE id=%s", (id,))
  Code: query = "SELECT * FROM users WHERE id = " + user_id

  [HIGH] Server-Side Request Forgery (SSRF)
  Line(s): [78]
  Description: HTTP request built from user-controlled input
  Recommendation: Validate and whitelist allowed URLs/hostnames before making outbound requests.
```
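
The SSRF recommendation above (validate outbound URLs against a list of permitted hosts) might look like the following sketch. `ALLOWED_HOSTS` and `is_allowed_url` are hypothetical names, and the check covers only scheme and hostname; it does not address redirects or DNS rebinding.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allowlist for this demo


def is_allowed_url(url: str) -> bool:
    """Accept only HTTPS URLs whose hostname is explicitly allowlisted."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL (for example, a broken IPv6 literal)
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS


print(is_allowed_url("https://api.example.com/v1/items"))
print(is_allowed_url("http://169.254.169.254/latest/meta-data/"))
print(is_allowed_url("https://api.example.com@evil.test/"))
```

Comparing the parsed `hostname` rather than searching the raw string is what defeats the `user@host` trick in the last call.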

## Programmatic Usage

You can also use the scanner as a Python library:

```python
from vuln_scanner import VulnerabilityScanner
from ai_agent import VulnerabilityAI

# Pattern-based scanning
scanner = VulnerabilityScanner()
vulnerabilities = scanner.scan_directory("./myproject")
report = scanner.generate_report(vulnerabilities)
print(report)

# AI-powered scanning
ai_agent = VulnerabilityAI(model="gpt-4")
results = ai_agent.deep_scan_directory("./myproject", use_ai=True)
ai_report = ai_agent.generate_ai_report(results)
print(ai_report)
```

## Vulnerability Types Detected

| # | Type | Severity | Languages | Description |
|---|------|----------|-----------|-------------|
| 1 | SQL Injection | HIGH / CRITICAL | Python, Java, JS, PHP | String concatenation or f-string interpolation in SQL queries |
| 2 | Cross-Site Scripting (XSS) | HIGH | JS, TS, PHP | `innerHTML`, `document.write`, `dangerouslySetInnerHTML` with user input |
| 3 | Hardcoded Credentials | CRITICAL | All | Passwords, API keys, AWS keys, GitHub tokens, private keys in source |
| 4 | Command Injection | HIGH / CRITICAL | Python, PHP, Ruby, Java, JS | `shell=True`, `exec/system/popen`, `child_process.exec` with user data |
| 5 | Path Traversal | MEDIUM / HIGH | Python, PHP, Ruby, JS, Java | User-controlled file paths, `../` sequences, unsafe `send_file` |
| 6 | Weak Cryptography | MEDIUM / HIGH | Most | MD5/SHA1/CRC32, DES/RC4/ECB/3DES, disabled TLS certificate verification |
| 7 | Insecure Random | MEDIUM | Python, JS, PHP | `random()` / `Math.random()` used for security-sensitive tokens |
| 8 | Insecure Deserialization | CRITICAL | Python, PHP, Java, Ruby | `pickle.loads`, `yaml.load`, PHP `unserialize`, Java `ObjectInputStream` |
| 9 | SSRF | HIGH | Python, JS, PHP | HTTP requests constructed from user-controlled URLs |
| 10 | Open Redirect | MEDIUM | Python, PHP, Ruby, JS | `redirect()` with unvalidated user-supplied URL |
| 11 | XXE | HIGH | Python, Java, PHP | XML parsers without external entity protection |
| 12 | LDAP Injection | HIGH | Python, PHP | `ldap_search`/`ldap_bind` with unsanitized user input |
| 13 | Template Injection (SSTI) | CRITICAL | Python, JS | `render_template_string`, `ejs.render`, `pug` with user data |
| 14 | Prototype Pollution | HIGH | JS, TS | `__proto__`, `Object.assign` misuse with untrusted input |
| 15 | ReDoS | MEDIUM | Python, JS | Catastrophically backtracking regex patterns |
| 16 | Sensitive Data Exposure | HIGH / CRITICAL | Most | Logging secrets, AWS access keys, GitHub PATs committed to source |
| 17 | Insecure HTTP Headers | MEDIUM | Most | Wildcard CORS (`*`), `X-Frame-Options: ALLOW` |
| 18 | Race Condition | LOW / MEDIUM | Python | TOCTOU file checks, unguarded shared threading state |
| 19 | Code Quality Issues | LOW – HIGH | Most | Bare `except`, `eval()`, debug mode on, `assert` for auth, security TODOs |
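
For two of the categories in the table (Insecure Random and Path Traversal), the safer alternatives are available in the Python standard library. The sketch below is illustrative only; `new_session_token` and `is_inside` are hypothetical helper names, not part of the scanner.

```python
import secrets
from pathlib import Path


def new_session_token() -> str:
    """Use the `secrets` module, not `random`, for security tokens."""
    return secrets.token_urlsafe(32)


def is_inside(base_dir: str, user_path: str) -> bool:
    """True when `user_path` resolves to a location under `base_dir`."""
    base = Path(base_dir).resolve()
    target = (base / user_path).resolve()  # collapses any `../` segments
    return target == base or base in target.parents


print(is_inside("/srv/app/uploads", "img/a.png"))
print(is_inside("/srv/app/uploads", "../../etc/passwd"))
```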

## Configuration

Edit `config.py` to customize:

- Excluded directories
- File extensions to scan
- Maximum file size
- Default AI model
- Severity threshold
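
As a rough picture of what such settings could look like, here is a hypothetical layout. The variable names and the directory and extension values are assumptions made for illustration; check the actual `config.py` for the real names and defaults.

```python
# Hypothetical config.py layout; the real file's names and values may differ.
EXCLUDED_DIRS = {".git", "node_modules", "__pycache__", "venv"}
SCAN_EXTENSIONS = {".py", ".js", ".ts", ".java", ".php", ".go", ".rs", ".rb"}
MAX_FILE_SIZE_BYTES = 2 * 1024 * 1024  # the 2 MB guard noted under Features
DEFAULT_AI_MODEL = "gpt-4"             # the documented --model default
MIN_SEVERITY = "LOW"                   # the documented --min-severity default
```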

## Requirements

- Python 3.8+
- openai>=1.0.0 (for AI features)

## Limitations

- Pattern-based detection may produce false positives
- AI analysis requires an OpenAI API key and incurs API usage costs
- Large codebases may take significant time with AI analysis
- Not a replacement for professional security audits

## Security Best Practices

This tool helps identify potential vulnerabilities but should be used as part of a broader security strategy:

1. Regular security audits
2. Dependency scanning (e.g., `npm audit`, `pip-audit`)
3. Static Application Security Testing (SAST)
4. Dynamic Application Security Testing (DAST)
5. Penetration testing

## License

Released under the MIT License (see the `LICENSE` file). The tool is provided as-is for educational and development purposes.

## Deployment to GitHub

### For Developers: Publishing Your Own Fork

1. **Create a GitHub Repository**
   - Go to [https://github.com/new](https://github.com/new)
   - Repository name: `vulnerability-scanner` (or your preferred name)
   - Description: AI-Powered Vulnerability Scanner for Code
   - Make it Public (recommended) or Private
   - Don't initialize with README (we already have one)
   - Click "Create repository"

2. **Push Your Code to GitHub**
   ```bash
   # Navigate to your project directory (replace with your own path)
   cd path/to/vulnerability-scanner

   # Add all files to git
   git add .

   # Create initial commit
   git commit -m "Initial commit: AI-Powered Vulnerability Scanner"

   # Add remote repository (replace with your username)
   git remote add origin https://github.com/yourusername/vulnerability-scanner.git

   # Push to GitHub
   git branch -M main
   git push -u origin main
   ```

3. **Verify on GitHub**
   - Visit your repository URL
   - Confirm all files are uploaded
   - The README.md should be displayed on the repository page

### Setting Up GitHub Pages (Optional)

To create a website for your project:

1. Go to your repository on GitHub
2. Click "Settings" → "Pages"
3. Under "Source", select "Deploy from a branch"
4. Choose "main" branch and "/ (root)" folder
5. Click "Save"
6. Your site will be available at `https://yourusername.github.io/vulnerability-scanner/`

### Adding a License

To add a license to your repository:

1. Go to your repository on GitHub
2. Click "Add file" → "Create new file"
3. Name the file `LICENSE`
4. Choose a license (MIT is recommended for open source)
5. Click "Review changes" and "Commit changes"

Or via command line:
```bash
# Create LICENSE file (a quoted heredoc keeps the inner double quotes intact)
cat > LICENSE <<'EOF'
MIT License

Copyright (c) 2026 [Your Name]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
EOF

git add LICENSE
git commit -m "Add MIT license"
git push
```

### Adding Topics/Tags

To help others discover your repository:

1. Go to your repository on GitHub
2. On the repository's main page, click the gear icon next to the "About" section
3. Add topics: `security`, `vulnerability-scanner`, `python`, `ai`, `static-analysis`, `sast`
4. Add a website URL if you have one
5. Click "Save changes"

### Creating Releases

To create a release version:

```bash
# Tag the current version (matches the package version, 2.0.0)
git tag -a v2.0.0 -m "Release v2.0.0 - AI-Powered Vulnerability Scanner"

# Push the tag
git push origin v2.0.0
```

Then on GitHub:
1. Go to "Releases" → "Create a new release"
2. Choose the tag `v2.0.0`
3. Add release notes
4. Click "Publish release"

## Contributing

Contributions are welcome! Areas for improvement:

- Additional vulnerability patterns
- Support for more languages
- Enhanced AI prompts
- Integration with other security tools
- False positive reduction

### How to Contribute

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## Quick Start Checklist

- [ ] Install Python 3.8+
- [ ] `pip install vuln-scanner-ai`
- [ ] Run a pattern-only scan: `vulnscan scan .`
- [ ] Set `OPENAI_API_KEY` env var (for AI features)
- [ ] Run an AI-powered scan: `vulnscan scan . --ai`
- [ ] Filter by severity: `vulnscan scan . --min-severity HIGH`
- [ ] Save a report: `vulnscan scan . --json --output report.json`
- [ ] Add to CI: copy the GitHub Action snippet above into `.github/workflows/security.yml`
