Metadata-Version: 2.4
Name: canopy-scanner
Version: 0.1.1
Summary: An OSINT tool for intelligence gathering and username enumeration
Author-email: Guy Volvoshin <guyvoloshin@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/guyvolvo/canopy
Project-URL: Repository, https://github.com/guyvolvo/canopy.git
Project-URL: Issues, https://github.com/guyvolvo/canopy/issues
Keywords: osint,intelligence,reconnaissance,username,enumeration
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Information Technology
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Classifier: Topic :: Security
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.32.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Dynamic: license-file

Canopy is a Python-based OSINT (Open-Source Intelligence) framework designed to help individuals discover, organize, and analyze publicly available information about their own digital footprint. \
Note: Canopy was developed and tested exclusively on my own publicly available data as a learning and portfolio project.

# Project Goals

- Understand how publicly available information is indexed and exposed online
- Practice structured OSINT methodology using search engines
- Correlate results from multiple sources into meaningful categories
- Demonstrate ethical boundaries and legal awareness in OSINT work

## Features

- Multi-platform username enumeration across social media, coding sites, gaming platforms, and more.
- Avoids false positives using fingerprint-based validation.
- Supports local caching of platform fingerprints to speed up scans.
- Multi-threaded, high-performance scanning with optional rate-limiting and delays.
- Generates reports in JSON, CSV, HTML, or TXT formats.
- Command-line interface for easy integration into scripts or automation workflows.
- Categorized results for better organization (e.g., social, professional, gaming).
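
As a rough illustration of the categorized results and report formats listed above, the sketch below groups result records by category and serializes them to JSON and CSV. It is not Canopy's actual report code: the record fields (`platform`, `category`, `url`, `found`), the sample data, and the function names are all assumptions made for this example.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical result records; the field names are assumptions for illustration.
results = [
    {"platform": "GitHub", "category": "coding",
     "url": "https://github.com/johndoe", "found": True},
    {"platform": "Reddit", "category": "social",
     "url": "https://www.reddit.com/user/johndoe", "found": False},
    {"platform": "Steam", "category": "gaming",
     "url": "https://steamcommunity.com/id/johndoe", "found": True},
]


def group_by_category(records, only_found=False):
    """Group records by category, optionally keeping only found accounts."""
    grouped = defaultdict(list)
    for record in records:
        if only_found and not record["found"]:
            continue
        grouped[record["category"]].append(record)
    return dict(grouped)


def to_json(records, only_found=False):
    """Serialize the grouped records as an indented JSON string."""
    return json.dumps(group_by_category(records, only_found), indent=2)


def to_csv(records):
    """Serialize the flat records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["platform", "category", "url", "found"]
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_json(results, only_found=True))
    print(to_csv(results))
```

Grouping before serializing keeps the JSON report organized by category, while the CSV stays flat so it can be opened directly in a spreadsheet.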

```bash
usage: canopy.py [-h] [-u USERNAME] [-U USERNAMES] [-t THREADS] [--timeout TIMEOUT] [--delay DELAY]
                 [--rate-limit RATE_LIMIT] [-c CATEGORIES] [-p PLATFORMS] [--exclude EXCLUDE] [--only-found]
                 [--list-categories] [-o OUTPUT] [-f {json,csv,html,txt}] [-v] [-q] [--print-found]

Canopy - Username Enumeration Tool

options:
  -h, --help            show this help message and exit

Target Options:
  -u, --username USERNAME
                        Username to search for
  -U, --usernames USERNAMES
                        File containing list of usernames (one per line)

Performance Options:
  -t, --threads THREADS
                        Number of concurrent threads (default: 10)
  --timeout TIMEOUT     Request timeout in seconds (default: 10)
  --delay DELAY         Delay between requests in seconds (default: 0)
  --rate-limit RATE_LIMIT
                        Max requests per second (default: unlimited)

Filtering Options:
  -c, --categories CATEGORIES
                        Comma-separated categories to check (e.g., social,gaming)
  -p, --platforms PLATFORMS
                        Comma-separated specific platforms to check
  --exclude EXCLUDE     Comma-separated platforms to exclude
  --only-found          Only show found accounts
  --list-categories     Show all available platform categories and exit

Output Options:
  -o, --output OUTPUT   Output file path
  -f, --format {json,csv,html,txt}
                        Output format: json, csv, html, txt (default: json)
  -v, --verbose         Verbose output
  -q, --quiet           Minimal output (only results)
  --print-found         Print found accounts in real-time

    Examples:
      canopy -u johndoe
      canopy -u johndoe -t 50 --timeout 15
      canopy -u johndoe -o report.json --format json
      canopy -u johndoe --categories social,gaming
      canopy --list-categories
```

## Installation
Clone the repository:

```bash
git clone https://github.com/guyvolvo/Canopy.git
cd Canopy
```

### _Legal & Ethical Disclaimer_

##### This framework is intended strictly for self-OSINT, educational use, or explicit consent-based research.
##### This tool should be used to analyze:
- Your own digital footprint
- Accounts, domains, and identifiers you own
- Targets for which you have explicit written permission

Using this tool against private individuals without consent may violate privacy laws and platform Terms of Service.
The author accepts no responsibility for misuse of this software.

_Canopy collects metadata only, such as:_
- Page titles
- URLs
- Search snippets
- Source domain

_It does not:_
- Bypass CAPTCHAs
- Scrape authenticated content
- Harvest private data
- Enumerate personal contact lists

#### Theoretical Project Structure (generated by ChatGPT and Claude for reference, so I can follow along and add or remove things as I see fit):

GPT workflow:

canopy/\
├── README.md\
├── DISCLAIMER.md\
├── methodology/\
│   └── osint_methodology.md\
├── canopy/\
│   ├── query_generator.py\
│   ├── collector.py\
│   ├── parser.py\
│   └── correlator.py\
├── output/\
│   └── sample_report.md\
└── lessons_learned.md

Claude workflow:

Canopy/\
├── main.py                    # Entry point, CLI interface\
├── platforms.json             # Platform database\
├── query_generator.py         # Generate queries from usernames\
├── username_checker.py        # Check if username exists on platforms\
├── data_collector.py          # Collect and aggregate data\
├── report_generator.py        # Format and export results\
├── config.py                  # Configuration settings\
├── utils.py                   # Helper functions\
└── requirements.txt           # Dependencies

_platforms.json inspired by the Sherlock OSINT project :)_

## Methodology

Canopy uses a structured OSINT approach:

- Generate a list of platforms to query (social, professional, gaming).
- Create URL patterns for a target username.
- Validate account existence using HTTP responses, redirects, error messages, and HTML fingerprints.
- Aggregate results into structured reports.
- Optionally store fingerprints locally to avoid redundant requests.

This approach is intended to keep accuracy high while reducing false positives.
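
The URL-pattern and validation steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Canopy's actual implementation: the `Response` stand-in, the `{username}` placeholder syntax, the `not_found_markers` list, and both function names are hypothetical, and the redirect check is deliberately strict for simplicity.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class Response:
    """Minimal stand-in for an HTTP response (hypothetical, for illustration)."""
    status_code: int
    final_url: str   # URL after any redirects were followed
    body: str        # response HTML as text


def build_url(pattern: str, username: str) -> str:
    """Fill a URL pattern such as 'https://example.com/{username}'."""
    # Percent-encode the username so spaces or slashes cannot alter the path.
    return pattern.format(username=quote(username, safe=""))


def account_exists(response: Response, expected_url: str,
                   not_found_markers: list) -> bool:
    """Combine status, redirect, and body checks to reduce false positives."""
    # 1. A non-200 status is treated as "no account".
    if response.status_code != 200:
        return False
    # 2. Being redirected away from the profile URL (for example to a login
    #    or home page) is treated as "no account". This strict comparison is
    #    a simplification; real platforms may need normalization.
    if response.final_url.rstrip("/") != expected_url.rstrip("/"):
        return False
    # 3. Some platforms answer 200 with a "not found" page, so the body is
    #    checked against known error fingerprints.
    body = response.body.lower()
    return not any(marker.lower() in body for marker in not_found_markers)


if __name__ == "__main__":
    url = build_url("https://example.com/{username}", "johndoe")
    soft_404 = Response(200, url, "<h1>Page Not Found</h1>")
    print(account_exists(soft_404, url, ["page not found"]))
```

Checking the body last matters because a platform that returns status 200 for a missing profile would otherwise be reported as a hit.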

## Best Practices

- Only scan accounts you own or have explicit permission to analyze.
- Use `--threads` and `--rate-limit` responsibly to avoid being blocked by platforms.
- Review your JSON/CSV/HTML reports for patterns before taking any action.
- Update `platforms.json` regularly to include new platforms.
- Periodically refresh fingerprints for platforms that change their 404 pages.
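
To make the interaction between threads and a requests-per-second cap more concrete, here is one possible way to share a rate limit across worker threads. It is a sketch, not how Canopy implements `--rate-limit`: the `RateLimiter` class and its `clock` and `sleep` parameters are hypothetical names introduced for this example.

```python
import threading
import time


class RateLimiter:
    """Spaces calls so that at most `rate` of them start per second."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / rate      # minimum gap between two calls
        self._clock = clock              # injectable for testing
        self._sleep = sleep              # injectable for testing
        self._lock = threading.Lock()
        self._next_slot = clock()        # earliest time the next call may start

    def wait(self):
        """Block until the caller's reserved time slot arrives."""
        # Reserve a slot while holding the lock, then sleep outside it so
        # other threads can reserve their own slots in the meantime.
        with self._lock:
            now = self._clock()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            self._sleep(delay)


if __name__ == "__main__":
    limiter = RateLimiter(rate=5)        # roughly five calls per second
    start = time.monotonic()
    for i in range(5):
        limiter.wait()
        print(f"request {i} at {time.monotonic() - start:.2f}s")
```

Each worker thread would call `limiter.wait()` just before sending a request, so raising the thread count increases concurrency without exceeding the overall request rate.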
