Metadata-Version: 2.4
Name: diskovery
Version: 0.1.1
Summary: DISKOVERY: Disk Forensics Tool for Data Categorization & Keyword Filtering
Home-page: https://github.com/simmithapad/DISKOVERY
Author: Simmi Thapad, Vrinda Abrol
License: MIT
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: acres==0.3.0
Requires-Dist: certifi==2025.1.31
Requires-Dist: chardet==5.2.0
Requires-Dist: charset-normalizer==3.4.1
Requires-Dist: ci-info==0.3.0
Requires-Dist: click==8.1.8
Requires-Dist: configobj==5.0.9
Requires-Dist: configparser==7.2.0
Requires-Dist: contourpy==1.3.2
Requires-Dist: cycler==0.12.1
Requires-Dist: defusedxml==0.7.1
Requires-Dist: docx==0.2.4
Requires-Dist: docx2txt==0.9
Requires-Dist: etelemetry==0.3.1
Requires-Dist: filelock==3.18.0
Requires-Dist: fls==0.0.1.241224
Requires-Dist: fonttools==4.57.0
Requires-Dist: fpdf==1.7.2
Requires-Dist: httplib2==0.22.0
Requires-Dist: idna==3.10
Requires-Dist: isodate==0.6.1
Requires-Dist: kiwisolver==1.4.8
Requires-Dist: libewf-python==20240506
Requires-Dist: looseversion==1.3.0
Requires-Dist: lxml==5.3.2
Requires-Dist: matplotlib==3.10.1
Requires-Dist: networkx==3.4.2
Requires-Dist: nibabel==5.3.2
Requires-Dist: nipype==1.10.0
Requires-Dist: numpy==2.2.4
Requires-Dist: packaging==24.2
Requires-Dist: pandas==2.2.3
Requires-Dist: pathlib==1.0.1
Requires-Dist: pillow==11.1.0
Requires-Dist: prov==2.0.1
Requires-Dist: puremagic==1.28
Requires-Dist: pydot==3.0.4
Requires-Dist: PyMuPDF==1.25.5
Requires-Dist: pyparsing==3.2.3
Requires-Dist: python-dateutil==2.9.0.post0
Requires-Dist: python-docx==1.1.2
Requires-Dist: pytsk3==20250312
Requires-Dist: pytz==2025.2
Requires-Dist: pyxnat==1.6.3
Requires-Dist: rdflib==6.3.2
Requires-Dist: reportlab==4.3.1
Requires-Dist: requests==2.32.3
Requires-Dist: scipy==1.15.2
Requires-Dist: setuptools==79.0.0
Requires-Dist: simplejson==3.20.1
Requires-Dist: six==1.17.0
Requires-Dist: tqdm==4.67.1
Requires-Dist: traits==7.0.2
Requires-Dist: typing_extensions==4.13.1
Requires-Dist: tzdata==2025.2
Requires-Dist: urllib3==2.3.0
Dynamic: author
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

## 🧪 DISKOVERY: Disk Forensics Tool for Data Categorization & Keyword Filtering

**DISKOVERY** is a Python-based digital forensics tool for analyzing disk images. It runs a multi-stage forensic analysis: imaging, partition parsing, file categorization, keyword-based filtering, and automatic PDF reporting. The tool supports both complete and filtered analysis outputs and gives investigators a concise overview of disk contents. It is a command-line interface (CLI) tool that works well on **Ubuntu** and other **Debian-based systems**.

---

### ⚙️ Features
- **Disk Image Support** (`.img`, `.E01`, `.dd`)
- **Partition Parsing** using `mmls`
- **File Categorization**:
  - Deleted
  - Encrypted
  - Current
  - Hidden
- **File Type Filtering** (e.g., `.pdf`, `.docx`)
- **Keyword Search** in extracted text-based files
- **Visual Summary** via pie charts
- **PDF Report Generation** with listings and visualizations
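
To illustrate the partition-parsing feature, here is a minimal sketch of how the tabular output of `mmls` could be turned into Python objects. It is not taken from the DISKOVERY stages: the `Partition` type, the `parse_mmls` function, and the sample text are assumptions made for illustration, and the sample follows the usual `mmls` column layout for a DOS partition table.

```python
import re
from typing import List, NamedTuple


class Partition(NamedTuple):
    """One row of the mmls table (hypothetical helper type)."""
    slot: str
    start: int        # first sector
    end: int          # last sector
    length: int       # size in sectors
    description: str


# Matches rows such as:
# 002:  000:000   0000002048   0001026047   0001024000   NTFS / exFAT (0x07)
ROW = re.compile(r"^\s*\d+:\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.*\S)\s*$")


def parse_mmls(output: str) -> List[Partition]:
    """Collect every table row from mmls text output, skipping header lines."""
    partitions = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            slot, start, end, length, description = match.groups()
            partitions.append(
                Partition(slot, int(start), int(end), int(length), description)
            )
    return partitions


# Illustrative sample, not output from a real case image.
SAMPLE = """\
DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors

      Slot      Start        End          Length       Description
000:  Meta      0000000000   0000000000   0000000001   Primary Table (#0)
001:  -------   0000000000   0000002047   0000002048   Unallocated
002:  000:000   0000002048   0001026047   0001024000   NTFS / exFAT (0x07)
"""

parts = parse_mmls(SAMPLE)
for part in parts:
    print(part.slot, part.start, part.length, part.description)
```

The start sector of a row, multiplied by the sector size reported in the `mmls` header, gives the byte offset of that partition inside the image.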

---

### Steps to use
1. Insert the pendrive (USB drive).
2. Check which device it was assigned: `sudo fdisk -l`
3. Go to the script folder and run `main.py`: `sudo python3 main.py`

---

### 📁 Project Structure
```
DISKOVERY/
├── stages/
│   ├── __init__.py
│   ├── stage1_disk_imaging.py
│   ├── stage2_extraction.py
│   ├── stage3_categorization.py
│   ├── stage4_filtering.py
│   ├── stage4_2_keyword.py
│   └── stage5_reporting.py
├── utils/
│   ├── __init__.py
│   └── run_command.py
├── main.py
├── LICENSE
├── README.md
├── requirements.txt
├── setup.py
├── MANIFEST.in
└── pyproject.toml
```

---

### 🚀 Quick Start

#### 1. Clone the Repository
```bash
git clone https://github.com/simmithapad/DISKOVERY.git
cd DISKOVERY
```

#### 2. Run Setup (Creates a Virtual Environment + Installs Python Packages)
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

#### 3. Start the Tool
```bash
python3 main.py
```

---

### 🛠️ Dependencies
#### System Tools (Installed via `setup.sh`)
- `dcfldd`
- `sleuthkit` (for `mmls`, `fls`, `fsstat`)
- `binwalk`
- `grep` and `pdfgrep`
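
Before starting an analysis, it can help to confirm that the system tools listed above are actually on `PATH`. The following is a minimal sketch using only the standard library. The names `REQUIRED_TOOLS`, `locate_tools`, and `missing_tools` are hypothetical, and `sleuthkit` is represented by the three executables named in the list.

```python
import shutil
from typing import Dict, Iterable, List, Optional

# Executables from the System Tools list; sleuthkit appears as mmls, fls, fsstat.
REQUIRED_TOOLS = ("dcfldd", "mmls", "fls", "fsstat", "binwalk", "grep", "pdfgrep")


def locate_tools(names: Iterable[str] = REQUIRED_TOOLS) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tool names that could not be found on PATH."""
    return [name for name, path in locate_tools(names).items() if path is None]


missing = missing_tools()
if missing:
    print("Missing system tools:", ", ".join(missing))
else:
    print("All required system tools were found.")
```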

#### Python Packages
- `fpdf`
- `elasticsearch`
- `docx2txt`
- `re` (part of the Python standard library, no installation needed)

---

### 📄 Output
- Disk images saved in `./output_files/`
- PDF reports saved in `./output_files/reports/`
- Extracted files saved in `./output_files/extracted_files/`
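
As a rough illustration of keyword filtering over the extracted files, the sketch below walks a directory and reports which keywords occur in each plain-text file. It is an assumption-laden example, not the tool's own stage code: `search_keywords` and its `suffixes` default are hypothetical, matching is case-insensitive, and formats such as `.pdf` or `.docx` would first need text extraction (for example with `pdfgrep` or `docx2txt`).

```python
import re
from pathlib import Path
from typing import Dict, Iterable, List


def search_keywords(
    root: Path,
    keywords: Iterable[str],
    suffixes: Iterable[str] = (".txt", ".csv", ".log"),
) -> Dict[str, List[str]]:
    """Return {file path: keywords found in that file} for text files under root."""
    keywords = list(keywords)
    allowed = {suffix.lower() for suffix in suffixes}
    hits: Dict[str, List[str]] = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in allowed:
            continue
        # Undecodable bytes are dropped so one bad file does not abort the scan.
        text = path.read_text(encoding="utf-8", errors="ignore")
        found = [
            keyword
            for keyword in keywords
            if re.search(re.escape(keyword), text, re.IGNORECASE)
        ]
        if found:
            hits[str(path)] = found
    return hits
```

A call such as `search_keywords(Path("output_files/extracted_files"), ["invoice", "password"])` would then return only the files that contain at least one of the keywords.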

---

### 📬 Future Work
- [ ] GPU Acceleration
- [ ] Memory Forensics Integration

---

### 👤 Authors
Simmi Thapad   
Vrinda Abrol

---

### License
This project is licensed under the MIT License - see the LICENSE file for details.

---

### 🔒 Disclaimer
> [!Important]
> This tool is intended for **educational and lawful forensic analysis** only. Use responsibly.
