Metadata-Version: 2.4
Name: seq-miner
Version: 1.3.1
Summary: A command-line tool to extract and filter sequence reads from BAM and FASTQ files by ID, quality score, and length.
Author: Aeiwz
Author-email: Your Name <theerayut_aeiw_123@hotmail.com>
License: MIT
Project-URL: Homepage, https://github.com/aeiwz/seq-miner
Project-URL: Documentation, https://github.com/aeiwz/seq-miner#readme
Project-URL: Source, https://github.com/aeiwz/seq-miner
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pysam
Requires-Dist: biopython
Dynamic: author
Dynamic: license-file
Dynamic: requires-python

# seq-miner

**seq-miner** is a lightweight command-line tool that extracts and filters reads from **BAM** or **FASTQ** files. Reads can be selected by:

- Specific read IDs
- Mean quality score threshold
- Minimum read length

It also offers:

- Multi-threaded processing (FASTQ only)
- JSON/CSV summary export (coming soon)
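
To make the read ID, quality, and length criteria above concrete, here is a minimal Python sketch of how such a filter could be expressed. It is not taken from the seq-miner source: the function names (`load_read_ids`, `mean_qscore`, `keep_read`) are hypothetical, and it assumes Phred+33 quality encoding and a plain arithmetic mean of the per-base scores, which may differ from how seq-miner computes its mean Q-score.

```python
PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding


def load_read_ids(lines):
    """Collect read IDs from an iterable of lines (one ID per line, blanks ignored)."""
    return {line.strip() for line in lines if line.strip()}


def mean_qscore(quality):
    """Arithmetic mean of the Phred scores in a FASTQ quality string."""
    if not quality:
        return 0.0
    return sum(ord(ch) - PHRED_OFFSET for ch in quality) / len(quality)


def keep_read(read_id, sequence, quality, wanted_ids=None,
              min_qscore=0.0, min_length=0):
    """Return True if the read satisfies every active criterion."""
    if wanted_ids is not None and read_id not in wanted_ids:
        return False  # not in the requested ID list
    if len(sequence) < min_length:
        return False  # too short
    return mean_qscore(quality) >= min_qscore


wanted = load_read_ids(["read_1\n", "\n", "read_2\n"])
print(keep_read("read_1", "ACGT", "IIII", wanted, min_qscore=15, min_length=4))
```

With the default thresholds (`0.0` and `0`) and no ID list, every read is kept, which mirrors the defaults documented for `--min-qscore` and `--min-length`.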


## Installation

```bash
pip install seq-miner
```

Or clone from source:

```bash
git clone https://github.com/aeiwz/seq-miner.git
cd seq-miner
pip install .
```


## Usage

### Extract reads from BAM

```bash
seq-miner -i reads.bam -o filtered.bam -f bam -r read_ids.txt --min-qscore 10 --min-length 200
```

### Filter FASTQ reads in parallel

```bash
seq-miner -i reads.fastq -o filtered.fastq -f fastq --min-qscore 15 --min-length 1000 --threads 4
```
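
The following sketch illustrates one way batched, multi-threaded FASTQ filtering of this kind could be organised. It is an illustration under stated assumptions, not seq-miner's implementation: all function names and the `batch_size` parameter are hypothetical, quality strings are assumed to be Phred+33, records are assumed to span exactly four lines, and a read that is both short and low-quality is counted as short.

```python
import io
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

PHRED_OFFSET = 33  # assumption: Phred+33 quality encoding


def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # skip blank lines between or after records
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().strip()
        yield header.strip(), sequence, quality


def classify(record, min_qscore, min_length):
    """Label a record 'short', 'low_quality', or 'passed' (length is checked first)."""
    _, sequence, quality = record
    if len(sequence) < min_length:
        return "short"
    scores = [ord(ch) - PHRED_OFFSET for ch in quality]
    mean_q = sum(scores) / len(scores) if scores else 0.0
    return "passed" if mean_q >= min_qscore else "low_quality"


def batches(records, size):
    """Group an iterator of records into lists of at most `size` items."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def filter_batch(batch, min_qscore, min_length):
    """Filter one batch and return (kept records, per-category counts)."""
    kept, counts = [], Counter()
    for record in batch:
        label = classify(record, min_qscore, min_length)
        counts[label] += 1
        if label == "passed":
            kept.append(record)
    return kept, counts


def filter_fastq(handle, min_qscore=0.0, min_length=0, threads=1, batch_size=1000):
    """Filter all records using a thread pool; input order is preserved."""
    kept, totals = [], Counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = pool.map(
            lambda batch: filter_batch(batch, min_qscore, min_length),
            batches(read_fastq(handle), batch_size),
        )
        for batch_kept, batch_counts in results:
            kept.extend(batch_kept)
            totals.update(batch_counts)
    return kept, totals


fastq = io.StringIO(
    "@r1\nACGTACGT\n+\nIIIIIIII\n"
    "@r2\nACG\n+\nIII\n"
    "@r3\nACGTACGT\n+\n########\n"
)
kept, totals = filter_fastq(fastq, min_qscore=10, min_length=5, threads=2, batch_size=2)
print([header for header, _, _ in kept], dict(totals))
```

Note that in CPython, threads give little speed-up for CPU-bound work like this because of the global interpreter lock; a process pool is a common alternative. Which approach seq-miner takes behind `--threads` is not stated here.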

### Show version

```bash
seq-miner --version
```


## Command-line Options

| Option             | Description                                 |
|--------------------|---------------------------------------------|
| `-i`, `--input`    | Input BAM or FASTQ file                     |
| `-o`, `--output`   | Output file for passed reads                |
| `-f`, `--format`   | File format: `bam` or `fastq`               |
| `-r`, `--read-ids` | Optional file with read IDs (one per line)  |
| `--min-qscore`     | Minimum mean Q-score (default: `0.0`)       |
| `--min-length`     | Minimum read length (default: `0`)          |
| `--threads`        | Number of CPU threads (only used for FASTQ) |
| `--verbose`        | Enable verbose logging                      |
| `--version`        | Print the current version                   |
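
For readers who want to wrap or script the tool, the options in the table could be mirrored with `argparse` roughly as follows. This is a sketch reconstructed from the table, not the actual seq-miner source: which options are required, the default for `--threads`, and the exact text printed by `--version` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented seq-miner options."""
    parser = argparse.ArgumentParser(
        prog="seq-miner",
        description="Extract and filter reads from BAM or FASTQ files.",
    )
    # Assumption: input, output, and format are mandatory.
    parser.add_argument("-i", "--input", required=True,
                        help="Input BAM or FASTQ file")
    parser.add_argument("-o", "--output", required=True,
                        help="Output file for passed reads")
    parser.add_argument("-f", "--format", required=True, choices=["bam", "fastq"],
                        help="File format: bam or fastq")
    parser.add_argument("-r", "--read-ids", default=None,
                        help="Optional file with read IDs (one per line)")
    parser.add_argument("--min-qscore", type=float, default=0.0,
                        help="Minimum mean Q-score")
    parser.add_argument("--min-length", type=int, default=0,
                        help="Minimum read length")
    # Assumption: a single thread unless told otherwise.
    parser.add_argument("--threads", type=int, default=1,
                        help="Number of CPU threads (only used for FASTQ)")
    parser.add_argument("--verbose", action="store_true",
                        help="Enable verbose logging")
    # Assumption: version string format; 1.3.1 is the packaged version.
    parser.add_argument("--version", action="version", version="seq-miner 1.3.1")
    return parser


args = build_parser().parse_args(
    ["-i", "reads.fastq", "-o", "filtered.fastq", "-f", "fastq",
     "--min-qscore", "15", "--min-length", "1000", "--threads", "4"]
)
print(args.format, args.min_qscore, args.min_length, args.threads)
```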


## Output Summary

When a run finishes, seq-miner prints a summary:

```
Summary:
Passed reads     : 12345
Low-quality reads : 54
Short reads      : 91
```

Export of this summary to JSON or CSV is planned (coming soon).
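
Until built-in export is available, the printed summary can be converted by hand. The sketch below parses text in the format shown above and emits JSON and CSV. The helper names and the CSV column names (`category`, `reads`) are hypothetical, and the parser assumes each count appears on its own `label : number` line.

```python
import csv
import io
import json

SUMMARY = """Summary:
Passed reads     : 12345
Low-quality reads : 54
Short reads      : 91
"""


def parse_summary(text):
    """Turn 'label : count' lines into a dict, skipping lines without a count."""
    counts = {}
    for line in text.splitlines():
        label, separator, value = line.partition(":")
        value = value.strip()
        if separator and value.isdigit():
            counts[label.strip()] = int(value)
    return counts


def to_csv(counts):
    """Render the counts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["category", "reads"])
    writer.writerows(counts.items())
    return buffer.getvalue()


counts = parse_summary(SUMMARY)
print(json.dumps(counts, indent=2))
print(to_csv(counts))
```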


## Automatic Versioning and Release

- The version is stored in [`seqminer/__version__.py`](seqminer/__version__.py).
- GitHub Actions tags the version automatically on every push to `main`.
- The package is published to PyPI when a GitHub release is created.

## License

MIT License © Theerayut  
See [LICENSE](LICENSE) for full text.


## Contact

To report a bug or ask a question, open an issue on [GitHub](https://github.com/aeiwz/seq-miner/issues).
