Metadata-Version: 2.4
Name: biosuitestkod
Version: 3.0.3
Summary: BioSuite Ultra - Comprehensive open-source bioinformatics platform with 48 analysis modules
Author: Sahand Touri
License: MIT
Project-URL: Homepage, https://github.com/sahandtouri/BioSuite-Better
Keywords: bioinformatics,genomics,proteomics,transcriptomics,sequence-analysis,ngs,machine-learning
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Requires-Dist: matplotlib>=3.7
Requires-Dist: seaborn>=0.12
Requires-Dist: scipy>=1.10
Requires-Dist: scikit-learn>=1.3
Requires-Dist: customtkinter>=5.2
Requires-Dist: tqdm>=4.65
Requires-Dist: biopython>=1.81
Requires-Dist: goatools>=1.3
Requires-Dist: gseapy>=1.0
Requires-Dist: cutadapt>=4.0
Requires-Dist: scanpy>=1.9
Requires-Dist: anndata>=0.9
Requires-Dist: scikit-bio>=0.5
Requires-Dist: biotite>=0.37
Requires-Dist: networkx>=3.0
Requires-Dist: plotly>=5.0
Requires-Dist: ete3>=3.1
Requires-Dist: cobra>=3.0
Requires-Dist: shap>=0.42
Requires-Dist: statsmodels>=0.14
Requires-Dist: umap-learn>=0.5
Provides-Extra: full
Requires-Dist: pysam>=0.21; extra == "full"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Dynamic: license-file

# BioSuite Ultra

![Python](https://img.shields.io/badge/Python-3.9+-blue)
![License](https://img.shields.io/badge/License-MIT-green)
![Tests](https://img.shields.io/badge/Tests-271_passing-brightgreen)
![Modules](https://img.shields.io/badge/Modules-48-orange)
![Lines](https://img.shields.io/badge/Lines-21000+-yellow)

**A comprehensive open-source bioinformatics platform.**

BioSuite Ultra is a full-stack bioinformatics platform with 48 analysis modules, 36+ visualization types, a cyberpunk-themed GUI, and a CLI with 99+ options, all written in Python. No external binaries are required; tools such as BLAST+ and Clustal Omega are optional and only provide speed boosts.

---

## Features

### 48 Analysis Modules

| Domain | Modules | Coverage |
|--------|---------|----------|
| Sequence Analysis | FASTA/FASTQ I/O, GC%, translation, reverse complement, ORF finder, primer design, restriction enzymes, codon usage | 80% |
| Alignment | Needleman-Wunsch, Smith-Waterman, BLAST (k-mer), MSA (progressive + Clustal) | 70% |
| Phylogenetics | p-distance, UPGMA, NJ, ML (RAxML), Bayesian (MrBayes) | 85% |
| Transcriptomics | CPM/TPM normalization, differential expression, GO/KEGG enrichment | 60% |
| NGS/Genomics | BAM/VCF parsing, read alignment (BWA/Bowtie2), variant calling, SV/CNV detection | 65% |
| Single-Cell | Scanpy-based scRNA-seq pipeline | 80% |
| Proteins | PDB analysis, ESMFold structure prediction, molecular docking | 50% |
| Epigenomics | Bisulfite methylation, DMR detection | 40% |
| Metagenomics | K-mer classifier, 16S rRNA pipeline, alpha/beta diversity | 65% |
| Metabolomics | Peak detection, ANOVA, feature alignment | 50% |
| Population Genetics | HWE, FST, Tajima's D, LD, PCA | 70% |
| CRISPR | Guide RNA design, PAM finding, off-target scoring | 70% |
| Metabolism | Flux balance analysis (FBA), knockout simulation | 55% |
| Machine Learning | Random Forest, SVM, SHAP, cross-validation | 50% |
| Workflow | Pipeline builder, batch processor, HTML report generator | 80% |
| GO/Pathways | GO browser, pathway visualization (KEGG-style maps) | 60% |
| GWAS | Chi-squared test, Manhattan/QQ plots, lead SNP detection | 70% |
| Epitope Prediction | T-cell (MHC binding), B-cell (surface propensity), linear epitopes | 70% |

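To make one row of the table concrete, here is a minimal sketch of CPM and TPM normalization using their standard definitions. It is an illustration only, not BioSuite's implementation: the function names `cpm` and `tpm` and the toy counts are hypothetical.

```python
def cpm(counts):
    """Counts per million: scale each gene's count by the library size."""
    total = sum(counts)
    if total == 0:
        raise ValueError("library has no reads")
    return [c / total * 1_000_000 for c in counts]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths must have the same size")
    # Reads per kilobase of transcript
    rpk = [c / (length / 1000) for c, length in zip(counts, lengths_bp)]
    scale = sum(rpk)
    if scale == 0:
        raise ValueError("library has no reads")
    return [r / scale * 1_000_000 for r in rpk]


# Hypothetical toy data: three genes in one sample
counts = [100, 300, 600]
lengths_bp = [1000, 2000, 3000]
print(cpm(counts))
print(tpm(counts, lengths_bp))
```

TPM values always sum to one million within a sample, which is why TPM is easier to compare across samples than length-unaware CPM.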
### 36+ Visualization Types

Volcano, PCA, Manhattan, MA, Venn, Barplot, Boxplot, Heatmap, Scatter, Time Series, QQ-plot, Clustered Heatmap, Circos, Alignment Viewer, Violin, Raincloud, Ridge, Dot Plot, GSEA, Motif Logo, Sankey, UMAP, Network (PPI/Regulatory/Metabolic), UpSet, Genome Browser, Interactive (Plotly), Sequence Logo, Conservation, Synteny Dotplot, and more.

### Dual-Mode Architecture

Every module follows a consistent pattern:
```python
def analyze(data, **kwargs):
    # Try the external tool first (fast)
    if _has_external_tool():
        return _run_external(data, **kwargs)
    # Fall back to pure Python (always works)
    return _run_builtin(data, **kwargs)
```

### Cyberpunk GUI

- 29 analysis tabs with scrollable sidebar
- 3 themes: Dark-Green-Cyber, Dark-Purple-Cyber, Light-Blue-Cyber
- Keyboard shortcuts (Ctrl+S, Ctrl+Q, F1, F5, Escape)
- Progress bars for long operations
- Plot history (last 10 plots)
- API key configuration panel
- 15 built-in help guides

### CLI with 99+ Options

Professional CLI menu with organized sections for every analysis type.

---

## Installation

```bash
# Clone the repository
git clone https://github.com/sahandtouri/BioSuite-Better.git
cd BioSuite-Better

# Install dependencies
pip install -r requirements.txt

# Or install the packages individually
pip install numpy pandas matplotlib seaborn scipy scikit-learn
pip install biopython customtkinter tqdm goatools gseapy cutadapt
pip install scanpy anndata pysam scikit-bio biotite networkx
pip install plotly ete3 cobra shap statsmodels umap-learn
```

## Quick Start

### CLI Mode
```bash
python run.py
```

### GUI Mode
```bash
python run.py --gui
```

### Programmatic API
```python
from bioplatter.core.sequence import gc_content, reverse_complement, translate
from bioplatter.core.alignment import needleman_wunsch, smith_waterman
from bioplatter.core.phylogeny import distance_matrix, upgma_tree
from bioplatter.core.workflow.pipeline import Pipeline
from bioplatter.plotting.upset_plots import plot_upset

# Quick analysis
gc = gc_content("ATCGATCG")  # 50.0
rc = reverse_complement("ATCG")  # "CGAT"
protein = translate("ATGAAATTTTAA")  # "MKF"

# Pipeline
p = Pipeline("my_analysis")
p.add_step("gc", gc_content, args=("ATCGATCG",))
p.add_step("revcomp", reverse_complement, args=("ATCGATCG",))
p.run()
print(p.results)
```
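For readers curious how such a pipeline object can work, here is a minimal stand-in that mirrors the call shape used above (`add_step`, `run`, `results`). It is a sketch under assumptions: `MiniPipeline` and the two helper functions are hypothetical, and storing results in a dict keyed by step name is a guess at the behavior, not the documented one.

```python
class MiniPipeline:
    """Hypothetical stand-in that mirrors the call shape shown above."""

    def __init__(self, name):
        self.name = name
        self.steps = []    # (step_name, func, args, kwargs) in insertion order
        self.results = {}  # step_name -> return value

    def add_step(self, step_name, func, args=(), kwargs=None):
        self.steps.append((step_name, func, tuple(args), dict(kwargs or {})))
        return self  # allow chaining

    def run(self):
        # Execute steps in the order they were added
        for step_name, func, args, kwargs in self.steps:
            self.results[step_name] = func(*args, **kwargs)
        return self.results


def gc_content(seq):
    """GC percentage of a DNA sequence (stand-in for the library function)."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (stand-in)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


p = MiniPipeline("my_analysis")
p.add_step("gc", gc_content, args=("ATCGATCG",))
p.add_step("revcomp", reverse_complement, args=("ATCGATCG",))
p.run()
print(p.results)  # {'gc': 50.0, 'revcomp': 'CGATCGAT'}
```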

---

## Project Structure

```
BioSuite-Better/
├── run.py                    # Entry point
├── bioplatter/
│   ├── core/                 # 48 analysis modules
│   │   ├── sequence.py       # Sequence I/O & analysis
│   │   ├── alignment.py      # Pairwise alignment (vectorized)
│   │   ├── blast.py          # Sequence search
│   │   ├── msa.py            # Multiple sequence alignment
│   │   ├── phylogeny.py      # Distance-based trees
│   │   ├── ml_phylogeny.py   # ML trees (NJ + RAxML)
│   │   ├── bayesian_phylogeny.py  # Bayesian trees
│   │   ├── expression.py     # Differential expression
│   │   ├── enrichment.py     # GO/KEGG enrichment
│   │   ├── single_cell.py    # scRNA-seq (scanpy)
│   │   ├── ngs.py            # BAM/VCF utilities
│   │   ├── read_aligner.py   # Read mapping
│   │   ├── variant_calling.py # Variant detection + SV/CNV
│   │   ├── peak_calling.py   # ChIP-seq peaks
│   │   ├── assembly.py       # Genome assembly
│   │   ├── metagenomics.py   # Taxonomic classification + 16S
│   │   ├── trimming.py       # Read QC
│   │   ├── quantification.py # RNA-seq quantification
│   │   ├── structure.py      # PDB analysis
│   │   ├── structure_prediction.py  # Protein structure
│   │   ├── docking.py        # Molecular docking
│   │   ├── crispr.py         # Guide RNA design
│   │   ├── metabolism.py     # Flux balance analysis
│   │   ├── popgen.py         # Population genetics
│   │   ├── epigenomics.py    # Methylation analysis
│   │   ├── metabolomics.py   # Mass spec analysis
│   │   ├── md_simulation.py  # Molecular dynamics
│   │   ├── bio_ml.py         # Machine learning
│   │   ├── orf_finder.py     # ORF, restriction enzymes, primers
│   │   ├── codon_usage.py    # Codon tables, k-mer, complexity
│   │   ├── survival.py       # Kaplan-Meier, Cox PH
│   │   ├── file_formats.py   # BED/GFF/Newick/Stockholm/BigWig
│   │   ├── databases.py      # NCBI/UniProt/PDB/KEGG/Ensembl
│   │   ├── go_browser.py     # Gene Ontology browser
│   │   ├── pathway_viz.py    # Pathway visualization
│   │   ├── gwas.py           # GWAS analysis
│   │   ├── epitope.py        # Epitope prediction
│   │   └── workflow/         # Pipeline, batch, report
│   ├── plotting/             # 36+ visualization types
│   ├── gui/                  # Cyberpunk GUI (29 tabs)
│   ├── cli/                  # CLI with 99+ options
│   └── tests/                # 271 automated tests
├── examples/                 # Jupyter notebooks
└── requirements.txt
```

---

## Testing

```bash
# Run all 271 tests (from the repository root)
python -m pytest bioplatter/tests/ -v

# Run a specific test file
python -m pytest bioplatter/tests/test_phase4.py -v

# Run with short tracebacks
python -m pytest bioplatter/tests/ -v --tb=short
```

---

## Requirements

All dependencies are pip-installable:
```
numpy, pandas, matplotlib, seaborn, scipy, scikit-learn
biopython, customtkinter, tqdm, goatools, gseapy, cutadapt
scanpy, anndata, pysam, scikit-bio, biotite, networkx
plotly, ete3, cobra, shap, statsmodels, umap-learn
```

---

## Platform

- **OS:** Windows, macOS, Linux
- **Python:** 3.9+
- **GPU:** Not required (CPU-only)
- **RAM:** 4GB minimum, 8GB recommended
- **External tools:** Optional (BLAST+, Clustal Omega, etc. provide speed boosts)

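To see which optional tools are visible on a given machine, a short check like the one below may help. It is a sketch, not part of BioSuite: the executable names are the ones these projects usually install (`blastn`, `clustalo`, `bwa`, `bowtie2`), and whether BioSuite looks for exactly these names is an assumption.

```python
import shutil

# Usual executable names for each project; adjust to match your install
TOOLS = {
    "BLAST+": "blastn",
    "Clustal Omega": "clustalo",
    "BWA": "bwa",
    "Bowtie2": "bowtie2",
}


def find_tools(tools=TOOLS):
    """Map each tool label to its resolved path, or None if not on PATH."""
    return {label: shutil.which(exe) for label, exe in tools.items()}


for label, path in find_tools().items():
    print(f"{label:14s} {path or 'not found'}")
```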
---

## Author

**Sahand Touri**
Molecular Cell Biology Student, Urmia IAU, Iran

Built as a comprehensive bioinformatics portfolio project demonstrating:
- Full-stack software engineering (48 modules, 21,000+ lines, 271 tests)
- Domain expertise across 20+ bioinformatics areas
- Dual-mode architecture (pure Python + optional external tools)
- Professional GUI and CLI design
- Automated testing and quality assurance

---

## License

MIT License - Free for academic and commercial use.
