Metadata-Version: 2.2
Name: ataserinyelMSA
Version: 0.1.0
Summary: A simple MAFFT-based Multiple Sequence Alignment (MSA) library
Author-email: ataserinyel <clasher.mp2@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/ataserinyel/ataserinyelMSA
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.21.0

# ataserinyelMSA

A simple MAFFT-inspired Multiple Sequence Alignment (MSA) tool written in Python.

## Installation

```bash
pip install ataserinyelMSA
```

## Usage

```bash
python main.py input.fasta output.fasta
```

## Example

Input (`input.fasta`):
```
>seq1
GATTACA
>seq2
GCATGCU
>seq3
AGCTAGC
```

Output (`output.fasta`):
```
>seq1
-GAT-TACA
>seq2
-GC-ATGCU
>seq3
AGCTA-GC-
```

## Algorithm

This tool implements a simplified progressive alignment modeled on the MAFFT FFT-NS-1 strategy:

1. **FASTA Parsing** - Read and write FASTA format files
2. **Pairwise Alignment** - Needleman-Wunsch global alignment algorithm
3. **Distance Matrix** - Compute pairwise distances between sequences
4. **Guide Tree** - UPGMA clustering algorithm
5. **Progressive Alignment** - Align sequences following the guide tree order
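
The first four steps can be sketched in a few functions. This is a minimal illustration, not the package's actual code: the function names (`needleman_wunsch`, `distance`, `distance_matrix`, `upgma_order`), the gap penalty of -1, and the choice of "1 minus fraction of identical columns" as the distance are all assumptions made for the example. Step 5 (progressive alignment) is omitted for brevity.

```python
from itertools import combinations

# MATCH/MISMATCH follow the +1/-1 scheme; GAP = -1 is an assumed value.
MATCH, MISMATCH, GAP = 1, -1, -1


def _sub(x: str, y: str) -> int:
    """Score for aligning residue x with residue y."""
    return MATCH if x == y else MISMATCH


def needleman_wunsch(a: str, b: str) -> tuple[str, str]:
    """Globally align a and b; return the two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * GAP
    for j in range(1, cols):
        score[0][j] = j * GAP
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + _sub(a[i - 1], b[j - 1]),
                score[i - 1][j] + GAP,
                score[i][j - 1] + GAP,
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + _sub(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + GAP:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def distance(a: str, b: str) -> float:
    """Assumed distance: 1 minus the fraction of identical alignment columns."""
    x, y = needleman_wunsch(a, b)
    if not x:
        return 0.0
    return 1.0 - sum(p == q for p, q in zip(x, y)) / len(x)


def distance_matrix(seqs: list[str]) -> list[list[float]]:
    """Symmetric matrix of pairwise distances."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = distance(seqs[i], seqs[j])
    return d


def upgma_order(d: list[list[float]]):
    """UPGMA guide tree as nested tuples of sequence indices (needs >= 1 sequence)."""
    clusters = {i: (i, 1) for i in range(len(d))}  # id -> (subtree, size)
    dist = {(i, j): d[i][j] for i, j in combinations(range(len(d)), 2)}
    next_id = len(d)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to each other cluster is the
        # size-weighted average of the two old distances.
        merged = {
            (k, next_id): (n_i * dist[tuple(sorted((i, k)))]
                           + n_j * dist[tuple(sorted((j, k)))]) / (n_i + n_j)
            for k in clusters
        }
        dist = {key: v for key, v in dist.items() if i not in key and j not in key}
        dist.update(merged)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]
```

Calling `upgma_order(distance_matrix(["GATTACA", "GCATGCU", "AGCTAGC"]))` on the sequences from the example above returns a nested tuple of indices; a progressive aligner would merge sequences in that order, innermost pair first.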

## Differences from original MAFFT

- Uses Needleman-Wunsch instead of FFT for similarity calculation
- Simple +1/-1 scoring matrix instead of advanced substitution matrices
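- Suitable for small datasets only: each Needleman-Wunsch alignment takes time proportional to the product of the two sequence lengths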
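
To make the scoring difference concrete, the sketch below writes the +1/-1 scheme as a lookup table and sets it beside a slightly richer table. Everything here is illustrative: the names `simple_matrix` and `transition_aware_matrix`, the `ACGTU` alphabet, and the transition score of 0 are assumptions for the example, not values used by this package or by MAFFT.

```python
ALPHABET = "ACGTU"  # assumed alphabet covering the letters in the example above
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def simple_matrix(alphabet: str = ALPHABET, match: int = 1, mismatch: int = -1) -> dict:
    """The +1/-1 scheme: every mismatch costs the same."""
    return {(x, y): (match if x == y else mismatch)
            for x in alphabet for y in alphabet}


def transition_aware_matrix(alphabet: str = ALPHABET, match: int = 1,
                            transition: int = 0, transversion: int = -1) -> dict:
    """Hypothetical richer table: substitutions within the same chemical
    class (purine or pyrimidine) are penalized less than those across classes."""
    def pair_score(x: str, y: str) -> int:
        if x == y:
            return match
        same_class = {x, y} <= PURINES or {x, y} <= PYRIMIDINES
        return transition if same_class else transversion

    return {(x, y): pair_score(x, y) for x in alphabet for y in alphabet}


flat = simple_matrix()
rich = transition_aware_matrix()
print(flat[("A", "G")], rich[("A", "G")])  # same pair, scored by each table
```

Because both tables share the same `(x, y) -> score` shape, an aligner that looks scores up in a table rather than hard-coding +1/-1 could switch between them without other changes.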

## Author

Ata Serinyel
