Metadata-Version: 2.4
Name: meso-shqip-ai
Version: 0.1.0
Summary: An Albanian educational platform with AI-powered dictionary, book discussions, and quizzes.
Author: Arjon Fejzullahu
License: MIT
Project-URL: Homepage, https://github.com/ArjonFejzullahu/meso-shqip-AI
Project-URL: Repository, https://github.com/ArjonFejzullahu/meso-shqip-AI
Keywords: albanian,education,streamlit,openai,quiz,dictionary,epub
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: beautifulsoup4>=4.12
Requires-Dist: datasets>=4.5.0
Requires-Dist: ebooklib>=0.18
Requires-Dist: lxml>=4.9
Requires-Dist: openai>=1.0.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: streamlit>=1.30.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0.0; extra == "dev"
Dynamic: requires-python

# Mëso Shqip me AI

`Mëso Shqip me AI` is a Python-based educational platform for learning Albanian through vocabulary exploration, literary discussion, and quiz practice. The project combines a reusable Python package with a Streamlit interface so it works both as an application and as a codebase that can be extended or packaged further. It also supports persistent AI book discussions and retrieval-based AI responses grounded in selected EPUB content.

## Overview

The platform helps users:

- search Albanian dictionary entries
- get simple AI explanations for dictionary words
- explore Albanian literary works in EPUB format
- chat with AI about selected books
- practice through multiple quiz modes

## Motivation

Learning Albanian vocabulary and literature often means jumping between separate tools: a dictionary, reading material, notes, and practice exercises. This project brings those pieces together in one Python application so users can look up words, understand meanings, discuss books, and reinforce learning through quizzes in a single workflow.

## Main Features

- Albanian dictionary search from `data/dictionary.csv`
- AI explanation for dictionary words in simple Albanian
- EPUB book assistant for Albanian literary works
- AI chat about selected books using extracted book context
- Quiz modes:
  - `Kuiz i rastësishëm`
  - `Kuiz për libra`
  - `Anglisht → Shqip`

## Technologies Used

- Python
- Streamlit
- OpenAI API
- pandas
- datasets
- EbookLib
- BeautifulSoup
- pytest
- Git
- GitHub

## Project Structure

```text
.
├── data/
│   ├── books/
│   └── dictionary.csv
├── meso_shqip_ai/
│   ├── __init__.py
│   ├── ai.py
│   ├── app_logic.py
│   ├── books.py
│   ├── dictionary.py
│   ├── quiz.py
│   └── utils.py
├── tests/
│   ├── test_books.py
│   ├── test_dictionary.py
│   └── test_quiz.py
├── download_dataset.py
├── pyproject.toml
├── requirements.txt
├── README.md
└── streamlit_app.py
```

## Architecture

The project is organized in five layers:

- UI layer: `streamlit_app.py` handles the Streamlit interface and user interactions.
- Package/business logic layer: `meso_shqip_ai/` contains reusable modules for dictionary lookup, book processing, quizzes, and app orchestration.
- Data layer: `data/dictionary.csv` and `data/books/*.epub` provide the local learning content.
- Retrieval layer: EPUB books are split into text chunks and relevant sections are retrieved before sending context to the AI model.
- AI layer: `meso_shqip_ai/ai.py` integrates the OpenAI API for explanations, book discussion, and quiz generation.
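
The retrieval layer described above can be illustrated with a minimal, standard-library sketch. It is not the project's actual implementation: the function names `split_into_chunks` and `retrieve`, the word-based chunk sizes, and the keyword-overlap scoring are all assumptions chosen for illustration.

```python
import re


def split_into_chunks(text, chunk_size=80, overlap=20):
    """Split text into overlapping chunks of up to chunk_size words (hypothetical helper)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def tokenize(text):
    """Return the set of lowercase word tokens in text."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(chunks, query, top_k=3):
    """Return up to top_k chunks that share the most words with the query."""
    query_terms = tokenize(query)
    scored = []
    for index, chunk in enumerate(chunks):
        score = len(query_terms & tokenize(chunk))
        if score > 0:
            scored.append((score, index, chunk))
    # Highest overlap first; ties keep the original book order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [chunk for _, _, chunk in scored[:top_k]]


if __name__ == "__main__":
    passages = [
        "Skënderbeu mbrojti Krujën",
        "Deti ishte i qetë",
        "Kruja dhe Skënderbeu",
    ]
    # Only the retrieved passages would be sent to the AI model as context.
    print(retrieve(passages, "Skënderbeu Kruja", top_k=2))
```

Plain keyword overlap keeps the sketch dependency-free. An embedding-based scorer could replace `retrieve` without changing how chunks are produced.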

## Setup

1. Create a virtual environment:

```bash
python3 -m venv .venv
source .venv/bin/activate
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Add a `.env` file for OpenAI-powered features:

```env
OPENAI_API_KEY=your_openai_api_key_here
```

`.env` is required for OpenAI API usage. Do not commit API keys to the repository.

4. Download the dictionary dataset if needed:

```bash
python download_dataset.py
```

5. Run tests:

```bash
pytest
```

6. Run the Streamlit app:

```bash
streamlit run streamlit_app.py
```

## Testing

Run the full test suite with:

```bash
pytest
```

The automated tests currently cover:

- dictionary search behavior
- Albanian character normalization
- book processing
- quiz generation
- quiz repetition avoidance logic
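
As an illustration of what Albanian character normalization can involve, here is a minimal sketch. The package's own normalization code is not shown in this README, so the function name `normalize_albanian` and the approach (Unicode decomposition followed by stripping combining marks) are assumptions, not the project's implementation.

```python
import unicodedata


def normalize_albanian(text):
    """Lowercase text and strip diacritics so that 'ë' and 'ç' compare equal to 'e' and 'c'."""
    # NFD splits each accented letter into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFD", text.lower())
    # Dropping the combining marks leaves only the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(normalize_albanian("Çelës"))
    print(normalize_albanian("LIBËR"))
```

Normalizing both the query and the dictionary headwords this way would let a search for `liber` match `libër`, which helps users typing on keyboards without Albanian characters.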

## Python Package Usage

The core logic is reusable outside the Streamlit app.

Dictionary example:

```python
from meso_shqip_ai.dictionary import DictionaryManager

manager = DictionaryManager("data/dictionary.csv")
results = manager.search("libër")
print(results[:3])
```

Books example:

```python
from meso_shqip_ai.books import BookManager

manager = BookManager("data/books")
manager.load_books()
matches = manager.search("Skënderbeu")
print(matches[:2])
```

Quiz example:

```python
from meso_shqip_ai.quiz import QuizManager

quiz_manager = QuizManager()
questions = quiz_manager.create_random_quiz(num_questions=4)
print(questions[0])
```

## PyPI Packaging Note

The project is structured so that the core functionality lives inside the `meso_shqip_ai` Python package rather than only inside the Streamlit script. That makes it suitable for reuse in notebooks, scripts, future APIs, and eventual PyPI packaging with minimal restructuring.

## Future Improvements

- React/Tailwind frontend on top of the existing Python backend
- semantic search using embeddings, backed by Qdrant or another vector database, over books and dictionary content
- more Albanian books in EPUB format
- more quiz types and adaptive learning modes
- personalized learning progress
- spaced repetition quiz systems

## Notes

- OpenAI-dependent features require a valid `.env` file with `OPENAI_API_KEY`.
- The code contains no API keys. Keep secrets in `.env` and out of version control.
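
A small check like the following can confirm that the key is visible before any OpenAI-dependent feature is used. It is a sketch under assumptions: the helper name `read_openai_key` is hypothetical, and it assumes the `.env` file has already been loaded into the process environment (for example by `python-dotenv`, which the project lists as a dependency).

```python
import os

# Placeholder value taken from the example .env file in the Setup section.
PLACEHOLDER = "your_openai_api_key_here"


def read_openai_key(environ=None):
    """Return the configured API key, or None if it is missing or still the placeholder."""
    environ = os.environ if environ is None else environ
    key = environ.get("OPENAI_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        return None
    return key


if __name__ == "__main__":
    if read_openai_key() is None:
        print("OPENAI_API_KEY is not set; AI features will be unavailable.")
    else:
        print("OPENAI_API_KEY found.")
```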
