Metadata-Version: 2.4
Name: qa-generator
Version: 0.1.0
Summary: A Python package to generate question-answer pairs from plain text
Home-page: https://github.com/hemanth/qa-generator
Author: QA Generator Team
Author-email: 
License: MIT
Project-URL: Homepage, https://github.com/hemanth/qa-generator
Project-URL: Repository, https://github.com/hemanth/qa-generator
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: nltk>=3.8
Requires-Dist: requests>=2.25.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: black>=22.0.0; extra == "dev"
Requires-Dist: flake8>=4.0.0; extra == "dev"
Provides-Extra: llm
Requires-Dist: transformers>=4.20.0; extra == "llm"
Requires-Dist: torch>=1.12.0; extra == "llm"
Provides-Extra: datasets
Requires-Dist: pandas>=1.3.0; extra == "datasets"
Requires-Dist: datasets>=2.0.0; extra == "datasets"
Requires-Dist: huggingface_hub>=0.10.0; extra == "datasets"
Provides-Extra: advanced
Requires-Dist: spacy>=3.4.0; extra == "advanced"
Provides-Extra: all
Requires-Dist: transformers>=4.20.0; extra == "all"
Requires-Dist: torch>=1.12.0; extra == "all"
Requires-Dist: pandas>=1.3.0; extra == "all"
Requires-Dist: datasets>=2.0.0; extra == "all"
Requires-Dist: huggingface_hub>=0.10.0; extra == "all"
Requires-Dist: spacy>=3.4.0; extra == "all"
Dynamic: home-page
Dynamic: requires-python

# QA Generator

Generate question-answer pairs from text using rule-based and LLM approaches.

## Installation

```bash
pip install qa-generator
```

## Quick Start

### Rule-based Generation
```python
from qa_generator import QAGenerator

qa_gen = QAGenerator()
text = "Python is a programming language created by Guido van Rossum in 1991."
qa_pairs = qa_gen.generate(text)

for question, answer in qa_pairs:
    print(f"Q: {question}")
    print(f"A: {answer}")
```
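To give a feel for what "template questions" can mean, here is a minimal, self-contained sketch of the idea. It is not the package's implementation: the pattern `CREATED_PATTERN` and the helper `template_qa` are hypothetical names, and a single regular expression stands in for real entity recognition.

```python
import re
from typing import List, Tuple

# Hypothetical template: "<subject> is a <description> created by <creator> in <year>."
CREATED_PATTERN = re.compile(
    r"^(?P<subject>.+?) is an? (?P<description>.+?) "
    r"created by (?P<creator>.+?) in (?P<year>\d{4})\.?$"
)


def template_qa(sentence: str) -> List[Tuple[str, str]]:
    """Return (question, answer) pairs if the sentence fits the template."""
    match = CREATED_PATTERN.match(sentence.strip())
    if match is None:
        return []  # no template applies to this sentence
    subject = match["subject"]
    return [
        (f"What is {subject}?", match["description"]),
        (f"Who created {subject}?", match["creator"]),
        (f"When was {subject} created?", match["year"]),
    ]


if __name__ == "__main__":
    sentence = "Python is a programming language created by Guido van Rossum in 1991."
    for question, answer in template_qa(sentence):
        print(f"Q: {question}")
        print(f"A: {answer}")
```

A real rule-based generator would need many such templates plus a tagger or named-entity recognizer to cover varied sentence structures; this sketch only handles one sentence shape.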

### LLM-based Generation
```python
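from qa_generator import LLMQAGenerator

# Source text to generate questions from (same sentence as the rule-based example)
text = "Python is a programming language created by Guido van Rossum in 1991."

qa_gen = LLMQAGenerator(
    api_key="your-api-key",  # replace with your provider's API key
    model="gpt-3.5-turbo"
)
qa_pairs = qa_gen.generate(text, max_pairs=5, difficulty="medium")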
```
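Rather than hardcoding the key in source, it can be read from the environment. The helper below is a sketch under assumptions: `read_api_key` and the variable name `QA_GENERATOR_API_KEY` are hypothetical and are not defined by this package.

```python
import os


def read_api_key(env_var: str = "QA_GENERATOR_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is unset or blank.

    The default variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return api_key
```

The result can then be passed as `api_key=read_api_key()` when constructing `LLMQAGenerator`, which keeps the key out of version control.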

### Easy Provider Setup
```python
from qa_generator import create_qa_generator_from_provider

# Pick one provider; the three calls below are alternatives
qa_gen = create_qa_generator_from_provider("openai")   # OpenAI
qa_gen = create_qa_generator_from_provider("ollama")   # Local Ollama
qa_gen = create_qa_generator_from_provider("together") # Together AI
```

## Dataset Generation
```python
from qa_generator import DatasetGenerator

dataset_gen = DatasetGenerator()
dataset_splits = dataset_gen.generate_comprehensive_dataset(
    total_samples=1000,
    use_llm=True
)

# Export for Hugging Face
file_paths = dataset_gen.export_to_huggingface_format(
    dataset_splits,
    dataset_name="my_qa_dataset"
)
```
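The exact files written by `export_to_huggingface_format` are not described here. As an illustration of what a Hugging Face compatible export can look like, the sketch below splits question-answer pairs into train, validation, and test sets and writes each as JSON Lines, a format the `datasets` library can load with its `json` loader. The helpers `split_pairs` and `write_jsonl`, the split ratios, and the `question`/`answer` field names are assumptions made for this example.

```python
import json
import random
from pathlib import Path
from typing import Dict, List, Sequence, Tuple

QAPair = Tuple[str, str]


def split_pairs(
    pairs: Sequence[QAPair],
    train: float = 0.8,
    validation: float = 0.1,
    seed: int = 0,
) -> Dict[str, List[QAPair]]:
    """Shuffle deterministically and split into train/validation/test."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return {
        "train": shuffled[:n_train],
        "validation": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # whatever remains
    }


def write_jsonl(splits: Dict[str, List[QAPair]], out_dir: str) -> Dict[str, Path]:
    """Write one <split>.jsonl file per split and return the file paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths: Dict[str, Path] = {}
    for name, rows in splits.items():
        path = out / f"{name}.jsonl"
        with path.open("w", encoding="utf-8") as fh:
            for question, answer in rows:
                record = {"question": question, "answer": answer}
                fh.write(json.dumps(record, ensure_ascii=False) + "\n")
        paths[name] = path
    return paths
```

Fixing the seed keeps the splits reproducible across runs, which matters when the same dataset is regenerated and compared.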

## Features

- **Rule-based**: Template questions with entity recognition
- **LLM-based**: High-quality questions via language models
- **Multi-provider**: OpenAI, Ollama, Together, Groq, local endpoints
- **Dataset creation**: Hugging Face compatible exports
- **Difficulty levels**: Easy, medium, hard
- **Question types**: Factual, conceptual, analytical

## Requirements

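Python 3.8+, NLTK 3.8+, requests 2.25.0+

Optional extras, installed with for example `pip install "qa-generator[llm]"`:

- **llm**: transformers, torch
- **datasets**: pandas, datasets, huggingface_hub
- **advanced**: spaCy
- **all**: everything in `llm`, `datasets`, and `advanced`
- **dev**: pytest, black, flake8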
