Metadata-Version: 2.4
Name: chronowords
Version: 0.2.0
Summary: Detect semantic shifts in word embeddings over time
Author-email: Orsolya Putz <orsolya.putz@crowintelligence.org>, Zoltan Varju <zoltan.varju@crowintelligence.org>
License-Expression: MIT
Project-URL: Homepage, https://github.com/crow-intelligence/chronowords
Project-URL: Repository, https://github.com/crow-intelligence/chronowords
Project-URL: Documentation, https://chronowords.readthedocs.io
Keywords: nlp,embeddings,semantic-change,topic-modeling
Requires-Python: <3.13,>=3.10
Description-Content-Type: text/markdown
Requires-Dist: numpy<3,>=1.26.0
Requires-Dist: scipy<2,>=1.12.0
Requires-Dist: cython<4,>=3.0.11
Requires-Dist: setuptools>=75.8.0
Requires-Dist: mmh3<6,>=5.0.1
Requires-Dist: nltk<4,>=3.9.1
Requires-Dist: scikit-learn<2,>=1.6.1

# chronowords

Detect semantic shifts over time in word embeddings. Train small PPMI-based language models, create topic models using NMF, and analyze semantic changes using Procrustes alignment.

## Features

- Memory-efficient word embedding training using Count-Min Sketch
- Topic modeling with Non-negative Matrix Factorization
- Temporal alignment of word embeddings using Procrustes analysis
- Cython-optimized PPMI matrix computation
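
To make the PPMI step concrete, here is a minimal NumPy sketch. It is not the chronowords implementation, which counts with a Count-Min Sketch and computes the matrix in Cython. The function names `compute_ppmi` and `svd_embeddings`, the toy counts, and the square-root weighting of singular values are assumptions made for this example.

```python
import numpy as np


def compute_ppmi(cooc: np.ndarray) -> np.ndarray:
    """Return the positive PMI matrix for a dense co-occurrence count matrix."""
    total = cooc.sum()
    row_sums = cooc.sum(axis=1, keepdims=True)
    col_sums = cooc.sum(axis=0, keepdims=True)
    # Counts expected if words and contexts were independent.
    expected = row_sums @ col_sums / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(cooc / expected)
    # Zero counts give -inf or NaN; treat them as no association.
    pmi[~np.isfinite(pmi)] = 0.0
    return np.maximum(pmi, 0.0)


def svd_embeddings(ppmi: np.ndarray, n_components: int) -> np.ndarray:
    """Reduce a PPMI matrix to dense word vectors with a truncated SVD."""
    u, s, _ = np.linalg.svd(ppmi, full_matrices=False)
    # Weighting by sqrt(s) is one common choice, not the only one.
    return u[:, :n_components] * np.sqrt(s[:n_components])


# Hypothetical symmetric co-occurrence counts for a three-word vocabulary.
counts = np.array(
    [
        [0.0, 4.0, 1.0],
        [4.0, 0.0, 2.0],
        [1.0, 2.0, 0.0],
    ]
)
ppmi = compute_ppmi(counts)
vectors = svd_embeddings(ppmi, n_components=2)
```

A dense matrix only works for toy vocabularies; real corpora call for sparse matrices and approximate counting.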

## Installation

```bash
pip install chronowords
```

## Quick Start
```python
from chronowords.algebra import SVDAlgebra
from chronowords.topics import TopicModel

# Train word embeddings
# (`your_corpus_iterator` is a placeholder for your own corpus)
model = SVDAlgebra(n_components=300)
model.train(your_corpus_iterator)

# Find similar words
similar = model.most_similar('computer')
for word in similar:
    print(f"{word.word}: {word.similarity:.3f}")

# Create topic model
# (`ppmi_matrix` and `vocabulary` are placeholders; they are not defined in this snippet)
topic_model = TopicModel(n_topics=10)
topic_model.fit(ppmi_matrix, vocabulary)
```
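
The Quick Start above does not cover the alignment step. As an illustration of the idea only, the sketch below uses SciPy directly instead of the chronowords API: it rotates one embedding matrix onto another with orthogonal Procrustes and scores each word by cosine distance. The helper names `align` and `cosine_shift` and the random toy matrices are assumptions made for this example. Row `i` must refer to the same word in both matrices.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes


def align(source: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Rotate `source` into the space of `target` (rows are shared words)."""
    rotation, _ = orthogonal_procrustes(source, target)
    return source @ rotation


def cosine_shift(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Return 1 - cosine similarity for each pair of corresponding rows."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - np.sum(a_unit * b_unit, axis=1)


# Toy data: the "late" space is the "early" space under a random
# orthogonal transform, so no word has actually changed meaning.
rng = np.random.default_rng(0)
early = rng.normal(size=(5, 3))
transform, _ = np.linalg.qr(rng.normal(size=(3, 3)))
late = early @ transform

aligned = align(early, late)
shifts = cosine_shift(aligned, late)  # close to zero for every word
```

With real embeddings from two time periods, words with the largest values in `shifts` are candidates for semantic change.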

## Documentation

Full documentation is available on [Read the Docs](https://chronowords.readthedocs.io).

## Requirements

- Python 3.10 to 3.12 (`>=3.10,<3.13`)
- NumPy ≥ 1.26.0
- SciPy ≥ 1.12.0
- scikit-learn ≥ 1.6.1
- Cython ≥ 3.0.11
- mmh3 ≥ 5.0.1
- NLTK ≥ 3.9.1

## Contributing
Pull requests welcome. For major changes, open an issue first.

## License
MIT
