Metadata-Version: 2.4
Name: eksi-scraper
Version: 0.1.1
Summary: asynchronously scrapes eksisozluk threads and exports to csv or json
Project-URL: Homepage, https://github.com/iberkayC/eksi-scraper
Project-URL: Source, https://github.com/iberkayC/eksi-scraper
Author-email: Ibrahim Berkay Ceylan <ceylaniberkay@gmail.com>
License-Expression: MIT
License-File: LICENSE
Keywords: asyncio,beautifulsoup,csv,curl,data-collection,eksi,eksi-sozluk,eksisozluk,entries,forum,json,scraper,sozluk,turkce,turkish,web-scraping
Classifier: Development Status :: 4 - Beta
Classifier: Framework :: AsyncIO
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: Turkish
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Internet :: WWW/HTTP :: Browsers
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Text Processing :: General
Requires-Python: >=3.10
Requires-Dist: backoff>=2.2
Requires-Dist: beautifulsoup4>=4.12
Requires-Dist: curl-cffi>=0.7
Requires-Dist: lxml>=5.1
Description-Content-Type: text/markdown

# eksi-scraper

fast, asynchronous eksisozluk thread scraper. exports entries to csv or json.

## installation

```bash
uv pip install eksi-scraper
```
or with pip:
```bash
pip install eksi-scraper
```

## usage

```bash
eksi-scraper [-t THREAD [THREAD ...]] [-f FILE] -o {csv|json}
```
you can pass full URLs or just the slug (the part of the URL after the last '/' and before any '?'). for example:

```bash
eksi-scraper -t https://eksisozluk.com/murat-kurum--2582131 https://eksisozluk.com/ekrem-imamoglu--2577439 -o json
```
or using slugs:
```bash
eksi-scraper -t murat-kurum--2582131 ekrem-imamoglu--2577439 -o json
```
or from a file:
```bash
eksi-scraper -f threads.txt -o csv
```

in threads.txt, list threads as full URLs or slugs, one per line:

```
https://eksisozluk.com/murat-kurum--2582131
ekrem-imamoglu--2577439
...
```
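to make the slug rule above concrete, here is a small python sketch of how a slug could be derived from either input form. this is only an illustration of the rule, not the scraper's internal implementation:

```python
from urllib.parse import urlparse

def to_slug(thread: str) -> str:
    """return the slug for a thread given as a full URL or a bare slug."""
    # for a bare slug, urlparse leaves the whole string in .path;
    # for a URL, .path is everything after the host (query string excluded)
    path = urlparse(thread).path
    return path.rsplit("/", 1)[-1]

print(to_slug("https://eksisozluk.com/murat-kurum--2582131?p=2"))
# murat-kurum--2582131
print(to_slug("ekrem-imamoglu--2577439"))
# ekrem-imamoglu--2577439
```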

## output

each entry has the following fields:

| field | description |
|---|---|
| Content | the entry text, with full URLs restored |
| Author | username of the author |
| Date Created | original post date |
| Last Changed | last edit date, or null if never edited |

## contact

reach out to me at ceylaniberkay@gmail.com
