Metadata-Version: 2.4
Name: csv-citation-counter
Version: 0.1.0
Summary: Summarize CSV publication and citation data by column.
Author: Tiger Deng
License-Expression: MIT
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown

# 📊 CSV Citation Counter

This Python script processes one or more CSV files to summarize publication and citation data by a chosen column — such as journal names, author names, or institutions.

It supports:

- ✅ Counting how many times each entry appears (e.g., number of articles per journal)
- ✅ Summing citations per entry using a `"Cited by"` column
- ✅ Calculating average and max citations per entry
- ✅ Sorting results by:
  - Number of articles (`articles`)
  - Total citations (`total`)
  - Average citations per article (`avg`)
- ✅ Handling multi-entry fields (e.g., multiple authors separated by `; `)
- ✅ Writing clean summaries to a human-readable `summary.txt`
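
The statistics listed above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name `summarize`, the dictionary layout, and the choice to treat a missing or non-numeric `"Cited by"` value as 0 are all assumptions.

```python
import csv
import io
from collections import defaultdict


def summarize(rows, column, split=None, cited_col="Cited by"):
    """Aggregate article counts and citation stats per entry (hypothetical helper)."""
    stats = defaultdict(lambda: {"articles": 0, "total": 0, "max": 0})
    for row in rows:
        raw = (row.get(column) or "").strip()
        if not raw:
            continue
        # Multi-entry fields are split on the delimiter; otherwise keep the field whole.
        entries = raw.split(split) if split else [raw]
        try:
            cites = int((row.get(cited_col) or "0").strip() or 0)
        except ValueError:
            cites = 0  # assumption: non-numeric citation counts are treated as 0
        for entry in entries:
            entry = entry.strip()
            if not entry:
                continue
            s = stats[entry]
            s["articles"] += 1
            s["total"] += cites
            s["max"] = max(s["max"], cites)
    for s in stats.values():
        s["avg"] = s["total"] / s["articles"]
    return dict(stats)


# Tiny in-memory CSV standing in for a real export file.
sample = io.StringIO("Authors,Cited by\nA; B,10\nA,4\n")
result = summarize(csv.DictReader(sample), "Authors", split="; ")
print(result["A"])  # {'articles': 2, 'total': 14, 'max': 10, 'avg': 7.0}
```

Note that in this sketch each co-author of a paper is credited with the paper's full citation count.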

---

## 📦 Requirements

No external libraries required; the script uses only the Python 3 standard library (Python 3.8 or newer).

---

## 🚀 How to Run the Script

You can run the script in **two ways**:

---

### ✅ Option 1: Command Line Arguments

```bash
python script.py \
  --files data/file1.csv data/file2.csv \
  --column "Authors" \
  --split "; " \
  --sortby avg
```
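
For readers curious how flags like these are typically handled, here is a hedged sketch using `argparse`. The flag names and the `articles` default come from this README; the `build_parser` function, the help strings, and the exact option settings are assumptions and may differ from the real script.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown above (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(
        description="Summarize CSV publication and citation data by column."
    )
    parser.add_argument("--files", nargs="+", help="CSV files or folders to read")
    parser.add_argument("--column", help="column to group by, e.g. Journal or Authors")
    parser.add_argument("--split", default=None,
                        help="delimiter for multi-entry fields, e.g. '; '")
    parser.add_argument("--sortby", choices=["articles", "total", "avg"],
                        default="articles", help="sort key for the summary")
    return parser


# Parse the same arguments as the command-line example above.
args = build_parser().parse_args([
    "--files", "data/file1.csv", "data/file2.csv",
    "--column", "Authors",
    "--split", "; ",
    "--sortby", "avg",
])
print(args.files, args.column, repr(args.split), args.sortby)
```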

### ✅ Option 2: Interactive Mode

You'll be prompted for the following:

##### File paths or folders
- Enter one path per line (a CSV file or a folder).
- Press Enter on an empty line when you're done.
- All `.csv` files in a folder will be included.

##### Column name to analyze
For example, `Journal` or `Authors`.

##### Delimiter (optional)
Use this if each CSV entry holds multiple elements (such as authors).
- Leave blank for single-entry fields.
- Include any space that is part of the separator (for example, `"; "`).

##### Sort method (optional)
- `articles` → by number of articles (default)
- `total` → by total citations
- `avg` → by average citations per article
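
The path handling in interactive mode (a folder expands to the `.csv` files it contains) could look something like the sketch below. The helper name `collect_csv_files` is hypothetical, and the non-recursive, case-insensitive matching is an assumption; the script may behave differently for nested folders or unusual extensions.

```python
from pathlib import Path


def collect_csv_files(paths):
    """Expand a list of file/folder paths into CSV file paths (hypothetical helper)."""
    files = []
    for raw in paths:
        path = Path(raw)
        if path.is_dir():
            # Assumption: only the folder's top level is scanned, not subfolders.
            files.extend(sorted(
                p for p in path.iterdir()
                if p.is_file() and p.suffix.lower() == ".csv"
            ))
        elif path.is_file() and path.suffix.lower() == ".csv":
            files.append(path)
        # Paths that do not exist or are not CSV files are silently skipped here.
    return files
```

For example, `collect_csv_files(["data", "extra/file3.csv"])` would return every `.csv` file directly inside `data` followed by `extra/file3.csv`, provided those paths exist.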

### Output
The summary is written to `summary.txt` in the directory where the script was run. Blank lines separate groups of entries that have different values for the sort attribute.
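
One way to produce that grouping is to sort the entries and then use `itertools.groupby` on the sort key. The sketch below is illustrative only: the `format_summary` name, the descending order, and the per-line text format are assumptions, not the script's actual output format.

```python
from itertools import groupby


def format_summary(stats, sortby="articles"):
    """Render stats as text, with a blank line between sort-value groups."""
    # Assumption: highest values are listed first.
    ordered = sorted(stats.items(), key=lambda kv: kv[1][sortby], reverse=True)
    lines = []
    for i, (_, group) in enumerate(groupby(ordered, key=lambda kv: kv[1][sortby])):
        if i:
            lines.append("")  # blank line between groups with different sort values
        for name, s in group:
            lines.append(f"{name}: {s['articles']} articles, {s['total']} citations")
    return "\n".join(lines)


demo = {
    "A": {"articles": 2, "total": 14},
    "B": {"articles": 1, "total": 10},
    "C": {"articles": 2, "total": 3},
}
print(format_summary(demo, sortby="articles"))
```

Here `A` and `C` share an article count of 2 and appear together, followed by a blank line and then `B`.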



