Metadata-Version: 2.4
Name: dv-api-metrics
Version: 0.2.8
Summary: Python code to collect metrics from datasets in Dataverse collections
Project-URL: Homepage, https://github.com/yourusername/dv-api-metrics
Project-URL: Repository, https://github.com/yourusername/dv-api-metrics
Project-URL: Documentation, https://github.com/yourusername/dv-api-metrics#readme
Author-email: Your Name <your.email@example.com>
License: MIT
License-File: LICENSE
Keywords: api,cafe,dataverse,metrics
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Requires-Dist: certifi>=2025.6.15
Requires-Dist: charset-normalizer==3.4.2
Requires-Dist: idna==3.10
Requires-Dist: jinja2>=3.1.6
Requires-Dist: mako==1.3.10
Requires-Dist: markdown==3.8.2
Requires-Dist: markupsafe==3.0.2
Requires-Dist: pdoc3==0.11.6
Requires-Dist: pdoc==15.0.4
Requires-Dist: polars==1.31.0
Requires-Dist: pygments==2.19.2
Requires-Dist: python-dotenv==1.1.1
Requires-Dist: requests==2.32.4
Requires-Dist: urllib3==2.5.0
Provides-Extra: dev
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mako>=1.3.0; extra == 'docs'
Requires-Dist: markdown>=3.8.0; extra == 'docs'
Requires-Dist: pdoc3>=0.11.0; extra == 'docs'
Requires-Dist: pdoc>=15.0.0; extra == 'docs'
Requires-Dist: pygments>=2.19.0; extra == 'docs'
Description-Content-Type: text/markdown

# dv-api-metrics
- Python code to collect metrics from datasets in [Dataverse](https://dataverse.org/) collections.
- Used primarily to support the CAFE project.
- Note: This library will become obsolete once the Dataverse Hub supports more metrics, including dataset download metrics.

## Objectives
- **Track collection change over time**. Collection change could include the number of unique depositors and the number of datasets created.
- **Track dataset engagement over time**. Dataset engagement could include dataset and file download counts, as well as dataset citations (where possible).

## Desired Metrics
- Number of datasets created per month
- Number of collections per month
- Number of dataset downloads per month
- Dataset keyword frequency per month
- Number of (unique) depositors per month
- Number of file downloads per month per dataset
- Make Data Count (MDC) metrics
    - Number of Make Data Count unique downloads per month
    - Number of Make Data Count unique views per dataset
    - Number of Make Data Count citations per dataset
- Total number of harvested datasets
- Number of datasets per subject
- Number of datasets per Geographic Coverage Country / Nation
- Number of datasets per keyword term (available via the UI facets, so it should also be available through the Search API)
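
Several of the metrics above are "per month" counts. If a source reports running (cumulative) totals per month instead, the per-month values can be recovered by differencing. The sketch below assumes the input is a mapping from `YYYY-MM` strings to cumulative totals. The function name `monthly_increments` and the sample numbers are hypothetical and are not part of this package.

```python
def monthly_increments(cumulative):
    """Convert running totals keyed by 'YYYY-MM' into per-month counts.

    Assumption: values are cumulative totals. 'YYYY-MM' strings sort
    chronologically, so a plain sort gives the right order.
    """
    increments = {}
    previous = 0
    for month in sorted(cumulative):
        increments[month] = cumulative[month] - previous
        previous = cumulative[month]
    return increments


# Hypothetical sample data, for illustration only.
totals = {"2024-01": 10, "2024-02": 14, "2024-03": 14}
print(monthly_increments(totals))
# {'2024-01': 10, '2024-02': 4, '2024-03': 0}
```

Note that the first month's value equals its running total, because it includes everything created before the series starts.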

## Additional Metrics
- Number of harvested dataset engagements (Note: not available via APIs)

## Desired Formats
Open questions:
- Are reports preferred to raw data outputs?
    - Raw data (e.g., outputs from API queries)
    - Reports (e.g., summaries of API query output)

## Technical Limitations
- Metrics will be collected using existing Dataverse Metrics API endpoints or Native API endpoints.
- Make Data Count metrics are only available from 2020-09 to the present.
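
As a rough illustration of these two constraints, the sketch below builds a monthly time-series URL and lists the months for which Make Data Count data can be requested. It assumes the `/api/info/metrics/{type}/monthly` route and the `parentAlias` query parameter described in the Dataverse Metrics API guide; check both against the Dataverse version of your installation. The helper names, the server URL, and the `cafe` alias are placeholders, not part of this package.

```python
from datetime import date
from urllib.parse import urlencode

# First month with Make Data Count data, per the limitation noted above.
MDC_START = (2020, 9)


def monthly_metric_url(base_url, metric, parent_alias=None):
    """Build a Metrics API monthly URL (route assumed from the Dataverse guide)."""
    url = f"{base_url.rstrip('/')}/api/info/metrics/{metric}/monthly"
    if parent_alias:
        # Restrict the metric to one collection (assumed parameter name).
        url += "?" + urlencode({"parentAlias": parent_alias})
    return url


def mdc_months(end):
    """List 'YYYY-MM' strings from 2020-09 through the month of `end`."""
    year, month = MDC_START
    months = []
    while (year, month) <= (end.year, end.month):
        months.append(f"{year:04d}-{month:02d}")
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months


# Placeholder server and collection alias, for illustration only.
print(monthly_metric_url("https://demo.dataverse.org", "datasets", "cafe"))
print(mdc_months(date(2021, 1, 15)))
```

Keeping URL construction separate from the HTTP call makes the request logic easy to test without network access.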
