Metadata-Version: 2.4
Name: invenio-analytics-importer
Version: 0.2.0
Summary: Import analytics from 3rd-party provider (Matomo) into InvenioRDM instance.
Author-email: Northwestern University <DL_FSM_GDS@e.northwestern.edu>
License: MIT License
        
        Copyright (C) 2025 Northwestern University.
        
        Permission is hereby granted, free of charge, to any person obtaining a copy of
        this software and associated documentation files (the "Software"), to deal in
        the Software without restriction, including without limitation the rights to
        use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
        of the Software, and to permit persons to whom the Software is furnished to do
        so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Repository, https://github.com/galterlibrary/invenio-analytics-importer
Keywords: invenio,inveniordm,rdm,analytics,statistics,matomo
Classifier: Environment :: Web Environment
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Topic :: Internet :: WWW/HTTP :: Dynamic Content
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx[cli]>=0.28.1
Requires-Dist: invenio-app-rdm<14.0.0,>=12.0.0
Requires-Dist: click
Requires-Dist: flask
Requires-Dist: invenio_db
Requires-Dist: invenio-rdm-records
Requires-Dist: invenio-search
Requires-Dist: sqlalchemy
Provides-Extra: dev
Requires-Dist: check-manifest>=0.49; extra == "dev"
Requires-Dist: invenio-search[opensearch2]; extra == "dev"
Requires-Dist: invoke<3.0,>=2.2; extra == "dev"
Requires-Dist: pytest-asyncio>=1.2.0; extra == "dev"
Requires-Dist: pytest-invenio>=3.4.2; extra == "dev"
Requires-Dist: time-machine<3.0.0,>=2.12.0; extra == "dev"
Dynamic: license-file

# README

CLI tool to retrieve analytics from a provider and ingest them into the current InvenioRDM instance.

For now, only Matomo is supported. If another provider is added, a slight refactor to support it will be planned.

## Install

```bash
pip install invenio-analytics-importer
```

## Usage

**Retrieve analytics**

```bash
pipenv run invenio analytics_importer retrieve [--views|--downloads] --from <YYYY-MM> --to <YYYY-MM> --output-dir <path>/<to>/<data>/
```

If neither `--views` nor `--downloads` is passed, views will be the default. If both are passed, the last one on the CLI will be chosen.
`--from` and `--to` are inclusive year-month dates.
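
To illustrate what "inclusive" means here, the following minimal sketch lists every year-month covered by a `--from`/`--to` pair. The helper `months_inclusive` is hypothetical and is not part of this package; it only mirrors the documented semantics.

```python
def months_inclusive(start, end):
    """Return every "YYYY-MM" from start to end, both endpoints included."""
    start_year, start_month = (int(part) for part in start.split("-"))
    end_year, end_month = (int(part) for part in end.split("-"))
    # Count months from year 0 so that year boundaries need no special case.
    first = start_year * 12 + (start_month - 1)
    last = end_year * 12 + (end_month - 1)
    if first > last:
        raise ValueError("start must not be after end")
    return [
        f"{index // 12:04d}-{index % 12 + 1:02d}"
        for index in range(first, last + 1)
    ]


print(months_inclusive("2024-11", "2025-02"))
# ['2024-11', '2024-12', '2025-01', '2025-02']
```

Passing the same value for both options, e.g. `--from 2024-08 --to 2024-08`, therefore covers exactly one month.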

The command writes the downloaded analytics to files, one per year-month. Each file has the following structure:

```json
{
    "2024-08-01": [
        {
            // 1 "raw" analytics entry from provider
            // corresponding to 1 URL
        },
        // ...
    ],
    "2024-08-02": [
    // ...
    ],
    // ...
}
```
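
As a quick sanity check on a retrieved file, the sketch below counts the entries stored under each day. It assumes the files on disk are plain JSON (the comments in the example above are only illustrative). The function name `entries_per_day` and the sample entry fields are placeholders, not part of this package or of the provider's schema.

```python
import json
import tempfile
from pathlib import Path


def entries_per_day(path):
    """Map each "YYYY-MM-DD" key of an analytics file to its entry count."""
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    return {day: len(entries) for day, entries in data.items()}


# Stand-in file for demonstration; the entry fields are placeholders,
# not the provider's real schema.
sample = {
    "2024-08-01": [{"placeholder": "entry A"}, {"placeholder": "entry B"}],
    "2024-08-02": [],
}
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.json"
    sample_path.write_text(json.dumps(sample), encoding="utf-8")
    counts = entries_per_day(sample_path)

print(counts)  # {'2024-08-01': 2, '2024-08-02': 0}
```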

**Ingest**

```bash
pipenv run invenio analytics_importer ingest [--views|--downloads] -f <analytics file 1> -f <analytics file 2> ...
```

Analytics files have the shape described above, although there is no
requirement for one file to correspond to one month. However, each date key
(`YYYY-MM-DD`) in `"YYYY-MM-DD": [...analytics...]` is assumed, and required,
to be unique across all files passed.
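
Since date keys must not repeat across files, it can help to check a set of files before ingesting them. The following is a minimal sketch of such a check, assuming the files are plain JSON; `find_duplicate_days` is a hypothetical helper and not something this package provides.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def find_duplicate_days(paths):
    """Return {day: [file names]} for every day key found in more than one file."""
    files_by_day = defaultdict(list)
    for path in paths:
        with Path(path).open(encoding="utf-8") as handle:
            # Iterating the loaded dict yields its "YYYY-MM-DD" keys.
            for day in json.load(handle):
                files_by_day[day].append(Path(path).name)
    return {day: names for day, names in files_by_day.items() if len(names) > 1}


# Two stand-in files that overlap on 2024-08-02.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "a.json"
    second = Path(tmp) / "b.json"
    first.write_text(json.dumps({"2024-08-01": [], "2024-08-02": []}), encoding="utf-8")
    second.write_text(json.dumps({"2024-08-02": [], "2024-08-03": []}), encoding="utf-8")
    duplicates = find_duplicate_days([first, second])

print(duplicates)  # {'2024-08-02': ['a.json', 'b.json']}
```

An empty result means the files satisfy the uniqueness requirement and can be passed together with repeated `-f` options.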
