nrp_cmd

NRP Commandline Tools and Python Client

Overview

This package provides a set of libraries and command-line tools for interacting with repositories that conform to the Czech National Repository Platform. Although support currently focuses on InvenioRDM-based repositories, the architecture is designed to be easily extensible to other repository types.

Configuration

The package uses a configuration file stored in the user's home directory under the .nrp folder. You can add a new repository to the configuration by running:

nrp-cmd add repository https://my-repository.org [--alias my-repo]

This command will guide you through setting up the repository URL and any required authentication tokens. If you prefer not to use the built-in configuration mechanism, you can provide the necessary parameters directly to the client.

API

This package offers both synchronous and asynchronous clients for working with the configured repositories. Choose the asynchronous client if you have an asyncio-based application or need higher performance for data transfers. For simpler applications, the synchronous client is often sufficient.

Both clients share the same high-level API, allowing you to switch between them easily (with the appropriate async/await adjustments as needed).
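
To illustrate what those `async`/`await` adjustments typically look like, here is a minimal, self-contained sketch. It does not use the package itself: `TITLES`, `scan_sync`, and `scan_async` are hypothetical stand-ins for a record search, chosen only to show how the same loop is written in both styles.

```python
import asyncio
from typing import AsyncIterator, Iterator

# Hypothetical stand-in data; a real client would query the repository.
TITLES = ["Einstein papers", "Relativity notes"]


def scan_sync(q: str) -> Iterator[str]:
    """Synchronous stand-in: yields matching titles one by one."""
    for title in TITLES:
        if q.lower() in title.lower():
            yield title


async def scan_async(q: str) -> AsyncIterator[str]:
    """Asynchronous stand-in: same results, but yields control between items."""
    for title in TITLES:
        await asyncio.sleep(0)  # simulate waiting on the network
        if q.lower() in title.lower():
            yield title


def run_sync(q: str) -> list[str]:
    # Plain "for" comprehension in synchronous code.
    return [title for title in scan_sync(q)]


async def run_async(q: str) -> list[str]:
    # Same logic; only "async for" and the enclosing coroutine differ.
    return [title async for title in scan_async(q)]


if __name__ == "__main__":
    print(run_sync("einstein"))
    print(asyncio.run(run_async("einstein")))
```

The body of the loop stays the same in both variants; only the iteration keyword and the way the entry point is invoked change.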

Example

The following script searches for records whose metadata contains the word "Einstein" and downloads every matching record to a local directory. By default, data is downloaded over 10 concurrent connections (see nrp_cmd.get_async_client() for more details).

from nrp_cmd import get_async_client
from nrp_cmd.async_client import AsyncRepositoryClient, download

async def run():
    # Look up the repository by its configured alias ...
    client: AsyncRepositoryClient = await get_async_client("my-repo")
    # ... or, alternatively, by its URL:
    # client: AsyncRepositoryClient = await get_async_client("https://my-repository.org")

    async for record in client.records.scan(q="Einstein"):
        print(record.metadata)
        # and store the record together with files in a directory for further processing
        download(record, f"/path/to/download/{record.id}", with_files=True)

if __name__ == "__main__":
    import asyncio
    asyncio.run(run())

See nrp_cmd.async_client.base_client.AsyncRepositoryClient for more details.
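
The idea of capping the number of concurrent connections can be sketched with the standard library alone. The example below is an assumption-laden illustration, not the package's `Limiter` implementation: `MAX_CONNECTIONS`, `fetch`, and `fetch_all` are hypothetical names, and the transfer is simulated with a short sleep.

```python
import asyncio

MAX_CONNECTIONS = 10  # mirrors the default of 10 connections mentioned above


async def fetch(item: int, slots: asyncio.Semaphore, stats: dict[str, int]) -> int:
    """Simulated transfer that holds one slot while it runs."""
    async with slots:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # stand-in for network I/O
        stats["active"] -= 1
    return item


async def fetch_all(count: int) -> tuple[list[int], int]:
    """Start all transfers at once; the semaphore bounds how many run together."""
    slots = asyncio.Semaphore(MAX_CONNECTIONS)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(*(fetch(i, slots, stats) for i in range(count)))
    return list(results), stats["peak"]


if __name__ == "__main__":
    results, peak = asyncio.run(fetch_all(25))
    print(len(results), "transfers finished, peak concurrency:", peak)
```

All tasks are scheduled immediately, but at most `MAX_CONNECTIONS` of them are inside the `async with` block at any time, which is the general behaviour a connection limit is meant to provide.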

from . import async_client, config, types
from .async_client import get_async_client

__all__ = ("get_async_client", "async_client", "config", "types")

async def get_async_client(repository: str | yarl.URL, refresh: bool = False, limiter: nrp_cmd.async_client.connection.Limiter | None = None, config: nrp_cmd.config.Config | None = None) -> nrp_cmd.async_client.AsyncRepositoryClient:

async def get_async_client(
    repository: str | URL,
    refresh: bool = False,
    limiter: Limiter | None = None,
    config: Config | None = None,
) -> AsyncRepositoryClient:
    """
    Get an asynchronous client for the given repository.

    :param repository: the repository alias or URL
    :param refresh: whether to refresh the client configuration from the server
    :param limiter: an optional limiter that caps the number of parallel connections
    :param config: the configuration to use. If not given, the configuration is loaded
        from the configuration file.
    :return: an asynchronous client for the repository
    :raises ValueError: if no client class can handle the repository
    """
    if not config:
        config = Config.from_file()

    repository_config = config.find_repository(repository)
    # Try each available client class and use the first one that accepts the URL.
    for async_client_class in async_client_classes():
        if await async_client_class.can_handle_repository(repository_config.url):
            return await async_client_class.from_configuration(
                repository_config, refresh=refresh, limiter=limiter
            )
    raise ValueError(f"No async client found for repository {repository_config.url}")

Get an asynchronous client for the given repository.

Parameters
  • repository: the repository alias or URL
  • refresh: whether to refresh the client configuration from the server
  • limiter: an optional limiter that caps the number of parallel connections
  • config: the configuration to use. If not given, the configuration is loaded from the configuration file.
Returns

an asynchronous client for the repository
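
The selection logic in the source above (ask each client class whether it can handle the repository, then build the first one that says yes) can be tried out in isolation. The sketch below is a simplified stand-in under stated assumptions: `InvenioLikeClient`, `DemoClient`, `CLIENT_CLASSES`, and `select_client` are hypothetical, the URL checks are plain substring tests, and `from_configuration` takes only a URL rather than a repository configuration.

```python
import asyncio


class InvenioLikeClient:
    """Hypothetical client that accepts only URLs containing 'invenio'."""

    def __init__(self, url: str) -> None:
        self.url = url

    @classmethod
    async def can_handle_repository(cls, url: str) -> bool:
        return "invenio" in url

    @classmethod
    async def from_configuration(cls, url: str) -> "InvenioLikeClient":
        return cls(url)


class DemoClient(InvenioLikeClient):
    """Second hypothetical client, accepting URLs containing 'demo'."""

    @classmethod
    async def can_handle_repository(cls, url: str) -> bool:
        return "demo" in url


CLIENT_CLASSES = [InvenioLikeClient, DemoClient]


async def select_client(url: str, client_classes=CLIENT_CLASSES):
    # Ask each class in turn; the first one that accepts the URL wins.
    for client_class in client_classes:
        if await client_class.can_handle_repository(url):
            return await client_class.from_configuration(url)
    raise ValueError(f"No async client found for repository {url}")


if __name__ == "__main__":
    client = asyncio.run(select_client("https://demo.example.org"))
    print(type(client).__name__)
```

Because the first matching class is returned, the order of the candidate list matters whenever more than one class could accept the same URL.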