Metadata-Version: 2.4
Name: mgnipy
Version: 0.1.1
Summary: MGni.py: A Python Wrapper for the MGnify API
Author: Angel L. P., chrisAta, MGS-sails, mberacochea, SandyRogers, Sebastian Ayala Ruano, Alberto Santos, Robert Finn, MGnify
License-Expression: GPL-3.0-or-later
Project-URL: Homepage, https://mgnipy.mgnify.org/
Project-URL: GitHub, https://github.com/EBI-Metagenomics/mgnipy
Project-URL: Documentation, https://mgnipy.mgnify.org/
Project-URL: Bug Tracker, https://github.com/EBI-Metagenomics/mgnipy/issues
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: aiofiles>=25.1.0
Requires-Dist: anndata>=0.10.9
Requires-Dist: async-lru>=2.1.0
Requires-Dist: attrs>=25.4.0
Requires-Dist: bigtree[image,rich]>=1.3.1
Requires-Dist: httpx>=0.28.1
Requires-Dist: ijson>=3.5.0
Requires-Dist: numpy>=2.0.2
Requires-Dist: pandas>=2.3.3
Requires-Dist: platformdirs>=4.5.1
Requires-Dist: polars[rtcompat]>=1.39.3
Requires-Dist: pydantic>=2.12.5
Requires-Dist: pydantic-settings>=2.14.0
Requires-Dist: python-dateutil>=2.9.0.post0
Requires-Dist: pyyaml>=6.0.3
Requires-Dist: requests>=2.32.5
Requires-Dist: scikit-bio>=0.7.2
Requires-Dist: scipy>=1.13.1
Requires-Dist: tqdm>=4.67.1
Dynamic: license-file

# MGni.py

MGni.py (pronounced MAG-nee-pie) is a Python wrapper for the [MGnify API](https://www.ebi.ac.uk/metagenomics/api/docs/). It provides a high-level, Pythonic interface to query metagenomics data and metadata from the MGnify database.

## Features

- **Simple, Pythonic API** — Query MGnify studies, samples, analyses, etc. using an intuitive syntax
- **Sync and Async support** — Built on `httpx` with async/await support
- **Data export** — Multiple output formats including pandas and polars DataFrames
- **Caching** — Option for disk caching to reduce redundant API calls and allow resuming
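
To illustrate what disk caching buys you, here is a minimal standard-library sketch of the general idea. It is not MGni.py's implementation: the names `cached_fetch` and `fake_fetch`, the example URL, and the cache layout are all assumptions made for illustration. Each response is stored as a JSON file keyed by a hash of the request URL, so a repeated or resumed run reads from disk instead of calling the API again.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def cached_fetch(url: str, fetch: Callable[[str], dict], cache_dir: Path) -> dict:
    """Return the JSON payload for `url`, reading from disk when available."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so it can serve as a safe file name
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.json"
    if cache_file.exists():
        # Cache hit: no network call is made
        return json.loads(cache_file.read_text(encoding="utf-8"))
    payload = fetch(url)
    cache_file.write_text(json.dumps(payload), encoding="utf-8")
    return payload


# Hypothetical stand-in for a real HTTP call; records every invocation
calls = []


def fake_fetch(url: str) -> dict:
    calls.append(url)
    return {"url": url, "items": [1, 2, 3]}


cache_dir = Path(tempfile.mkdtemp()) / "cache"
first = cached_fetch("https://example.org/studies?page=1", fake_fetch, cache_dir)
second = cached_fetch("https://example.org/studies?page=1", fake_fetch, cache_dir)
```

The second call returns the same payload without invoking `fake_fetch` again, which is also what makes an interrupted run resumable.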

## Available API Endpoints

- **Studies**: MGnify studies (collections of samples, runs, assemblies and analyses derived from ENA studies/projects).
- **Samples**: MGnify samples (based on ENA/BioSamples; individual biological samples).
- **Runs**: Sequencing runs (ENA run accessions; individual sequencing runs of a sample).
- **Assemblies**: Metagenome assemblies (equivalent to ENA assemblies for one or more runs).
- **Analyses**: Pipeline analyses (results of running MGnify pipelines on runs or assemblies; includes taxonomic and functional annotations).
- **Publications**: Publications that describe or analyse MGnify Studies/datasets.
- **Genomes**: Annotated draft genomes (isolates or MAGs) arranged in biome-specific catalogues.
- **Biomes**: All biomes in the MGnify database.

### Note on private data

- To access your private data through any of these API endpoints, you only need your MGnify username and password, which are used to obtain a valid sliding auth token via the [MGnify Authentication endpoints](https://www.ebi.ac.uk/metagenomics/api/v2/#/Authentication/token_obtain_sliding).
- For example, you can put your login credentials in a `.env` file in your working directory (see [.env.example](https://github.com/EBI-Metagenomics/mgnipy/blob/a9dfdfbb3f669569473e11c7a7c9cf460e6c7d11/.env.example)); `mgnipy.MGnipyConfig` then takes care of obtaining and caching the auth token, so you can easily access your private data using MGni.py 🎉


## Installation

### From PyPI

```bash
pip install mgnipy
```


### Development installation

```bash
git clone https://github.com/EBI-Metagenomics/mgnipy.git
cd mgnipy
uv sync --all-groups  # or: pip install -e ".[dev,docs]"
```

## Quick Start

### Initialize and explore

```python
from mgnipy import MGnipy

# Create the main client, with default configuration
mg = MGnipy()

# See available endpoints
mg.list_resources()
```

### Query resources with filtering

```python
# Search studies by keyword
studies = mg.studies(
    search="disease"
)

# Can preview requests before fetching
studies.explain()

# Get page by page via .get(), fetching 3 pages
for _ in range(3):
    studies.get()

# or via .page(), getting another 3 pages
for i in range(4,7):
    studies.page(i)

# OR potentially all at once in large batches (also async option .abulk_fetch())
studies.bulk_fetch()
```
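
The async option mentioned above pays off when many pages are requested at once. The following is a minimal sketch of that pattern using only `asyncio`; `fetch_page` and `fetch_pages` are hypothetical stand-ins that simulate a paginated request, not MGni.py functions. Assuming `.abulk_fetch()` is a coroutine method, it would be awaited from inside a coroutine in the same way.

```python
import asyncio


async def fetch_page(page: int) -> dict:
    """Hypothetical stand-in for one paginated API request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"page": page, "items": [f"study-{page}-{i}" for i in range(2)]}


async def fetch_pages(pages: range) -> list[dict]:
    # gather() runs the requests concurrently and preserves input order
    return await asyncio.gather(*(fetch_page(p) for p in pages))


results = asyncio.run(fetch_pages(range(1, 4)))
page_numbers = [r["page"] for r in results]
```

Because the simulated requests overlap, the three pages take roughly as long as one, while the results still come back in page order.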

### Multiple output formats

```python
# As pandas DataFrame
pd_metadata = studies.to_df()

# As polars DataFrame
pl_metadata = studies.to_polars()

# As JSON
json_metadata = studies.to_json()
```

### Downloading the analyzed data

```python
# accessing the mgazine of datasets
mgazine = studies.datasets

# preview
print(mgazine)

# download file by file 
mgazine.download(to_dir="downloads_folder", alias="mgnify_file_alias.fasta.gz")

# or download all 
mgazine.download_all(to_dir="downloads_folder")
```
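
Downloaded files such as the `.fasta.gz` example above are gzip-compressed text, so a quick sanity check needs only the standard library. The sketch below counts FASTA records without decompressing to disk. `count_fasta_records` is a hypothetical helper, not part of MGni.py, and the demo writes a tiny file of its own so that it runs anywhere.

```python
import gzip
import tempfile
from pathlib import Path


def count_fasta_records(path: Path) -> int:
    """Count sequences in a gzip-compressed FASTA file (headers start with '>')."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return sum(1 for line in handle if line.startswith(">"))


# Build a small demo file; in practice, point this at a file in your download folder
demo = Path(tempfile.mkdtemp()) / "demo.fasta.gz"
with gzip.open(demo, "wt", encoding="utf-8") as handle:
    handle.write(">seq1\nACGT\n>seq2\nGGCC\nTTAA\n")

n_records = count_fasta_records(demo)
```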

## Additional Documentation

- [MGnify API Docs](https://www.ebi.ac.uk/metagenomics/api/v2)
- [openapi-python-client](https://github.com/openapi-generators/openapi-python-client)
- [package docs](https://mgnipy.mgnify.org/)

## Development

See [Contributing.md](Contributing.md).

## License

MGni.py is licensed under GPL-3.0-or-later; see the `LICENSE` file.

## Citation

TODO
