Metadata-Version: 2.4
Name: eustatspy
Version: 1.0.0
Summary: A Python wrapper for Eurostat APIs
Author-email: Emanuel Raptis <emanuelraptis@gmail.com>
License: MIT License
        
        Copyright (c) 2025 Emanuel Raptis
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
Project-URL: Homepage, https://github.com/xemarap/eustatspy
Project-URL: Documentation, https://github.com/xemarap/eustatspy/blob/main/README.md
Project-URL: Repository, https://github.com/xemarap/eustatspy
Project-URL: Bug Tracker, https://github.com/xemarap/eustatspy/issues
Keywords: eurostat,statistics,api,data,economics
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE.md
Requires-Dist: requests>=2.25.0
Requires-Dist: pandas>=1.3.0
Provides-Extra: dev
Requires-Dist: pytest>=6.0.0; extra == "dev"
Dynamic: license-file


<picture align="center">
  <source media="(prefers-color-scheme: dark)" srcset="images/eustatspy_dark.png">
  <img alt="Logo" src="images/eustatspy_light.png">
</picture>

# EuStatsPy

A Python wrapper for the Eurostat APIs, providing easy access to European statistical data.

[![Python Versions](https://img.shields.io/badge/python-3.7%20%7C%203.8%20%7C%203.9%20%7C%203.10%20%7C%203.11%20%7C%203.12%20%7C%203.13-blue)](#)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE.md)

**Note:** This is an independent project and is not associated with Eurostat.

## Installation

```bash
pip install eustatspy
```

## Requirements
EuStatsPy requires Python 3.7+ and the following dependencies:

- requests (≥2.25.0) - HTTP library for API communication
- pandas (≥1.3.0) - Data manipulation and analysis

These dependencies are automatically installed when you install EuStatsPy.


## Quick Start

```python
import eustatspy as est

# Initialize the client with caching enabled
client = est.EurostatClient(cache_enabled=True)

# Pre-load metabase for optimal performance (one-time cost)
client.preload_metabase()
```

## Core Functionality

EuStatsPy provides four essential functions for working with Eurostat data:

### 1. Browse Database Structure

Navigate through the Eurostat database hierarchy to discover datasets:

```python
# Start at the root to see main themes
client.browse_database()

# Explore specific themes
client.browse_database('general')  # General statistics
client.browse_database('euroind')  # European indicators
```

### 2. Search Datasets

Find datasets by keyword, with optional date filtering:

```python
# Basic search
results = client.search_datasets("GDP")

# Search with date filter (datasets updated since specific date)
recent_data = client.search_datasets(
    query="unemployment", 
    updated_since="2025-06-01",
    max_results=20
)

# List every dataset updated since a given date (pass today's date for today's updates)
today_updates = client.search_datasets(
    query="",  # Empty query = all datasets
    updated_since="2025-07-09",
    max_results=100
)
```
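
Hard-coded dates like the ones above go stale quickly. As a minimal sketch using only the standard library, the date string can be computed relative to today instead. `iso_days_ago` is a hypothetical helper, not part of EuStatsPy, and it assumes `updated_since` accepts the same `YYYY-MM-DD` format shown in the examples above.

```python
from datetime import date, timedelta
from typing import Optional


def iso_days_ago(days: int, today: Optional[date] = None) -> str:
    """Return the date `days` days before `today` as a 'YYYY-MM-DD' string."""
    if days < 0:
        raise ValueError("days must be non-negative")
    reference = today or date.today()
    return (reference - timedelta(days=days)).isoformat()


# A fixed reference date keeps this example reproducible
print(iso_days_ago(7, today=date(2025, 7, 9)))  # 2025-07-02
```

The result could then be passed along the lines of `client.search_datasets(query="", updated_since=iso_days_ago(0))`.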

### 3. Explore Dataset Details

Understand dataset dimensions and available filters:

```python
# Get comprehensive dataset information
client.describe_dataset("lfst_r_lfsd2pop")

# See all values for a specific dimension
client.describe_dataset("nama_10_gdp", show_all_for_dimension='geo')

# Limit displayed values per dimension
client.describe_dataset("ei_isbr_m", max_values_per_dimension=5)
```

### 4. Get Data as DataFrame

Retrieve data with flexible filtering options:

#### Geographic Filtering
```python
# Single country
df = client.get_data_as_dataframe('nama_10_gdp', geo='SE')

# Multiple countries
df = client.get_data_as_dataframe(
    'lfst_r_lfsd2pop',
    geo=['SE11', 'DK01']
)

# By geographic level
df = client.get_data_as_dataframe(
    'tour_occ_nin2m',
    geoLevel='country',
    lastTimePeriod=1
)
```

#### Time Filtering
```python
# Specific years
df = client.get_data_as_dataframe(
    'lfst_r_lfsd2pop',
    geo='SE',
    time=['2022', '2023']
)

# Latest periods
df = client.get_data_as_dataframe(
    'nama_10_gdp',
    geo='SE',
    lastTimePeriod=5
)

# Time ranges
df = client.get_data_as_dataframe(
    'ei_isbr_m',
    geo='SE',
    sinceTimePeriod='2024-01',
    untilTimePeriod='2024-12'
)

# Data from specific time
df = client.get_data_as_dataframe(
    'nama_10_gdp',
    geo='SE',
    sinceTimePeriod='2020'
)
```

#### Multi-dimensional Filtering
```python
# Complex filtering with multiple dimensions
df = client.get_data_as_dataframe(
    'lfst_r_lfsd2pop',
    geo=['SE11', 'SE12'],
    age='Y25-64',
    isced11=['ED0-2', 'ED3_4', 'ED5-8'],
    sex=['M', 'F'],
    lastTimePeriod=3
)
```

## Common Filter Parameters

- **geo**: Geographic areas - `'SE'`, `['SE', 'DK']`, or `'all'`
- **time**: Time periods - `'2020'`, `['2020', '2021']`, `'2020-Q1'`
- **geoLevel**: Geographic level - `'country'`, `'nuts1'`, `'nuts2'`, `'nuts3'`, `'city'`, `'aggregate'`
- **lastTimePeriod**: Number of latest periods - `1`, `5`, `10`
- **sinceTimePeriod**: Start period - `'2020'`, `'2020-Q1'`, `'2020-01'`
- **untilTimePeriod**: End period - `'2023'`, `'2023-Q4'`, `'2023-12'`

Plus dataset-specific dimensions like `unit`, `na_item`, `sex`, `age` etc.
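
The period strings above follow a regular pattern, so they can be generated rather than typed out. The helpers below are a small sketch, not part of EuStatsPy: `quarter_labels` and `month_labels` are hypothetical names, and passing a list of quarterly or monthly labels to `time` is an assumption based on the list form shown for annual values.

```python
from typing import List


def quarter_labels(start_year: int, end_year: int) -> List[str]:
    """Quarterly labels such as '2020-Q1' for every year in the inclusive range."""
    return [
        f"{year}-Q{quarter}"
        for year in range(start_year, end_year + 1)
        for quarter in range(1, 5)
    ]


def month_labels(year: int) -> List[str]:
    """Monthly labels such as '2024-01' for a single year."""
    return [f"{year}-{month:02d}" for month in range(1, 13)]


print(quarter_labels(2020, 2020))  # ['2020-Q1', '2020-Q2', '2020-Q3', '2020-Q4']
print(month_labels(2024)[:3])      # ['2024-01', '2024-02', '2024-03']
```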

## Performance Tips

1. **Enable caching** for faster repeated queries:
   ```python
   client = est.EurostatClient(cache_enabled=True)
   ```

2. **Pre-load metabase** for instant dataset exploration:
   ```python
   client.preload_metabase()  # One-time cost
   # Later describe_dataset() calls reuse the loaded metabase
   ```

3. **Use specific filters** to reduce data size:
   ```python
   # Instead of getting all data
   df = client.get_data_as_dataframe('nama_10_gdp')
   
   # Filter to what you need
   df = client.get_data_as_dataframe(
       'nama_10_gdp',
       geo='SE',
       unit='CP_MEUR',
       lastTimePeriod=5
   )
   ```
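
To check whether these tips pay off for a particular workload, it can help to time calls directly. The sketch below uses only the standard library; `timed` is a hypothetical helper, not part of EuStatsPy, and the summed range is a stand-in for a real client call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("example", timings):
    sum(range(100_000))  # stand-in for a call such as client.get_data_as_dataframe(...)

print(f"example took {timings['example']:.4f}s")
```

Wrapping the same query twice, once before and once after the cache is warm, gives a rough comparison of cached and uncached calls.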

## Data Frequency Examples

- **Annual data**: `nama_10_gdp` (GDP), `lfst_r_lfsd2pop` (Population)
- **Quarterly data**: `tipsbp53` (Balance of payments), `namq_10_gdp` (GDP quarterly)
- **Monthly data**: `ei_isbr_m` (Industrial production), `tour_occ_nin2m` (Tourism)

## Error Handling

```python
try:
    df = client.get_data_as_dataframe('invalid_dataset')
except est.DatasetNotFoundError:
    print("Dataset not found")
except est.InvalidParameterError as e:
    print(f"Invalid parameters: {e}")
except est.EurostatAPIError as e:
    print(f"API error: {e}")
```
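
Some failures may be transient (for example, a network hiccup), in which case retrying with a growing delay can help. The following is a generic sketch, not EuStatsPy functionality: `with_retries` is a hypothetical helper, and the demonstration uses a stand-in function rather than a real API call.

```python
import time


def with_retries(fetch, retry_on=(Exception,), attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() and retry on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Stand-in function that fails twice, then succeeds
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []  # record the delays instead of actually sleeping
print(with_retries(flaky, retry_on=(ConnectionError,), sleep=delays.append))  # ok
print(delays)  # [1.0, 2.0]
```

With EuStatsPy this might be called as `with_retries(lambda: client.get_data_as_dataframe('nama_10_gdp', geo='SE'), retry_on=(est.EurostatAPIError,))`. Whether `EurostatAPIError` signals a transient failure is an assumption; a missing dataset or an invalid parameter will not be fixed by retrying.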

## Cache Management

```python
# Clear cache when needed
client.clear_cache()

# Check if metabase is loaded
if not client.is_metabase_loaded():
    client.preload_metabase()
```

## Example Workflow

```python
import eustatspy as est

# 1. Initialize and setup
client = est.EurostatClient(cache_enabled=True)
client.preload_metabase()

# 2. Discover data
client.browse_database()  # Explore themes
results = client.search_datasets("employment")  # Find datasets

# 3. Understand dataset
client.describe_dataset("lfst_r_lfsd2pop")  # Explore dimensions

# 4. Get data
df = client.get_data_as_dataframe(
    'lfst_r_lfsd2pop',
    geo=['SE11', 'DK01'],
    age='Y25-64',
    lastTimePeriod=5
)

print(df.head())
```

## Language Support

This package returns all data and metadata in English only, for consistency and simplicity.

## Contributing

Contributions are welcome! Please see our [contributing guidelines](CONTRIBUTING.md) for details.

## License

This project is licensed under the MIT License; see the [LICENSE](LICENSE.md) file for details.

## Dependency Licenses
EuStatsPy depends on the following packages:

**Runtime Dependencies:**
- requests
- pandas

**Development/Testing Dependencies (not distributed):**
- pytest
- pytest-cov
- pytest-mock

All dependency licenses are available in the `LICENSES/` directory.

## Acknowledgments

- Data provided by [Eurostat](https://ec.europa.eu/eurostat)
- Built using the official Eurostat APIs
