Metadata-Version: 2.4
Name: google-analytic-process
Version: 0.1.2
Summary: A simple API for running Google Analytics 4 reports
Home-page: https://github.com/yourusername/ga4-reporter
Author: Sang Pham
Author-email: sang@polaroid.com
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: google-analytics-data>=0.16.0
Requires-Dist: pandas>=1.3.0
Requires-Dist: numpy>=1.21.0
Requires-Dist: protobuf>=4.21.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=3.0; extra == "dev"
Requires-Dist: black>=22.0; extra == "dev"
Requires-Dist: flake8>=4.0; extra == "dev"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# GA4 Reporter

A small Python API for running Google Analytics 4 (GA4) reports. It extracts data from GA4 properties through the Google Analytics Data API and returns the results as pandas DataFrames.

## Features

- Simple API for running GA4 reports
- Returns data as pandas DataFrames
- Supports custom dimensions and metrics
- Flexible date range selection
- Both class-based and functional interfaces

## Installation

```bash
pip install google-analytic-process
```

## Prerequisites

1. A Google Analytics 4 property
2. A service account with access to your GA4 property
3. The credentials JSON file for that service account

### Setting up Google Analytics 4 API Access

1. Go to [Google Cloud Console](https://console.cloud.google.com/)
2. Create a new project or select an existing one
3. Enable the Google Analytics Data API
4. Create a service account and download the credentials JSON file
5. In GA4, add the service account email to your property with "Viewer" permissions
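
Step 5 needs the service account's email address, which is stored in the downloaded credentials file under the `client_email` key. The following is a minimal sketch for reading it; `read_service_account_email` is a hypothetical helper for illustration and is not part of this package.

```python
import json
from pathlib import Path


def read_service_account_email(credentials_path):
    """Return the client_email stored in a service account key file.

    Hypothetical helper, not part of ga4_reporter.
    """
    data = json.loads(Path(credentials_path).read_text(encoding="utf-8"))
    # Service account key files downloaded from Google Cloud carry this type marker.
    if data.get("type") != "service_account":
        raise ValueError("Not a service account key file")
    return data["client_email"]
```

The returned address is the one to add to the GA4 property with "Viewer" permissions.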

## Usage

### Using the convenience function (Recommended)

```python
from ga4_reporter import run_report
from datetime import datetime, timedelta

# Define date range
end_date = datetime.now().date()
start_date = end_date - timedelta(days=7)

# Define dimensions and metrics
dimensions = [
    "date",
    "sessionDefaultChannelGroup",
    "sessionCampaignName"
]

metrics = [
    "sessions",
    "transactions",
    "totalRevenue"
]

# Run the report
df = run_report(
    dimensions=dimensions,
    metrics=metrics,
    start_date=start_date,
    end_date=end_date,
    credentials_path="/path/to/credentials.json",
    property_id="276493948"
)

print(df.head())
```

### Using the GA4Reporter class

```python
from ga4_reporter import GA4Reporter
from datetime import datetime, timedelta

# Initialize the reporter
reporter = GA4Reporter(
    property_id="276493948",
    credentials_path="/path/to/credentials.json"
)

# Define parameters
dimensions = ["date", "deviceCategory"]
metrics = ["sessions", "totalUsers"]
start_date = datetime(2024, 1, 1)
end_date = datetime(2024, 1, 31)

# Run the report
df = reporter.run_report(
    dimensions=dimensions,
    metrics=metrics,
    start_date=start_date,
    end_date=end_date
)

print(df.head())
```

## API Reference

### `run_report()` function

Convenience function to run a GA4 report without instantiating the class.

**Parameters:**
- `dimensions` (List[str]): List of dimension names (e.g., `["date", "sessionDefaultChannelGroup"]`)
- `metrics` (List[str]): List of metric names (e.g., `["sessions", "totalRevenue"]`)
- `start_date` (str | date | datetime): Start date for the report
- `end_date` (str | date | datetime): End date for the report
- `credentials_path` (str): Path to the service account credentials JSON file
- `property_id` (str, optional): The GA4 property ID. Defaults to `"276493948"`
- `limit` (int, optional): Maximum number of rows to return. Defaults to `1000000`
- `offset` (int, optional): Number of rows to skip. Defaults to `0`

**Returns:**
- `pd.DataFrame`: DataFrame containing the report data
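
Because `start_date` and `end_date` accept three types, it can help to see how they might all reduce to the `YYYY-MM-DD` strings that the Google Analytics Data API uses for date ranges. The sketch below is an assumption about one way to do this, not the package's actual implementation; `to_ga4_date` is a hypothetical name.

```python
from datetime import date, datetime


def to_ga4_date(value):
    """Normalize a str, date, or datetime to a 'YYYY-MM-DD' string.

    Hypothetical helper, not part of ga4_reporter.
    """
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(value, datetime):
        return value.date().isoformat()
    if isinstance(value, date):
        return value.isoformat()
    if isinstance(value, str):
        # Parsing validates the format; anything else raises ValueError.
        return datetime.strptime(value, "%Y-%m-%d").date().isoformat()
    raise TypeError(f"Unsupported date type: {type(value).__name__}")
```

Note that this sketch rejects relative strings such as `"today"` or `"7daysAgo"`, which the underlying API also understands; whether `run_report()` passes those through is not documented here.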

### `GA4Reporter` class

#### `__init__(property_id, credentials_path)`

Initialize the GA4Reporter.

**Parameters:**
- `property_id` (str): The GA4 property ID
- `credentials_path` (str): Path to the service account credentials JSON file

#### `run_report(dimensions, metrics, start_date, end_date, limit=1000000, offset=0)`

Run a Google Analytics 4 report.

**Parameters:** Same as the `run_report()` function, except `credentials_path` and `property_id`, which are set during initialization.

**Returns:**
- `pd.DataFrame`: DataFrame containing the report data

## Available Dimensions and Metrics

For a complete list of available dimensions and metrics, refer to the [Google Analytics Data API documentation](https://developers.google.com/analytics/devguides/reporting/data/v1/api-schema).

### Common Dimensions
- `date`
- `sessionDefaultChannelGroup`
- `sessionCampaignName`
- `deviceCategory`
- `countryId`
- `pagePath`
- `eventName`

### Common Metrics
- `sessions`
- `totalUsers`
- `transactions`
- `totalRevenue`
- `engagementRate`
- `bounceRate`
- `screenPageViews`

## Examples

### Example 1: Basic Usage

```python
from ga4_reporter import run_report

df = run_report(
    dimensions=["date"],
    metrics=["sessions"],
    start_date="2024-01-01",
    end_date="2024-01-31",
    credentials_path="/path/to/credentials.json",
    property_id="YOUR_PROPERTY_ID"
)
```

### Example 2: Multiple Dimensions and Metrics

```python
from ga4_reporter import run_report
from datetime import datetime, timedelta

end_date = datetime.now().date()
start_date = end_date - timedelta(days=30)

dimensions = [
    "date",
    "sessionDefaultChannelGroup",
    "deviceCategory",
    "newVsReturning"
]

metrics = [
    "sessions",
    "transactions",
    "totalUsers",
    "totalRevenue",
    "engagementRate"
]

df = run_report(
    dimensions=dimensions,
    metrics=metrics,
    start_date=start_date,
    end_date=end_date,
    credentials_path="/path/to/credentials.json",
    property_id="YOUR_PROPERTY_ID"
)

# Process the data
print(f"Total sessions: {df['sessions'].sum()}")
print(f"Total revenue: ${df['totalRevenue'].sum():.2f}")
```

### Example 3: Pagination

```python
from ga4_reporter import GA4Reporter

reporter = GA4Reporter(
    property_id="YOUR_PROPERTY_ID",
    credentials_path="/path/to/credentials.json"
)

# Get first 10,000 rows
df_page1 = reporter.run_report(
    dimensions=["date", "pagePath"],
    metrics=["screenPageViews"],
    start_date="2024-01-01",
    end_date="2024-01-31",
    limit=10000,
    offset=0
)

# Get next 10,000 rows
df_page2 = reporter.run_report(
    dimensions=["date", "pagePath"],
    metrics=["screenPageViews"],
    start_date="2024-01-01",
    end_date="2024-01-31",
    limit=10000,
    offset=10000
)
```
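
When the total number of rows is unknown, the two calls above can be generalized into a loop that advances `offset` until a page comes back shorter than `limit`. The following is a minimal sketch under that assumption; `fetch_all_pages` and its `fetch_page` callback are hypothetical names and not part of this package.

```python
def fetch_all_pages(fetch_page, page_size=10000):
    """Collect pages by calling fetch_page(limit=..., offset=...) repeatedly.

    Hypothetical helper, not part of ga4_reporter. Stops when a page is
    empty or shorter than page_size, and returns the pages as a list.
    """
    pages = []
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        if len(page) == 0:
            break
        pages.append(page)
        if len(page) < page_size:
            # A short page means there is nothing left to fetch.
            break
        offset += page_size
    return pages


# Possible usage with the reporter from the example above (not executed here):
# pages = fetch_all_pages(
#     lambda limit, offset: reporter.run_report(
#         dimensions=["date", "pagePath"],
#         metrics=["screenPageViews"],
#         start_date="2024-01-01",
#         end_date="2024-01-31",
#         limit=limit,
#         offset=offset,
#     )
# )
# df = pd.concat(pages, ignore_index=True)  # requires `import pandas as pd`
```

The helper only relies on `len()` of each page, so it works with DataFrames as well as plain lists.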

## Requirements

- Python >= 3.7
- pandas >= 1.3.0
- numpy >= 1.21.0
- google-analytics-data >= 0.16.0
- protobuf >= 4.21.0

## License

MIT License

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Support

For issues and questions, please use the [GitHub issue tracker](https://github.com/yourusername/ga4-reporter/issues).

## Changelog

### 0.1.0 (2024-01-XX)
- Initial release
- Basic GA4 reporting functionality
- Support for custom dimensions and metrics
- Pandas DataFrame output
