Metadata-Version: 2.2
Name: gmicloud
Version: 0.1.1
Summary: GMI Cloud Python SDK
Author-email: GMI <gmi@gmitec.net>
License: MIT
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown

# GMICloud SDK

## Overview

The GMI Inference Engine SDK provides a Python interface for deploying and managing machine learning models in
production environments. It allows users to create model artifacts, schedule tasks for serving models, and call
inference APIs easily.

This SDK streamlines the process of utilizing GMI Cloud capabilities such as deploying models with Kubernetes-based Ray
services, managing resources automatically, and accessing model inference endpoints. With minimal setup, developers can
focus on building ML solutions instead of infrastructure.

## Features

- Artifact Management: Easily create, update, and manage ML model artifacts.
- Task Management: Quickly create, schedule, and manage deployment tasks for model inference.
- Usage Data Retrieval: Fetch and analyze usage data to optimize resource allocation.

## Installation

To install the SDK, use pip:

```bash
pip install gmicloud
```

## Setup

You must configure authentication credentials for accessing the GMI Cloud API. There are two ways to configure the SDK:

### Option 1: Using Environment Variables

Set the following environment variables:

```shell
export GMI_CLOUD_CLIENT_ID=<YOUR_CLIENT_ID>
export GMI_CLOUD_EMAIL=<YOUR_EMAIL>
export GMI_CLOUD_PASSWORD=<YOUR_PASSWORD>
```
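When relying on environment variables, it can help to confirm that all three are set before constructing the client. The following is a minimal sketch, not part of the SDK: the helper name `missing_credentials` is hypothetical, and only the variable names come from the list above.

```python
import os

# Variable names taken from the Setup section above.
REQUIRED_ENV_VARS = ("GMI_CLOUD_CLIENT_ID", "GMI_CLOUD_EMAIL", "GMI_CLOUD_PASSWORD")


def missing_credentials(environ=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for troubleshooting; it does not call the SDK.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All GMI Cloud credentials are set.")
```

Running this before `Client()` narrows down authentication failures to either a missing variable or an incorrect value.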

### Option 2: Passing Credentials as Parameters

Pass `client_id`, `email`, and `password` directly to the Client object when initializing it in your script:

```python
from gmicloud import Client

client = Client(client_id="<YOUR_CLIENT_ID>", email="<YOUR_EMAIL>", password="<YOUR_PASSWORD>")
```

## Quick Start

### 1. Create a Task from an Artifact Template

The simplest way to deploy a model is to create a task directly from an existing artifact template:

```python
from datetime import datetime
from gmicloud import Client, TaskScheduling, OneOffScheduling
from examples.completion import call_chat_completion

# Initialize the client
client = Client()

# Schedule and start a task from an artifact template
task = client.create_task_from_artifact_template(
    "llama31_8b_template_001",
    TaskScheduling(
        scheduling_oneoff=OneOffScheduling(
            trigger_timestamp=int(datetime.now().timestamp()) + 60,  # Delay by 1 min
            min_replicas=1,
            max_replicas=10,
        )
    )
)

# Make a chat completion request via the task endpoint
response = call_chat_completion(client, task.task_id)
print(response)
```
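The `trigger_timestamp` in the example above is computed as the current Unix time in whole seconds plus a delay. A small helper can make that intent explicit. This is a sketch only: `trigger_in` is a hypothetical name, not an SDK function, and it assumes the scheduler expects Unix seconds, as the example's arithmetic suggests.

```python
from datetime import datetime, timezone
from typing import Optional


def trigger_in(delay_seconds: int, now: Optional[datetime] = None) -> int:
    """Return a Unix timestamp (whole seconds) `delay_seconds` after `now`.

    Hypothetical helper; `now` defaults to the current time and exists
    mainly so the function can be tested with a fixed clock.
    """
    if delay_seconds < 0:
        raise ValueError("delay_seconds must be non-negative")
    base = now if now is not None else datetime.now(timezone.utc)
    return int(base.timestamp()) + delay_seconds
```

With this helper, the scheduling argument would read `trigger_timestamp=trigger_in(60)`.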

### 2. Step-by-Step Example: Create Artifact, Task, and Query the Endpoint

#### (a) Create an Artifact from a Template

First, you’ll retrieve all templates and create an artifact based on the desired template (e.g., "Llama3.1 8B"):

```python
def create_artifact_from_template(client):
    artifact_manager = client.artifact_manager

    # List all available templates
    templates = artifact_manager.get_artifact_templates()
    for template in templates:
        if template.artifact_name == "Llama3.1 8B":
            return artifact_manager.create_artifact_from_template(
                artifact_template_id=template.artifact_template_id
            )
    return None
```
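The function above returns `None` silently when no template matches. One possible variation is to separate the lookup into a helper that fails with a descriptive error instead. The sketch below is illustrative: `find_template` is a hypothetical name, the template IDs are made up, and `SimpleNamespace` objects stand in for the templates returned by `get_artifact_templates()`.

```python
from types import SimpleNamespace


def find_template(templates, name):
    """Return the first template in the list whose artifact_name equals `name`."""
    for template in templates:
        if template.artifact_name == name:
            return template
    # List what is available so a typo in the name is easy to spot.
    available = sorted(t.artifact_name for t in templates)
    raise LookupError(f"No template named {name!r}; available: {available}")


# Stand-in objects for illustration; real templates would come from
# artifact_manager.get_artifact_templates().
demo = [
    SimpleNamespace(artifact_name="Llama3.1 8B", artifact_template_id="tpl-1"),
    SimpleNamespace(artifact_name="Other Model", artifact_template_id="tpl-2"),
]
print(find_template(demo, "Llama3.1 8B").artifact_template_id)
```

The returned template's `artifact_template_id` can then be passed to `create_artifact_from_template` as in the function above.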

#### (b) Create a Task from the Artifact

Wait until the artifact build succeeds (`build_status == "SUCCESS"`), then deploy it using task scheduling:

```python
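import time
from datetime import datetime

# Assumes these model classes are exported from the top-level gmicloud package,
# as TaskScheduling and OneOffScheduling are in the Quick Start example.
from gmicloud import (
    OneOffScheduling,
    RayTaskConfig,
    ReplicaResource,
    Task,
    TaskConfig,
    TaskScheduling,
)


def create_task_and_start(client, artifact_id):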
    artifact_manager = client.artifact_manager

    # Wait until the artifact is ready
    while True:
        artifact = artifact_manager.get_artifact(artifact_id)
        if artifact.build_status == "SUCCESS":
            break
        print("Waiting for artifact to be ready...")
        time.sleep(2)

    # Configure and start the task
    task_manager = client.task_manager
    task = task_manager.create_task(Task(
        config=TaskConfig(
            ray_task_config=RayTaskConfig(
                ray_version="latest-py311-gpu",
                file_path="serve",
                artifact_id=artifact_id,
                deployment_name="app",
                replica_resource=ReplicaResource(
                    cpu=24,
                    ram_gb=128,
                    gpu=2,
                ),
            ),
            task_scheduling=TaskScheduling(
                scheduling_oneoff=OneOffScheduling(
                    trigger_timestamp=int(datetime.now().timestamp()) + 60,
                    min_replicas=1,
                    max_replicas=10,
                )
            ),
        ),
    ))

    task_manager.start_task(task.task_id)
    return task.task_id
```
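The `while True` loop above never gives up if the build fails or stalls. A bounded polling helper is one way to guard against that. This is a generic sketch rather than SDK functionality: `wait_until` and its parameters are hypothetical names, and the default timeout is an arbitrary choice.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    timeout_s: float = 600.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch` repeatedly until `is_done(value)` is true, then return value.

    Raises TimeoutError if the condition is not met within `timeout_s` seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        value = fetch()
        if is_done(value):
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout_s} seconds")
        sleep(interval_s)
```

Applied to the artifact wait above, the loop would become `wait_until(lambda: artifact_manager.get_artifact(artifact_id), lambda a: a.build_status == "SUCCESS")`.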

#### (c) Query the Model Endpoint

Once the task is ready, use the endpoint for inference:

```python
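from gmicloud import Client
from examples.completion import call_chat_completion

client = Client()
artifact_id = create_artifact_from_template(client)
if artifact_id is None:
    # create_artifact_from_template returns None when no template matches
    raise RuntimeError('No "Llama3.1 8B" artifact template was found')
task_id = create_task_and_start(client, artifact_id)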

response = call_chat_completion(client, task_id)
print(response)
```

## API Reference

### Client

Represents the entry point to interact with GMI Cloud APIs.

`Client(client_id: Optional[str] = "", email: Optional[str] = "", password: Optional[str] = "")`

### Artifact Management

* `get_artifact_templates()`: Fetch a list of available artifact templates.
* `create_artifact_from_template(artifact_template_id: str)`: Create a model artifact from a given template.
* `get_artifact(artifact_id: str)`: Get details of a specific artifact.

### Task Management

* `create_task_from_artifact_template(template_id: str, scheduling: TaskScheduling)`: Create and schedule a task using an
  artifact template.
* `start_task(task_id: str)`: Start a task.
* `get_task(task_id: str)`: Retrieve the status and details of a specific task.

## Notes & Troubleshooting

- Ensure Credentials are Correct: Double-check your environment variables or parameters passed into the Client object.
- Artifact Status: It may take a few minutes for an artifact or task to transition to the "ready" state.
- Inference Endpoint Readiness: Use the task endpoint only after the task status changes to "ready".
- Default OpenAI Key: By default, the OpenAI API base URL is derived from the endpoint provided by GMI.

## Contributing

We welcome contributions to enhance the SDK. Please follow these steps:

1. Fork the repository.
2. Create a new branch for your feature or bugfix.
3. Commit changes with clear messages.
4. Submit a pull request for review.
