Metadata-Version: 2.4
Name: safari_sdk
Version: 2.122.0
Summary: Safari SDK: the SDK for Google DeepMind Gemini Robotics models
Author-Email: Safari SDKAuthors <safari-sdk-authors@google.com>
License-Expression: Apache-2.0
License-File: LICENSE
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Intended Audience :: Science/Research
Project-URL: homepage, https://deepmind.google/models/gemini-robotics/
Project-URL: repository, https://github.com/google-deepmind/gemini-robotics-sdk
Requires-Python: <3.14,>=3.11
Requires-Dist: absl-py
Requires-Dist: dataclasses_json
Requires-Dist: dm-env
Requires-Dist: evdev; sys_platform != "win32"
Requires-Dist: fastapi
Requires-Dist: gdm-robotics
Requires-Dist: google-api-python-client
Requires-Dist: google-auth-httplib2
Requires-Dist: google-auth-oauthlib
Requires-Dist: google-genai>=1.70.0
Requires-Dist: grpcio
Requires-Dist: imageio
Requires-Dist: immutabledict
Requires-Dist: jax
Requires-Dist: lark~=1.2
Requires-Dist: mcap-protobuf-support
Requires-Dist: mediapy
Requires-Dist: mujoco>=3.3.6
Requires-Dist: opencv-python
Requires-Dist: overrides
Requires-Dist: protobuf<7
Requires-Dist: psutil
Requires-Dist: python-magic
Requires-Dist: pytz
Requires-Dist: scipy
Requires-Dist: tensorflow
Requires-Dist: uvicorn
Requires-Dist: watchdog
Provides-Extra: dev
Requires-Dist: immutabledict; extra == "dev"
Requires-Dist: parameterized; extra == "dev"
Requires-Dist: pyink; extra == "dev"
Requires-Dist: pylint>=2.6.0; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-asyncio; extra == "dev"
Requires-Dist: pytest-xdist; extra == "dev"
Provides-Extra: cc-logging
Requires-Dist: safari-sdk-logging; extra == "cc-logging"
Description-Content-Type: text/markdown

# Safari SDK: the SDK for Google DeepMind Gemini Robotics models 🦓🦄🐘🐒🐍

## Disclaimer

This is not an officially supported Google product.

Safari SDK provides full lifecycle toolings necessary for using Gemini Robotics
models, including but not limited to, access checkpoint, serving a model,
evaluate the model on robot and in sim, upload data, finetuning the model,
download the finetuned checkpoint, etc. Most of the functionality requires you
to join Gemini Robotics Trusted Tester Program to use. See details in Gemini
Robotics [main page](https://deepmind.google/models/gemini-robotics/).

## Source Code

The source code can be found in
[GitHub](https://github.com/google-deepmind/gemini-robotics-sdk).

Note: Unless stated otherwise, the commands below assume that your working
directory is the root of your repository clone (ie, the same directory which
contains the `pyproject.toml` file).

### Code Structure

The safari-sdk is structured and distributed as a Python pip package, along with
1 internal dependency:

-   safari-sdk: This is the root level package, and the only one that users
    should directly install or use. It is a pure-python package.
-   safari-sdk-logging: This is an internal dependency which contains the C++
    and pybind11 logging code.

Both packages place their importable files into the safari_sdk top-level
namespace, according to
[PEP 420 – Implicit Namespace Packages](https://peps.python.org/pep-0420/)

### Why Separate Packages?

The safari-sdk package is often installed from source code, for development and
debugging. However, C++ build dependencies take a long time (30+ minutes) to
compile from source. Putting these in separate packages allows prebuilt versions
of those to be used when the (pure-python) base safari-sdk package is installed
from source, which in turn reduces the time needed to build and install that
package to a few seconds (at least in cases where there are no changes to the
C++ logging code).

## Installing the SDK

### Installing Pip Dependencies

The pyproject.toml file specifies the dependencies for the Safari SDK, including
allowed version ranges for some packages. The requirements.txt file specifies
exact versions and hashes of each of those dependencies; internal unit tests use
these versions, and installing them in a virtual environment before installing
the SDK is recommended to avoid failures due to upstream package changes:

```shell
pip install -r requirements.txt
```

### Updating Pip Dependencies

A script is provided which updates the requirements.txt file to the most recent
versions of dependencies which satisfy the constraints in the pyproject.toml
file. This allows those updates to be done at discrete times (rather than
whenever those updates are pushed to PyPi) and tested in isolation before being
pushed to all users. To run that script:

```shell
scripts/update_pip_dependencies.sh
```

### Installing from PyPi

The Safari SDK can be easily installed via PyPI.

```shell
pip install safari-sdk
```

### Installing from Source Code

The Safari SDK can also be installed from the source code.

```shell
pip install -e .[dev]
```

## Building the Wheel

To build a Python wheel, run the following command from the root of the
repository.

```shell
scripts/build_wheel.sh
```

This script will build a pip installable wheel for the Safari SDK, and print the
file's path to stdout.

## Model support

Safari SDK aims to support all models in the Gemini Robotics model series.

Trusted Testers can access the Gemini Robotics On Device model from SDK v2.4.1.

## Libraries

Libraries related to robot data logging is in `safari_sdk/logging`.

Libraries related to model inference and interface with model servers are in
`safari_sdk/model`.

Libraries and binary related to accessing model checkpoints, upload data and
request of model finetune can be found in `safari_sdk/flywheel`.

Examples, including robot and simulation evaluation of models are in
`examples/`. Aloha specific eval code are in `examples/aloha`.

## Flywheel CLI

The flywheel CLI is a convenient CLI tool available after installation of the
pip package. It provides a set of commands to interact with the Gemini Robotics
platform, such as training models, serving models, managing data, and
downloading artifacts.

To use the CLI

```
flywheel-cli <command> [--flags] [--flags]
```

Supported commands are:

*   `train`: Train a model. Requires specifying task ID, start date, and end
    date.
*   `serve`: Serve a model. Requires specifying the training job ID.
*   `list`: List available training jobs.
*   `list_serve`: List available serving jobs.
*   `data_stats`: Show data statistics available for training.
*   `download`: Download artifacts from a training job or a specific artifact
    ID.
*   `upload_data`: Upload data to the data ingestion service.
*   `version`: Show the version of the SDK.
*   `help`: Show this help message with all the available commands and flags.

## Agent

The Safari SDK includes a comprehensive agent framework for building interactive
robotics agents powered by Gemini models. See
[YouTube Video: Gemini Robotics 1.5: Using agentic capabilities](https://youtu.be/AMRxbIO04kQ?si=UFILQ9IOgfw7RTus).
The framework is located in `safari_sdk/agent/framework` and provides a modular
architecture for creating agents that can perceive their environment, reason
about tasks, and control robot hardware.

### Key Components

**Agents** (`safari_sdk/agent/framework/agents/`): Base agent classes that integrate
with the Gemini Live API to provide conversational interaction and tool use
capabilities.

**Embodiments** (`safari_sdk/agent/framework/embodiments/`): Hardware-specific
interfaces that connect agents to physical robot systems (e.g., Aloha robot).
Each embodiment provides tools for robot control.

**Tools** (`safari_sdk/agent/framework/tools/`): Modular capabilities that agents
can use, including:

*   Run instruction
*   Success detection
*   Scene description
*   etc.

**Event Bus** (`safari_sdk/agent/framework/event_bus/`): Asynchronous
publish-subscribe system for communication between agent components.

**Configuration** (`safari_sdk/agent/framework/config.py`): Centralized
configuration management using `AgentFrameworkConfig`, supporting both
programmatic configuration and flag-based setup.

### Aloha Agent Example

The `examples/aloha/agent/` directory contains agent implementations for the
Aloha robot platform.

The primary example is `simple_agent.py`, which provides a conversational agent
that can control the Aloha robot using natural language instructions.

To run the Aloha agent, use the provided `run.py` script:

```shell
python examples/aloha/agent/run.py --agent_name=simple_agent
```

The Aloha agent demonstrates integration of vision-based robot control,
multi-camera perception, and conversational interaction with Gemini models.

Alternatively, you build your own agent using the agent framework.

The codebase is still in active development. We will update our most updated
user guide with Trusted Testers of Gemini Robotics.
