Metadata-Version: 2.4
Name: medgemma
Version: 0.1.1
Summary: Medical AI on Apple Silicon – MedGemma 1.5 4B via MLX
Author: chiboko
License-Expression: MIT
Keywords: ai,apple-silicon,medgemma,medical,mlx
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Healthcare Industry
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: MacOS
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: click>=8.0
Requires-Dist: mlx-vlm>=0.3.10
Provides-Extra: dev
Requires-Dist: pytest-mock; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Description-Content-Type: text/markdown

# medgemma

**Medical AI on Apple Silicon — MedGemma 1.5 4B via MLX**

[![PyPI version](https://img.shields.io/pypi/v/medgemma)](https://pypi.org/project/medgemma/)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue)](https://python.org)
[![Apple Silicon](https://img.shields.io/badge/platform-Apple%20Silicon-black?logo=apple)](https://support.apple.com/en-us/116943)
[![License: MIT](https://img.shields.io/badge/license-MIT-green)](LICENSE)

> [!WARNING]
> **Medical Disclaimer** — This tool is for **informational and educational purposes only**. It is NOT a substitute for professional medical advice, diagnosis, or treatment. Always consult a qualified healthcare provider for medical decisions. Never disregard professional medical advice because of something generated by this tool.

## What is MedGemma?

The `medgemma` package is a command-line tool and Python library that runs Google's [MedGemma 1.5 4B](https://huggingface.co/google/medgemma-4b-it) medical AI model locally on your Mac. It uses Apple's [MLX framework](https://ml-explore.github.io/mlx/) to run inference entirely on your Apple Silicon GPU, with no cloud API and no data leaving your machine. You can ask medical questions and analyze medical images from your terminal or from Python.

## Requirements

- **Apple Silicon Mac** (M1, M2, M3, or M4)
- **Python 3.10** or newer
- **~4 GB disk space** for the quantized model weights
- **macOS** (the only supported platform)

## Quick Start

### 1. Install

```bash
pip install medgemma
```

Or with [uv](https://docs.astral.sh/uv/):

```bash
uv pip install medgemma
```

### 2. Hugging Face authentication

The model weights are hosted on Hugging Face under [google/medgemma-4b-it](https://huggingface.co/google/medgemma-4b-it). Before downloading, you need to:

1. Create a [Hugging Face account](https://huggingface.co/join) (free)
2. Visit the [model page](https://huggingface.co/google/medgemma-4b-it) and accept Google's license agreement
3. Log in locally:

```bash
pip install huggingface-hub
huggingface-cli login
```

You only need to do this once.

### 3. Download the model

```bash
medgemma setup
```

This downloads the MedGemma 4B model from Hugging Face, converts it to 4-bit quantized MLX format, and caches it at `~/.medgemma/model`. You only need to do this once.

### 4. Ask a question

```bash
medgemma ask "What are the common symptoms of type 2 diabetes?"
```

Example output:

```
The common symptoms of type 2 diabetes include:

*   **Increased thirst (polydipsia):** You may feel thirsty more often than usual.
*   **Frequent urination (polyuria):** You may need to urinate more often,
    especially at night.
*   **Increased hunger (polyphagia):** You may feel hungry even after eating.
*   **Unexplained weight loss:** You may lose weight without trying.
*   **Fatigue:** You may feel tired and lacking energy.
*   **Blurred vision:** High blood sugar can affect the lenses of your eyes.
*   **Slow-healing sores or frequent infections:** High blood sugar can impair
    your body's ability to heal.
*   **Numbness or tingling in hands or feet:** This can be a sign of nerve
    damage (neuropathy).
*   **Areas of darkened skin:** Particularly in the armpits and neck
    (acanthosis nigricans).

It is important to note that many people with type 2 diabetes may not experience
any symptoms in the early stages. Regular check-ups and blood sugar screenings
are recommended, especially if you have risk factors.

**Disclaimer:** I am an AI assistant and cannot provide medical advice. Please
consult a healthcare professional for diagnosis and treatment.
```

## Image Analysis

Analyze medical images by passing `--image`:

```bash
medgemma ask "Describe this chest X-ray" --image chest_xray.png
```

Example output:

```
The chest X-ray shows the following findings:

*   **Heart size:** The heart appears to be within normal limits in size.
*   **Lungs:** The lung fields appear clear, without any obvious consolidation,
    effusion, or pneumothorax.
*   **Mediastinum:** The mediastinal contours appear normal.
*   **Bones:** No acute bony abnormalities are identified.

**Overall impression:** The chest X-ray appears unremarkable, with no acute
cardiopulmonary abnormality identified.

**Disclaimer:** I am an AI and this is not a radiological report. Please
consult a qualified radiologist for proper interpretation.
```

## CLI Reference

### `medgemma ask`

Send a prompt (and optional image) to the model.

```bash
medgemma ask PROMPT [OPTIONS]
```

| Option | Description |
|---|---|
| `--image PATH` | Path to an image file to analyze |
| `--max-tokens INT` | Maximum tokens to generate (default: 512) |
| `--temperature FLOAT` | Sampling temperature (default: 0.1) |
| `--model-path PATH` | Path to a local MLX model directory |
| `--json` | Output full response as JSON with stats |
| `--no-stream` | Disable streaming, print all at once |

### `medgemma setup`

Download and prepare the model.

```bash
medgemma setup [OPTIONS]
```

| Option | Description |
|---|---|
| `--local-path PATH` | Use an already-converted local model instead of downloading |
| `--force` | Re-download and overwrite existing cached model |

### `medgemma info`

Show model status and cache location.

```bash
medgemma info
```

Example output:

```
Cache directory: /Users/you/.medgemma/model
Model in cache:  yes
Model loaded:    no
```

### `medgemma --version`

Print the installed version.

## Python API

### Basic usage

```python
from medgemma import MedGemma

mg = MedGemma()
response = mg.ask("What are symptoms of diabetes?")
print(response.text)
```

### Image analysis

```python
response = mg.ask("Describe this X-ray", image="chest_xray.png")
print(response.text)
```

### Streaming

```python
for chunk in mg.stream("Explain hypertension"):
    print(chunk, end="", flush=True)
```
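
If you also want the complete text once streaming finishes, you can collect the chunks as you print them. This is a sketch that assumes each chunk is a plain `str`, as the `print` call above suggests; `print_and_collect` is a hypothetical helper, and a stand-in iterator replaces `mg.stream(...)` so the example runs without the model.

```python
from typing import Iterable


def print_and_collect(chunks: Iterable[str]) -> str:
    """Print each chunk as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # final newline once the stream ends
    return "".join(parts)


# Stand-in for mg.stream("Explain hypertension") so the sketch is runnable anywhere.
fake_stream = iter(["Hypertension ", "is ", "high blood pressure."])
full_text = print_and_collect(fake_stream)
```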

### Response object

`MedGemma.ask()` returns a `Response` dataclass with these fields:

| Field | Type | Description |
|---|---|---|
| `text` | `str` | The generated response text |
| `prompt_tokens` | `int` | Number of tokens in the prompt |
| `completion_tokens` | `int` | Number of tokens generated |
| `tokens_per_second` | `float` | Generation speed |
| `elapsed_seconds` | `float` | Total generation time |

```python
response = mg.ask("What is aspirin used for?")
print(response.text)
print(f"{response.completion_tokens} tokens in {response.elapsed_seconds:.1f}s")
print(f"Speed: {response.tokens_per_second:.1f} tok/s")
```

### Custom model path and parameters

```python
mg = MedGemma(
    model_path="/path/to/local/mlx-model",
    max_tokens=1024,
    temperature=0.3,
)
```

### Release model from memory

```python
mg.unload()
```

## JSON Output

Use `--json` to get structured output with generation stats:

```bash
medgemma ask "What is hypertension?" --json
```

```json
{
  "text": "Hypertension, also known as high blood pressure, is a chronic medical condition...",
  "completion_tokens": 248,
  "tokens_per_second": 32.5,
  "elapsed_seconds": 7.6
}
```
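
To consume this output from another program, you can parse it into a small typed container. The sketch below assumes the keys shown in the sample above; `AskResult` and `parse_ask_json` are hypothetical names, and `prompt_tokens` is treated as optional because it does not appear in the sample.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class AskResult:
    """Hypothetical container mirroring the keys of the JSON sample."""
    text: str
    completion_tokens: int
    tokens_per_second: float
    elapsed_seconds: float
    prompt_tokens: Optional[int] = None  # assumption: may be absent from the output


def parse_ask_json(raw: str) -> AskResult:
    """Parse the text printed by `medgemma ask ... --json`."""
    data = json.loads(raw)
    return AskResult(
        text=data["text"],
        completion_tokens=int(data["completion_tokens"]),
        tokens_per_second=float(data["tokens_per_second"]),
        elapsed_seconds=float(data["elapsed_seconds"]),
        prompt_tokens=data.get("prompt_tokens"),
    )


# Shortened copy of the sample output, used here in place of a real CLI call.
sample = (
    '{"text": "Hypertension is ...", "completion_tokens": 248, '
    '"tokens_per_second": 32.5, "elapsed_seconds": 7.6}'
)
result = parse_ask_json(sample)
print(f"{result.completion_tokens} tokens at {result.tokens_per_second:.1f} tok/s")
```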

## How It Works

1. **Model download:** `medgemma setup` downloads Google's [MedGemma 1.5 4B-IT](https://huggingface.co/google/medgemma-4b-it) from Hugging Face.
2. **Quantization:** The model is converted to 4-bit quantized MLX format, which reduces its size from ~8 GB to ~4 GB with the goal of keeping output quality close to the original.
3. **Local inference:** All inference runs on your Apple Silicon GPU via the [MLX](https://ml-explore.github.io/mlx/) framework. No data is sent to any server.
4. **Lazy loading:** The model loads into memory only on the first `ask()` or `stream()` call, and stays loaded for subsequent requests.
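
The lazy-loading step can be pictured with a small stand-in class. This is an illustration of the general pattern only, not the package's actual implementation; `LazyModel` and `fake_loader` are made up for the example.

```python
class LazyModel:
    """Illustrative stand-in showing load-on-first-use."""

    def __init__(self, loader):
        self._loader = loader  # callable that performs the expensive load
        self._model = None     # nothing is loaded at construction time

    def _ensure_loaded(self):
        # Load once, then reuse the cached object on later calls.
        if self._model is None:
            self._model = self._loader()
        return self._model

    def ask(self, prompt):
        return self._ensure_loaded()(prompt)

    def unload(self):
        # Drop the reference so the next ask() triggers a fresh load.
        self._model = None


calls = []


def fake_loader():
    """Pretend to load a model; records each load so we can count them."""
    calls.append("load")
    return lambda prompt: f"echo: {prompt}"


lazy = LazyModel(fake_loader)
loads_before = len(calls)
first = lazy.ask("first")
second = lazy.ask("second")
loads_after = len(calls)
print(loads_before, loads_after)  # prints "0 1"
```

Constructing the object costs nothing; the single load happens on the first `ask()` and is shared by the second.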

## Troubleshooting

### "Not running on Apple Silicon"

MedGemma requires an Apple Silicon Mac (M1/M2/M3/M4). It cannot run on Intel Macs or other platforms, because it relies on the MLX framework running on Apple's ARM-based chips.
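
You can check your own environment with the standard library. The function below is a sketch of one plausible check; it is not necessarily how the package detects the platform, and `is_apple_silicon` is a hypothetical name.

```python
import platform


def is_apple_silicon(system=None, machine=None):
    """Return True when running as a native arm64 process on macOS.

    Arguments default to the current interpreter's values; they are
    parameters only so the logic can be exercised on any machine.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    # Note: an x86_64 Python running under Rosetta reports "x86_64" here,
    # so it would fail this check even on Apple Silicon hardware.
    return system == "Darwin" and machine == "arm64"


print(is_apple_silicon())
```

If this prints `False` on an M-series Mac, one possible cause is an Intel build of Python running under Rosetta.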

### Model download fails

- Make sure you've accepted the license at [google/medgemma-4b-it](https://huggingface.co/google/medgemma-4b-it) and logged in with `huggingface-cli login`
- Check your internet connection
- Ensure you have ~4 GB of free disk space
- Try again with `medgemma setup --force`
- If you're behind a firewall, download the model manually and use `medgemma setup --local-path /path/to/model`
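
A quick way to rule out the disk-space cause is to query free space on the volume that holds the cache. This sketch uses the ~4 GB figure above as a rough floor; the download and conversion step may temporarily need more, and `has_free_space` is a hypothetical helper.

```python
import shutil
from pathlib import Path

REQUIRED_BYTES = 4 * 1024**3  # the ~4 GB figure from this README, as a rough floor


def has_free_space(path, required=REQUIRED_BYTES):
    """Return True if the volume containing `path` has at least `required` bytes free."""
    p = Path(path).expanduser()
    # The cache directory may not exist yet, so walk up to the nearest existing parent.
    while not p.exists():
        p = p.parent
    return shutil.disk_usage(p).free >= required


print(has_free_space("~/.medgemma/model"))
```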

### Out of memory

The 4-bit quantized model needs approximately 4 GB of unified memory. If you're running low:

- Close other memory-intensive applications
- Use `--max-tokens` with a lower value to limit output length
- Call `mg.unload()` in Python when you're done to free memory
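
To make sure `unload()` runs even when an error interrupts your code, you can wrap the model in a context manager. The wrapper below is a sketch, not something the package provides; `unloading` is a hypothetical helper and `FakeModel` stands in for `MedGemma` so the example runs without the model.

```python
from contextlib import contextmanager


@contextmanager
def unloading(model):
    """Yield `model`, then call its unload() on exit, even after an exception."""
    try:
        yield model
    finally:
        model.unload()


class FakeModel:
    """Stand-in with the same ask()/unload() shape as the documented API."""

    def __init__(self):
        self.loaded = True

    def ask(self, prompt):
        return f"echo: {prompt}"

    def unload(self):
        self.loaded = False


fake = FakeModel()
with unloading(fake) as mg:
    answer = mg.ask("What is aspirin used for?")
print(fake.loaded)  # prints "False"
```

With the real library you would pass a `MedGemma()` instance instead of `FakeModel()`.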

### Model loads slowly on first run

The first `ask()` or `stream()` call loads the model into unified memory, which can take several seconds. Subsequent calls reuse the loaded model and are much faster.

---

> [!WARNING]
> **Medical Disclaimer** — This tool is for **informational and educational purposes only**. It does not provide medical advice, diagnosis, or treatment. The outputs are generated by an AI model and may be inaccurate or incomplete. Always seek the advice of a qualified healthcare provider with any questions regarding a medical condition.
