Metadata-Version: 2.4
Name: mechanex
Version: 1.0.2
Summary: A Python client for the Axionic API.
Home-page: https://axioniclabs.ai
Author: Axionic Labs
Author-email: contact@axioniclabs.ai
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.20.0
Requires-Dist: ipython
Requires-Dist: jupyter
Requires-Dist: matplotlib
Requires-Dist: click
Requires-Dist: fastapi
Requires-Dist: uvicorn
Requires-Dist: torch
Requires-Dist: numpy
Requires-Dist: tqdm
Requires-Dist: mdmm
Requires-Dist: rich
Requires-Dist: sae-lens
Requires-Dist: openai
Requires-Dist: huggingface_hub
Requires-Dist: jsonschema
Requires-Dist: pyyaml
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Mechanex

Mechanex allows you to optimize, align, and correct your generative AI models in production. It acts as a runtime control layer for small models.

The core promise is:
**improve model behavior at inference time through policies, without retraining.**

## Product Direction

Mechanex is built around a policy-first workflow:

1. Choose a model (local, self-hosted, or hosted).
2. Choose a task profile.
3. Choose an objective.
4. Apply runtime controls (sampling, steering, constraints, verifiers, optimization).
5. Compare and evaluate policies.
6. Deploy and iterate.

## Core Concepts

### Policy
A policy is the reusable runtime object. It defines:
- Sampling method and search settings.
- Steering settings.
- Output constraints.
- Verifier stack.
- Optimization and fallback behavior.

### Execution Modes
Mechanex supports flexible execution via `mx.set_execution_mode()`:
- `auto`: Remote when authenticated, local when not authenticated.
- `remote`: Force hosted inference (account + API key required).
- `local`: Force local model execution on your own hardware.
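
The mode-resolution rule can be pictured as a small pure function. This is an illustrative sketch, not Mechanex internals; `resolve_execution_mode` is a hypothetical helper:

```python
# Illustrative only: how an "auto" mode could resolve to a backend
# depending on whether an API key is available.
def resolve_execution_mode(mode, api_key=None):
    """Map a requested mode to the effective backend."""
    if mode == "auto":
        return "remote" if api_key else "local"
    if mode in ("remote", "local"):
        return mode
    raise ValueError("unknown execution mode: %r" % mode)

print(resolve_execution_mode("auto", "sk-123"))  # remote
print(resolve_execution_mode("auto"))            # local
```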

## Installation

```bash
pip install mechanex
```

## Quick Start

### 1. Initialize the Client
Mechanex's hosted mode requires an API key, which you can create through the CLI.

**New Users:**
Run the signup command to create an account, automatically log in, and generate your first API key:
```bash
mechanex signup
```

**Existing Users:**
If you already have an account, log in and generate a key manually:
```bash
mechanex login
mechanex create-api-key
```

**Using the Key in Python:**
The client automatically loads the key generated by the CLI.
```python
import mechanex as mx

# You can manually set your key. If persist=True, it is saved for future sessions.
mx.set_key("your-api-key-here", persist=True)

# Choose your execution mode
mx.set_execution_mode("remote") # or "local", or "auto"
```

### 2. Setting the Model
How you specify the model depends on your execution mode.

#### Hosted Remote Model Catalog
```python
mx.set_model("qwen3-0.6b")
```
Supported hosted models:

| Family | Models |
| :--- | :--- |
| **Gemma 2** | `gemma-2-27b`, `gemma-2-2b`, `gemma-2-9b`, `gemma-2-9b-it`, `gemma-2b`, `gemma-2b-it` |
| **Gemma 3** | `gemma-3-12b-it`, `gemma-3-12b-pt`, `gemma-3-1b-it`, `gemma-3-1b-pt`, `gemma-3-270m`, `gemma-3-270m-it`, `gemma-3-27b-it`, `gemma-3-27b-pt`, `gemma-3-4b-it`, `gemma-3-4b-pt` |
| **Llama** | `llama-3.1-8b`, `llama-3.1-8b-instruct`, `llama-3.3-70b-instruct`, `meta-llama-3-8b-instruct` |
| **Qwen** | `qwen2.5-7b-instruct`, `qwen3-0.6b`, `qwen3-1.7b`, `qwen3-14b`, `qwen3-4b`, `qwen3-8b` |
| **Other** | `deepseek-r1-distill-llama-8b`, `gpt-oss-20b`, `gpt2-small`, `mistral-7b`, `pythia-70m-deduped` |

#### Local Model Management
To load models locally for inspection and low-latency hooks:
```python
mx.load("gpt2-small") # Uses transformer-lens to load the model locally
mx.set_execution_mode("local")
```
To free GPU memory and switch back to remote execution:
```python
mx.unload()
```

## Policies and Runtime Controls

Mechanex allows you to control generation using reusable policies or direct API parameters.
```python
import mechanex as mx

mx.load("gpt2-small")
mx.set_execution_mode("local")

# Define a policy with strict JSON constraints
policy = mx.policy.strict_json_extraction(
    schema={
        "type": "object",
        "required": ["summary"],
        "properties": {"summary": {"type": "string"}},
    },
    name="strict_json_small_v1",
)

res = mx.policy.run(
    prompt="Summarize speculative decoding in one sentence.",
    policy=policy,
    include_trace=True,
)
print(res["output"])
```
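
Independent of the SDK, you can sanity-check constrained output against the same schema with the `jsonschema` package (already a declared dependency). The sample output string below is illustrative:

```python
import json
import jsonschema

schema = {
    "type": "object",
    "required": ["summary"],
    "properties": {"summary": {"type": "string"}},
}

# Example model output (illustrative); a conforming policy should
# always produce JSON that validates against the schema.
raw_output = '{"summary": "Speculative decoding drafts tokens with a small model and verifies them with a larger one."}'

parsed = json.loads(raw_output)
jsonschema.validate(parsed, schema)  # raises jsonschema.ValidationError on mismatch
print(parsed["summary"])
```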

### Sampling Methods
You can control generation with a range of sampling methods that trade off creativity against determinism.
Supported methods:
- `greedy` (fastest, deterministic)
- `top-k` (more creative)
- `top-p` (balanced creativity)
- `min-p`
- `typical`
- `ads` (Adaptive Determinantal Sampling) — **Remote only**
- `constrained-beam-search`
- `speculative-decoding`
- `ssd`
- `guided-generation`
- `ensemble-sampling`
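
For intuition, here is what two of the simpler methods do to a logit vector before a token is drawn. This is reference math in NumPy, not Mechanex's implementation:

```python
import numpy as np

def top_k_filter(logits, k):
    """Keep only the k highest logits; mask the rest to -inf."""
    kth_largest = np.sort(logits)[-k]
    return np.where(logits >= kth_largest, logits, -np.inf)

def top_p_filter(logits, p):
    """Nucleus (top-p): keep the smallest set of tokens whose
    cumulative probability, in descending order, reaches p."""
    order = np.argsort(logits)[::-1]
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1  # first index reaching p
    keep = np.zeros(logits.shape, dtype=bool)
    keep[order[:cutoff]] = True
    return np.where(keep, logits, -np.inf)

logits = np.array([2.0, 1.0, 0.5, -1.0])
print(top_k_filter(logits, 2))    # only the two largest logits survive
print(top_p_filter(logits, 0.8))  # tokens covering 80% of the mass survive
```

After filtering, probabilities are renormalized over the surviving tokens and one is sampled.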

## Steering Vectors

Steering vectors let you control a model's behavior by injecting specific activation patterns at inference time. No post-training or fine-tuning is required.

### Generate and Apply Steering Vectors
```python
import mechanex as mx

# 1. Create a steering vector from contrastive examples
vector_id = mx.steering.generate_vectors(
    name="honesty_vector",       # Name used when saving and reusing the vector
    prompts=["I tell the", "My statement is", "The truth is"],
    positive_answers=[" truth", " factual", " correct"],
    negative_answers=[" lie", " false", " wrong"],
    method="caa"                 # Options: "caa", "few-shot", "steering-perceptrons" (remote-only)
)
print(f"Generated vector ID: {vector_id}")

# 2. Save and load steering vectors for reuse
mx.steering.save_vectors(vector_id, "honesty_vector.json")
vectors = mx.steering.load_vectors("honesty_vector.json")

# 3. Apply the steering vector during generation
steered_output = mx.generation.generate(
    prompt="Do I tell lies? Answer:",
    max_tokens=20,
    steering_vector=vector_id,
    steering_strength=2.0
)
print(steered_output)
```
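
Conceptually, CAA derives the vector as the difference between the mean activations on the positive and negative completions. A toy NumPy sketch of that arithmetic, using random stand-ins for real hidden states (not Mechanex internals):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for one layer's hidden states, shape (n_examples, d_model).
pos_acts = rng.normal(loc=0.5, size=(3, 8))   # activations on " truth", " factual", " correct"
neg_acts = rng.normal(loc=-0.5, size=(3, 8))  # activations on " lie", " false", " wrong"

# Contrastive activation addition: vector = mean(pos) - mean(neg).
steering_vector = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

# At generation time the vector is added into the residual stream,
# scaled by a strength coefficient (cf. steering_strength=2.0 above).
strength = 2.0
hidden = rng.normal(size=8)
steered_hidden = hidden + strength * steering_vector
print(steering_vector.shape)  # (8,)
```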

## SAE (Sparse Autoencoder) Pipeline
The SAE pipeline provides advanced behavioral detection and automatic correction. It uses activation features computed locally (via `fwd_hooks`) or remotely.

### Create and Use Behaviors
```python
# 1. Create a behavior from contrastive examples
behavior = mx.sae.create_behavior(
    behavior_name="concise_mode",
    prompts=["Describe how to optimize model inference."],
    positive_answers=["Use caching, quantization, and batching."],
    negative_answers=["A very long and unstructured answer about how you should take your time explaining things..."],
    description="Encourages concise style"
)

# 2. Generate with SAE behavior Steering
sae_steered = mx.sae.generate(
    prompt="Tell me about that. Answer:",
    behavior_names=["concise_mode"],
    max_new_tokens=30,
)
print(sae_steered)
```

You can also load behavior datasets automatically using `create_behavior_from_jsonl("toxicity", "tests/toxicity_dataset.jsonl")`.
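
The exact JSONL layout expected by `create_behavior_from_jsonl` isn't documented here; the per-line field names below (`prompt`, `positive`, `negative`) are an assumption mirroring the `create_behavior()` arguments above, so check them against your installed version:

```python
import json
import os
import tempfile

# Assumed per-line layout for a contrastive behavior dataset.
rows = [
    {
        "prompt": "Describe how to optimize model inference.",
        "positive": "Use caching, quantization, and batching.",
        "negative": "A very long and unstructured answer...",
    },
]

path = os.path.join(tempfile.mkdtemp(), "toxicity_dataset.jsonl")
with open(path, "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

with open(path) as f:
    loaded = [json.loads(line) for line in f]
print(len(loaded))  # 1
```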

## Comparing and Evaluating Policies

The SDK lets you compare and evaluate multiple iterations of runtime policies side by side.

```python
pid = mx.policy.save(policy)
cmp = mx.policy.compare(
    prompt="Extract order_id and status from text.",
    policies=[mx.policy.fast_tool_router(), policy],
)

ev = mx.policy.evaluate(
    prompts=[
        "Extract {name, role} from: Alice is CTO.",
        "Extract {name, role} from: Bob is PM.",
    ],
    policy_id=pid,
)
```

## Deployment & Serving

Mechanex can host an OpenAI-compatible server backed by your locally loaded model or the remote API. The server automatically applies the specified behaviors and auto-correction.

```python
import mechanex as mx
mx.load("gpt2-small")
mx.set_execution_mode("local")

# Start server natively compatible with OpenAI client format
mx.serve(port=8001, corrected_behaviors=["toxicity"])

# Use vLLM integration for high-performance serving
# mx.serve(port=8001, use_vllm=True)
```

You can interact with the server using standard libraries like the `openai` Python SDK by routing traffic to `http://localhost:8001/v1`. Elements like `policy`, `policy_id`, steering vectors, and SAE behaviors can all be passed via standard `extra_body` configuration.
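
As a sketch of that wiring: the request body is a normal chat-completions payload plus Mechanex-specific fields. The extension field names below follow the docs above, but the id values are placeholders; with the `openai` SDK you would pass the extras via `extra_body` rather than merging them into the top-level payload yourself:

```python
import json

payload = {
    "model": "gpt2-small",
    "messages": [{"role": "user", "content": "Summarize speculative decoding."}],
    # Mechanex extensions (sent via `extra_body` when using the openai SDK);
    # the id values below are placeholders for your saved artifacts.
    "policy_id": "your-saved-policy-id",
    "steering_vector": "your-vector-id",
    "steering_strength": 2.0,
}

# POST this to http://localhost:8001/v1/chat/completions with any HTTP client.
body = json.dumps(payload)
print(json.loads(body)["model"])  # gpt2-small
```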

## Environment Capabilities

- `ADS` (Adaptive Determinantal Sampling) is **remote-only**.
- **Steering perceptrons** (`steering-perceptrons`) are **remote-only**.
- All other runtime policy mechanisms are available for local execution, with capability-aware fallback behavior.

## CLI Commands

Account profile and key lifecycle tools:
- `mechanex signup`
- `mechanex login`
- `mechanex whoami`
- `mechanex create-api-key`
- `mechanex list-api-keys`
- `mechanex balance`
- `mechanex topup`
- `mechanex logout`

## Examples

See [examples/README.md](examples/README.md) for runnable workflows within the repository, including:
- `01_local_first_quickstart.py`
- `02_remote_quickstart.py`
- `03_sampling_strategies.py`
- `04_strict_json_policy.py`
- `05_policy_compare_and_evaluate.py`
- `06_local_steering_vectors.py`
- `07_openai_compatible_server.py`
- `08_hybrid_local_remote_toggle.py`
- `09_sae_behavior_workflow.py`
- `10_remote_policy_smoke.py`

## Engineering Docs

- [Contributing](CONTRIBUTING.md)
- [Engineering Standards](docs/ENGINEERING_STANDARDS.md)
- [Testing Guide](docs/TESTING.md)
- [CI/CD](docs/CI_CD.md)
- [Operations Runbook](docs/OPERATIONS_RUNBOOK.md)
- [Release Process](docs/RELEASE_PROCESS.md)
