Metadata-Version: 2.4
Name: aimathkit
Version: 0.1.0
Summary: Practical AI math equations as usable Python functions.
Home-page: https://github.com/TangibleResearch/AIMathKit
Author: Tangible Research
Author-email: reyaanshsinha4@gmail.com
License: MIT
Project-URL: Source, https://github.com/TangibleResearch/AIMathKit
Project-URL: Issues, https://github.com/TangibleResearch/AIMathKit/issues
Keywords: ai,math,numpy,transformers,machine-learning,quantization,hardware,education
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Education
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: project-url
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# AI Math Kit

AI Math Kit is a lightweight Python package that implements common equations from AI training, inference, optimization, architecture design, quantization, and hardware estimation as small NumPy functions. It is intended for learning, prototyping, and understanding the math behind neural networks and Transformers.

## Installation

Install from the root of a local copy of the source tree:

```bash
pip install .
```

## Example Usage

```python
from aimathkit import relu, softmax, kv_cache_size, bytes_to_mb

print(relu([-2, -1, 0, 1, 2]))
print(softmax([2.0, 1.0, 0.1]))

cache = kv_cache_size(
    batch=1,
    seq_len=2048,
    layers=24,
    heads=16,
    head_dim=64,
    bytes_per_value=2,
)

print(bytes_to_mb(cache))
```
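
As a rough cross-check of the example above, the sketch below recomputes a KV cache size by hand. It assumes the conventional estimate of one key vector and one value vector per token, per head, per layer; the helper name `kv_cache_bytes` is hypothetical, and the package's own `kv_cache_size` and `bytes_to_mb` may use different conventions (for example 10^6 rather than 1024^2 bytes per MB).

```python
# Hypothetical re-derivation of the KV cache arithmetic; not AI Math Kit's source code.
def kv_cache_bytes(batch, seq_len, layers, heads, head_dim, bytes_per_value):
    # Assumption: each token stores one key and one value vector per head, per layer.
    values_per_token = 2 * layers * heads * head_dim
    return batch * seq_len * values_per_token * bytes_per_value


cache_bytes = kv_cache_bytes(
    batch=1,
    seq_len=2048,
    layers=24,
    heads=16,
    head_dim=64,
    bytes_per_value=2,
)

print(cache_bytes)            # 201326592
print(cache_bytes / 1024**2)  # 192.0 (binary megabytes)
```

Under these assumptions, a 24-layer model with 16 heads of dimension 64 needs about 192 MiB of cache for a single 2048-token sequence in 16-bit precision.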

## Module Overview

- `activations.py`: ReLU, sigmoid, tanh, GELU, softmax, and activation application.
- `layers.py`: Linear layers, weighted sums, residuals, and vocabulary logits.
- `losses.py`: Cross entropy, regression losses, batch loss, and language model loss.
- `optimizers.py`: Gradient descent, Adam, regularization, dropout, clipping, schedules, and averaging.
- `normalization.py`: Layer normalization, batch normalization, feature mean, and feature variance.
- `attention.py`: Query/key/value projections, attention scores, masks, scaled dot-product attention, multi-head helpers, and Transformer block helpers.
- `embeddings.py`: Embedding lookup, positional embeddings, vector similarity, distances, and retrieval.
- `metrics.py`: Accuracy, error rate, precision, recall, F1, perplexity, log probabilities, and token/batch counts.
- `decoding.py`: Temperature scaling, greedy decode, top-k/top-p filtering, renormalization, and repetition penalty.
- `vision.py`: Valid convolution, pooling, patch embeddings, and pixel normalization.
- `reinforcement.py`: Reward sums, discounted returns, value/Q updates, advantages, and policy loss.
- `quantization.py`: Affine and symmetric quantization, integer dot product, LoRA, sparsity, pruning, and compression.
- `hardware.py`: KV cache, activation memory, FLOPs, training compute, parameter memory, throughput, latency, bandwidth, utilization, and cost.
- `design.py`: Parameter counts for linear layers, attention, MLPs, Transformer blocks, and full Transformers.
- `utils.py`: Byte conversions, safe division, and NumPy conversion.
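
To give a sense of the kind of math these modules cover, here is a minimal plain-NumPy sketch of scaled dot-product attention with a causal mask. It does not call AI Math Kit and is not taken from its source; the function name `causal_attention` and its single-head, unbatched input shapes are assumptions made for illustration.

```python
import numpy as np


def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask (illustrative only)."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)  # shape: (seq_len, seq_len)

    # Mask out future positions: token i may only attend to tokens 0..i.
    seq_len = scores.shape[0]
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    return weights @ v, weights


rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(4, 8))
v = rng.normal(size=(4, 8))

output, weights = causal_attention(q, k, v)
print(output.shape)                            # (4, 8)
print(np.allclose(weights.sum(axis=-1), 1.0))  # True
```

Because the first token can only attend to itself, the first output row equals the first row of `v`, which is a quick way to confirm the mask is applied in the right direction.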

## Function List

### Activations

`apply_activation`, `relu`, `sigmoid`, `tanh_activation`, `gelu`, `softmax`

### Layers

`linear_layer`, `matrix_multiply_layer`, `weighted_sum`, `residual_add`, `logits_from_hidden`

### Losses

`cross_entropy`, `mean_squared_error`, `mean_absolute_error`, `mini_batch_loss`, `language_model_loss`, `scale_loss`

### Optimizers

`gradient_descent_update`, `weight_update`, `bias_update`, `average_gradient`, `momentum_update`, `adam_first_moment`, `adam_second_moment`, `adam_update`, `weight_decay`, `l2_regularization_loss`, `l1_regularization_loss`, `dropout_mask`, `apply_dropout`, `scaled_dropout`, `clip_gradient`, `learning_rate_warmup`, `linear_lr_decay`, `cosine_lr_decay`, `exponential_moving_average`, `running_average_loss`, `unscale_gradient`, `cast_precision`, `parameter_average`, `accumulate_gradients`, `average_accumulated_gradient`

### Normalization

`layer_norm`, `feature_mean`, `feature_variance`, `batch_norm`

### Attention

`query_projection`, `key_projection`, `value_projection`, `attention_scores`, `scaled_attention_scores`, `attention_weights`, `attention_output`, `scaled_dot_product_attention`, `causal_mask`, `apply_attention_mask`, `multi_head_attention_split`, `multi_head_concat`, `feed_forward_network`, `pre_norm_attention_block`, `next_token_probability`

### Embeddings

`embedding_lookup`, `add_positional_embeddings`, `dot_similarity`, `cosine_similarity`, `vector_norm`, `normalize_vector`, `euclidean_distance`, `squared_distance`, `retrieval_score`, `top_k_retrieval`

### Metrics

`accuracy`, `error_rate`, `precision`, `recall`, `f1_score`, `perplexity`, `sequence_log_probability`, `average_log_probability`, `tokens_per_batch`, `steps_per_epoch`, `total_tokens`, `effective_batch_size`

### Decoding

`temperature_scale_logits`, `greedy_decode`, `top_k_filter`, `top_p_filter`, `renormalize_probs`, `repetition_penalty`

### Vision

`conv2d_single_channel`, `conv2d_multi_channel`, `max_pool2d`, `avg_pool2d`, `image_patch_embedding`, `pixel_normalize`

### Reinforcement

`reward_sum`, `discounted_return`, `value_update`, `q_update`, `advantage`, `policy_probability`, `policy_loss`

### Design

`linear_layer_params`, `attention_params`, `mlp_params`, `transformer_block_params`, `transformer_params`

### Hardware

`kv_cache_size`, `activation_memory`, `matmul_flops`, `training_compute_estimate`, `parameter_memory`, `tokens_per_second`, `latency_per_token`, `utilization`, `memory_bandwidth_time`, `arithmetic_intensity`, `cache_hit_rate`, `cache_miss_rate`, `tokens_per_dollar`

### Quantization

`quantize_affine`, `dequantize_affine`, `quantize_symmetric`, `quantization_scale`, `int8_dot_product`, `low_rank_factorization_reconstruct`, `lora_update`, `apply_sparsity_mask`, `prune_small_weights`, `compression_ratio`
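
For readers new to the topic, the following is a minimal sketch of 8-bit affine quantization and dequantization in plain NumPy. It is an illustration of the general technique only: the names `affine_quantize` and `affine_dequantize` are hypothetical, the unsigned 8-bit range is an assumption, and the package's `quantize_affine` and `dequantize_affine` may take different arguments or use a different range.

```python
import numpy as np

QMIN, QMAX = 0, 255  # assumed unsigned 8-bit integer range


def affine_quantize(x):
    """Map floats to uint8 with a scale and zero point (illustrative only)."""
    x_min, x_max = float(x.min()), float(x.max())
    # Assumes x is not constant, so the scale is non-zero.
    scale = (x_max - x_min) / (QMAX - QMIN)
    zero_point = int(round(QMIN - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, QMIN, QMAX).astype(np.uint8)
    return q, scale, zero_point


def affine_dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return scale * (q.astype(np.float32) - zero_point)


x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, scale, zero_point = affine_quantize(x)
x_hat = affine_dequantize(q, scale, zero_point)

# The round-trip error stays within one quantization step.
print(np.max(np.abs(x - x_hat)) <= scale)  # True
```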

### Utils

`bytes_to_kb`, `bytes_to_mb`, `bytes_to_gb`, `safe_divide`, `ensure_numpy`

## Educational Note

AI Math Kit is intended for education and small prototypes. It is not meant to replace NumPy, PyTorch, TensorFlow, JAX, or other production machine-learning frameworks.

## License

MIT License. See `LICENSE` for details.
