🌟 Platform Overview
The AI Image Platform is a Python library and RESTful API for AI-powered image generation, analysis, editing, and multimodal chat. It is built on Flask for serverless deployment and integrates two AI providers, Google Gemini and Pollinations.ai.
🤖 Multimodal Chat
Chat with six Gemini models that accept both text and image inputs and keep conversation history.
🎨 Image Generation
Text-to-image generation using Gemini AI and Pollinations.ai, with multiple styles and aspect ratios.
🔍 Image Analysis
Image analysis, description, and understanding using AI vision models.
✨ Image Editing
AI-powered image editing, transformation, and multi-image composition driven by natural language prompts.
☁️ Serverless Ready
Built for serverless deployment with Flask, supporting autoscale, VM, and scheduled deployment modes.
📚 Python Library
Documented Python library with client classes and utility functions.
🚀 Quick Start
Installation
# Clone the repository
git clone https://github.com/your-repo/ai-image-platform.git
cd ai-image-platform
# Install dependencies
pip install -r requirements.txt
# Set environment variables
export GEMINI_API_KEY="your_gemini_api_key_here"
Running the Server
# Development server
python main.py
# Production server
gunicorn --bind=0.0.0.0:5000 --reuse-port main:app
🔑 API Key Required: You need a Google Gemini API key to use most features. Get one from Google AI Studio.
🌐 API Endpoints
GET /api/health
Health Check
Monitor service status and API availability.
Response:
{
  "status": "healthy",
  "message": "AI Image Platform API is running",
  "services": {
    "gemini": {
      "status": "healthy",
      "gemini_api": "available"
    },
    "pollinations": {
      "status": "healthy",
      "pollinations_api": "available"
    }
  }
}
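The response shape above lends itself to a simple readiness check. The following is a minimal sketch, assuming the payload always follows the example; the helper name `all_services_healthy` is hypothetical and not part of the library.

```python
import json


def all_services_healthy(payload: dict) -> bool:
    """Return True when the top-level status and every listed service are "healthy"."""
    if payload.get("status") != "healthy":
        return False
    services = payload.get("services", {})
    return all(svc.get("status") == "healthy" for svc in services.values())


# Sample payload copied from the documented /api/health response.
sample = json.loads("""
{
  "status": "healthy",
  "message": "AI Image Platform API is running",
  "services": {
    "gemini": {"status": "healthy", "gemini_api": "available"},
    "pollinations": {"status": "healthy", "pollinations_api": "available"}
  }
}
""")

print(all_services_healthy(sample))
```

A monitoring job could call this on the parsed response body and alert when it returns False.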
POST /api/chat
Multimodal Chat
Chat with AI models supporting text and image inputs.
Request Body:
| Parameter | Type | Required | Description |
| question | string | Required | Your question or message |
| model | string | Optional | Gemini model (default: gemini-2.5-flash-lite) |
| image_data | string | Optional | Base64-encoded image data |
Available Models:
- gemini-2.5-flash-lite - Fastest (default)
- gemini-1.5-flash - High-speed lightweight
- gemini-2.5-flash - Advanced with excellent speed
- gemini-2.0-flash - Next-gen performance
- gemini-2.5-pro - Performance and quality
- gemini-1.5-flash-8b - 8B parameter lightweight
Example Request:
{
  "question": "What's in this image?",
  "model": "gemini-2.5-flash-lite",
  "image_data": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQ..."
}
Example Response:
{
  "status": "success",
  "question": "What's in this image?",
  "answer": "This image shows a beautiful sunset over mountains...",
  "model_used": "gemini-2.5-flash-lite"
}
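The example request passes the image as a data URI. Below is a minimal sketch of building such a request body from raw image bytes. The helper `build_chat_payload` and its `mime_type` parameter are hypothetical; only the `question`, `model`, and `image_data` fields come from the table above.

```python
import base64
from typing import Optional

DEFAULT_MODEL = "gemini-2.5-flash-lite"  # documented default model


def build_chat_payload(question: str,
                       image_bytes: Optional[bytes] = None,
                       model: str = DEFAULT_MODEL,
                       mime_type: str = "image/jpeg") -> dict:
    """Build the JSON body for POST /api/chat (hypothetical helper)."""
    if not question:
        raise ValueError("question is required")
    payload = {"question": question, "model": model}
    if image_bytes is not None:
        # Encode the image as a data URI, matching the example request.
        encoded = base64.b64encode(image_bytes).decode("ascii")
        payload["image_data"] = f"data:{mime_type};base64,{encoded}"
    return payload


text_only = build_chat_payload("What is the capital of France?")
with_image = build_chat_payload("What's in this image?", image_bytes=b"\xff\xd8\xff")
```

The resulting dictionary can be serialized with `json.dumps` and sent as the request body.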
POST /api/analyze-image
Image Analysis
Analyze images for detailed descriptions and understanding.
Request Body:
| Parameter | Type | Required | Description |
| image_data | string | Optional* | Base64-encoded image data |
| image_url | string | Optional* | URL of image to analyze |
| extract_masks | boolean | Optional | Extract segmentation masks (default: false) |
*Either image_data or image_url is required
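The either-or rule above can be checked on the client before a request is sent. This is a sketch under stated assumptions: the function name `validate_analyze_request` and the error messages are hypothetical, and the server's own validation may differ.

```python
def validate_analyze_request(body: dict) -> list:
    """Return a list of problems with an /api/analyze-image body (empty if none)."""
    errors = []
    # At least one image source must be present.
    if not body.get("image_data") and not body.get("image_url"):
        errors.append("either image_data or image_url is required")
    # extract_masks is documented as a boolean that defaults to false.
    if not isinstance(body.get("extract_masks", False), bool):
        errors.append("extract_masks must be a boolean")
    return errors


print(validate_analyze_request({"image_url": "https://example.com/image.jpg"}))
print(validate_analyze_request({"extract_masks": True}))
```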
Example Response:
{
  "status": "success",
  "analysis": "This image contains a modern city skyline...",
  "image_format": "JPEG",
  "image_size": [1920, 1080],
  "segmentation_masks": []
}
POST /api/generate-image
Gemini Image Generation
Generate high-quality images from text using Gemini AI.
Request Body:
| Parameter | Type | Required | Description |
| prompt | string | Required | Text description of desired image |
| style | string | Optional | Artistic style (default: photorealistic) |
| aspect_ratio | string | Optional | Image aspect ratio (default: 1:1) |
Available Styles:
- photorealistic - Photorealistic and detailed (default)
- cartoon - Cartoon style and animated
- abstract - Abstract art style
- impressionistic - Impressionist painting style
- cyberpunk - Cyberpunk and futuristic art
- anime - Anime and manga style
- oil_painting - Oil painting technique
- watercolor - Watercolor painting style
- sketch - Pencil sketch style
- digital_art - Digital art style
Available Aspect Ratios:
- 1:1 - Square format (default)
- 16:9 - Landscape widescreen
- 9:16 - Portrait vertical
- 4:3 - Standard landscape
- 3:4 - Standard portrait
Example Response:
{
  "status": "success",
  "image_base64": "iVBORw0KGgoAAAANSUhEUgAA...",
  "response_text": "Generated a beautiful mountain landscape...",
  "filename": "generated_image_20240905_143000_abc123.png",
  "prompt": "A serene mountain landscape at sunset",
  "style": "photorealistic",
  "aspect_ratio": "16:9"
}
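Because the image comes back as base64 text, the client has to decode it before use. The sketch below assumes the response follows the example above. The helper names are hypothetical, and the PNG check is an assumption based on the `.png` filename in the example, not a documented guarantee.

```python
import base64
import binascii

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_generated_image(result: dict) -> bytes:
    """Decode image_base64 from a /api/generate-image response (hypothetical helper)."""
    if result.get("status") != "success":
        raise ValueError(f"generation failed: {result!r}")
    try:
        return base64.b64decode(result["image_base64"], validate=True)
    except (KeyError, binascii.Error) as exc:
        raise ValueError("response has no valid image_base64 field") from exc


def looks_like_png(data: bytes) -> bool:
    """Cheap sanity check on the decoded bytes."""
    return data.startswith(PNG_SIGNATURE)


# Stand-in response; a real one would carry a full image.
sample_result = {
    "status": "success",
    "image_base64": base64.b64encode(PNG_SIGNATURE + b"demo").decode("ascii"),
}
image_bytes = decode_generated_image(sample_result)
print(looks_like_png(image_bytes))
```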
POST /api/pollinations/generate-image
Pollinations.ai Image Generation
Generate high-quality images using Pollinations.ai Flux models (no API key required).
Request Body:
| Parameter | Type | Required | Description |
| prompt | string | Required | Text description of desired image |
| width | integer | Optional | Image width (default: 1024) |
| height | integer | Optional | Image height (default: 1024) |
| model | string | Optional | Model to use (default: flux) |
| seed | integer | Optional | Random seed for reproducible results |
| enhance | boolean | Optional | Auto-enhance prompt (default: true) |
Available Models:
- flux - Standard Flux model (default)
- flux-realism - Photorealistic results
- flux-anime - Anime/manga style
- flux-3d - 3D rendered style
- any-dark - Dark artistic style
- flux-pro - Premium quality
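Unlike the Gemini endpoint, this endpoint takes an explicit width and height rather than an aspect ratio. As a rough sketch, a client could derive the height from a ratio string. The helpers `dimensions_for_ratio` and `build_pollinations_payload` are hypothetical; only the payload field names and defaults come from the table above.

```python
from typing import Optional


def dimensions_for_ratio(aspect_ratio: str, width: int = 1024) -> tuple:
    """Convert a ratio such as "16:9" into (width, height) for a given width."""
    w_part, h_part = (int(part) for part in aspect_ratio.split(":"))
    if w_part <= 0 or h_part <= 0 or width <= 0:
        raise ValueError("ratio parts and width must be positive")
    return width, round(width * h_part / w_part)


def build_pollinations_payload(prompt: str,
                               aspect_ratio: str = "1:1",
                               width: int = 1024,
                               model: str = "flux",
                               seed: Optional[int] = None,
                               enhance: bool = True) -> dict:
    """Build the JSON body for POST /api/pollinations/generate-image."""
    width, height = dimensions_for_ratio(aspect_ratio, width)
    payload = {"prompt": prompt, "width": width, "height": height,
               "model": model, "enhance": enhance}
    if seed is not None:
        payload["seed"] = seed  # a fixed seed is documented to give reproducible results
    return payload


payload = build_pollinations_payload("A futuristic cityscape at night",
                                     aspect_ratio="16:9", model="flux-realism", seed=12345)
print(payload["width"], payload["height"])
```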
POST /api/edit-image
Image Editing
Edit and transform images using AI with natural language prompts.
Request Body:
| Parameter | Type | Required | Description |
| image_data | string | Required | Base64-encoded image data |
| edit_prompt | string | Required | Description of desired edits |
| style | string | Optional | Target artistic style |
| aspect_ratio | string | Optional | Target aspect ratio |
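An edit request can be assembled with the Python standard library alone. The sketch below only builds the request object. The helper `build_edit_request` is hypothetical, and the base URL assumes the local development server from the Quick Start.

```python
import json
import urllib.request
from typing import Optional


def build_edit_request(base_url: str,
                       image_b64: str,
                       edit_prompt: str,
                       style: Optional[str] = None,
                       aspect_ratio: Optional[str] = None) -> urllib.request.Request:
    """Build a POST request for /api/edit-image without sending it."""
    if not image_b64 or not edit_prompt:
        raise ValueError("image_data and edit_prompt are required")
    body = {"image_data": image_b64, "edit_prompt": edit_prompt}
    # Optional fields are sent only when the caller sets them.
    if style is not None:
        body["style"] = style
    if aspect_ratio is not None:
        body["aspect_ratio"] = aspect_ratio
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/edit-image",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_edit_request("http://localhost:5000", "aGVsbG8=",
                         "Make the sky purple", style="watercolor")

# Sending the request needs a running server:
# with urllib.request.urlopen(req, timeout=60) as resp:
#     result = json.load(resp)
```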
🐍 Python Library Usage
Importing the Library
# Import the library
from ai_image_platform import (
    GeminiChatClient,
    ImageAnalyzer,
    ImageGenerator,
    ImageEditor,
    PollinationsClient,
)
# Or import specific functions
from ai_image_platform.core.image_generation import generate_image_from_text
from ai_image_platform.core.image_analysis import analyze_base64_image
Chat Client Example
import base64

# Initialize chat client
chat_client = GeminiChatClient()
# Text-only chat
response = chat_client.ask_question(
    "Explain quantum computing in simple terms",
    selected_model="gemini-2.5-flash-lite"
)
print(response['answer'])
# Multimodal chat with image
with open('image.jpg', 'rb') as f:
    image_data = base64.b64encode(f.read()).decode()
response = chat_client.ask_question(
    "What's in this image?",
    selected_model="gemini-2.5-flash",
    image_data=image_data
)
print(response['answer'])
Image Generation Example
import base64

# Using the class
generator = ImageGenerator()
result = generator.generate_image(
    prompt_text="A majestic dragon in a mystical forest",
    style="digital_art",
    aspect_ratio="16:9"
)
if result['status'] == 'success':
    # Save the generated image
    image_data = base64.b64decode(result['image_base64'])
    with open(result['filename'], 'wb') as f:
        f.write(image_data)
# Using convenience function
result = generate_image_from_text(
    "A peaceful zen garden with cherry blossoms",
    style="watercolor",
    aspect_ratio="4:3"
)
Pollinations Client Example
# Initialize Pollinations client (no API key required)
pollinations = PollinationsClient()
# Generate image
result = pollinations.generate_image(
    prompt="A futuristic cityscape at night",
    width=1024,
    height=576,
    model="flux-realism",
    seed=12345
)
if result['status'] == 'success':
    print(f"Generated: {result['filename']}")
    # The image data is in result['image_base64']
Image Analysis Example
import base64

# Using the class
analyzer = ImageAnalyzer()
# Analyze from file
with open('photo.jpg', 'rb') as f:
    image_bytes = f.read()
analysis = analyzer.analyze_image_bytes(
    image_bytes,
    prompt="Describe this image in detail"
)
print(analysis)
# Using convenience function for a base64 image
base64_image_data = base64.b64encode(image_bytes).decode()
result = analyze_base64_image(
    base64_image_data,
    extract_masks=True
)
print(result['analysis'])
Flask App Integration
from ai_image_platform.api import create_app
# Create Flask app with default settings
app = create_app()
# Or create it with custom configuration
config = {
    'SECRET_KEY': 'your-secret-key',
    'MAX_CONTENT_LENGTH': 32 * 1024 * 1024  # 32MB
}
app = create_app(config)
# Run the app (debug=True is for local development only)
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=True)
💡 Usage Examples
JavaScript/Frontend Integration
// Chat with image upload
async function chatWithImage(question, imageFile, model = 'gemini-2.5-flash-lite') {
    // Convert image to a base64 data URL
    const reader = new FileReader();
    const imageData = await new Promise((resolve, reject) => {
        reader.onload = e => resolve(e.target.result);
        reader.onerror = () => reject(reader.error);
        reader.readAsDataURL(imageFile);
    });
    const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            question: question,
            model: model,
            image_data: imageData
        })
    });
    return await response.json();
}
// Generate image
async function generateImage(prompt, style = 'photorealistic') {
    const response = await fetch('/api/generate-image', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            prompt: prompt,
            style: style,
            aspect_ratio: '16:9'
        })
    });
    const result = await response.json();
    if (result.status === 'success') {
        // Display image
        const img = document.createElement('img');
        img.src = 'data:image/png;base64,' + result.image_base64;
        document.body.appendChild(img);
    }
    return result;
}
Python Script Examples
#!/usr/bin/env python3
"""
Example: Batch Image Analysis
"""
import os
import glob
from ai_image_platform import ImageAnalyzer

def analyze_image_folder(folder_path):
    analyzer = ImageAnalyzer()
    # Supported formats
    image_extensions = ['*.jpg', '*.jpeg', '*.png', '*.bmp', '*.gif']
    for ext in image_extensions:
        for image_path in glob.glob(os.path.join(folder_path, ext)):
            print(f"Analyzing: {image_path}")
            with open(image_path, 'rb') as f:
                image_bytes = f.read()
            try:
                analysis = analyzer.analyze_image_bytes(image_bytes)
                print(f"Analysis: {analysis[:200]}...\n")
            except Exception as e:
                print(f"Error: {e}\n")

if __name__ == '__main__':
    analyze_image_folder('./images')
cURL Examples
# Health check
curl -X GET http://localhost:5000/api/health
# Chat request
curl -X POST http://localhost:5000/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is the capital of France?",
    "model": "gemini-2.5-flash-lite"
  }'
# Generate image with Pollinations.ai
curl -X POST http://localhost:5000/api/pollinations/generate-image \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A beautiful sunset over the ocean",
    "width": 1024,
    "height": 576,
    "model": "flux-realism"
  }'
# Analyze image from URL
curl -X POST http://localhost:5000/api/analyze-image \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://example.com/image.jpg"
  }'
🚀 Serverless Deployment
Replit Deployment
Autoscale Mode: Perfect for stateless image processing APIs that scale automatically based on demand.
{
  "deployment_target": "autoscale",
  "run": ["gunicorn", "--bind=0.0.0.0:5000", "--reuse-port", "main:app"],
  "build": null
}
Environment Variables
| Variable | Required | Description |
| GEMINI_API_KEY | Required | Google Gemini API key for AI features |
| SECRET_KEY | Optional | Flask secret key (auto-generated if not set) |
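A deployment can fail fast when the required variable is missing instead of surfacing the error on the first API call. The sketch below is one way to do that. The function `load_settings` is hypothetical, and the random `SECRET_KEY` fallback only mirrors the "auto-generated if not set" note in the table; the platform's own behavior may differ.

```python
import os
import secrets
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the documented environment variables and validate them."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is not set")
    # Random fallback for the optional key; it changes on every process start.
    secret_key = env.get("SECRET_KEY") or secrets.token_hex(32)
    return {"GEMINI_API_KEY": api_key, "SECRET_KEY": secret_key}


# Typical use at startup: settings = load_settings()
demo = load_settings({"GEMINI_API_KEY": "your_gemini_api_key_here"})
print(sorted(demo))
```

Passing a mapping instead of reading `os.environ` directly keeps the function easy to exercise in tests.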
Production Configuration
# production_config.py
import os
import secrets

class ProductionConfig:
    # Random per-process fallback instead of a hard-coded secret;
    # set SECRET_KEY explicitly when running multiple workers
    SECRET_KEY = os.environ.get('SECRET_KEY') or secrets.token_hex(32)
    GEMINI_API_KEY = os.environ.get('GEMINI_API_KEY')
    MAX_CONTENT_LENGTH = 16 * 1024 * 1024  # 16MB
    # Logging configuration
    LOGGING = {
        'version': 1,
        'handlers': {
            'default': {
                'class': 'logging.StreamHandler',
                'formatter': 'default'
            }
        },
        'formatters': {
            'default': {
                'format': '[%(asctime)s] %(levelname)s in %(module)s: %(message)s',
            }
        },
        'root': {
            'level': 'INFO',
            'handlers': ['default']
        }
    }

# main.py
from ai_image_platform.api import create_app
from production_config import ProductionConfig

# Pass only the uppercase settings, not the class's internal attributes
config = {k: v for k, v in vars(ProductionConfig).items() if k.isupper()}
app = create_app(config)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
⚠️ API Rate Limits: Be mindful of Google Gemini API rate limits and quotas in production. Implement appropriate error handling and retry logic.
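One common form of retry logic is exponential backoff with jitter. The following is a minimal sketch, not the platform's own mechanism. `TransientAPIError` and `call_with_retries` are hypothetical names; in practice the wrapped function would raise `TransientAPIError` when it sees a rate-limit or temporary failure from the upstream API.

```python
import random
import time


class TransientAPIError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. rate limiting)."""


def call_with_retries(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on TransientAPIError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientAPIError:
            if attempt == max_attempts:
                raise  # out of attempts, surface the error to the caller
            # The base delay doubles each attempt: 1s, 2s, 4s, ...
            delay = base_delay * 2 ** (attempt - 1)
            # Jitter spreads out retries from concurrent clients.
            sleep(delay + random.uniform(0, delay / 2))


# Demonstration with a stand-in function that fails twice, then succeeds.
attempts = {"count": 0}
recorded_delays = []


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientAPIError("rate limited")
    return {"status": "success"}


outcome = call_with_retries(flaky_call, sleep=recorded_delays.append)
print(outcome, attempts["count"])
```

The `sleep` parameter is injectable so the backoff schedule can be inspected without actually waiting.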
✅ Serverless Benefits: The platform is designed for serverless deployment with no file system dependencies, using base64 for all image I/O operations.