Litefs Health Check Documentation
Generated by TRAE SOLO at 2026-03-27
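Overview
Litefs provides health checks for monitoring whether a service is running and whether it is ready to serve traffic. The feature is implemented as middleware, accepts custom check functions, and returns the check results as JSON.
Features
1. Health check endpoint
Endpoint: /health (customizable)
Method: GET
Response format: JSON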
Example response (healthy)
{
  "status": "healthy",
  "timestamp": 1711526400.123,
  "checks": {
    "database": {
      "status": "pass",
      "timestamp": 1711526400.123
    },
    "cache": {
      "status": "pass",
      "timestamp": 1711526400.123
    }
  }
}
Example response (unhealthy)
{
  "status": "unhealthy",
  "timestamp": 1711526400.123,
  "checks": {
    "database": {
      "status": "pass",
      "timestamp": 1711526400.123
    },
    "cache": {
      "status": "fail",
      "timestamp": 1711526400.123
    }
  }
}
Example response (error)
{
  "status": "unhealthy",
  "timestamp": 1711526400.123,
  "checks": {
    "database": {
      "status": "error",
      "error": "Connection timeout",
      "timestamp": 1711526400.123
    }
  }
}
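The three responses above suggest a simple mapping: a check that returns True is reported as pass, one that returns False as fail, and one that raises as error, with the overall status healthy only when every check passes. The following is a minimal sketch of that mapping, not the Litefs implementation; `run_checks` and `failing_check` are hypothetical names introduced for illustration.

```python
import time


def run_checks(checks):
    """Run each named check and build a dict shaped like the responses above.

    Hypothetical helper for illustration; not part of the Litefs API.
    """
    results = {}
    for name, check in checks.items():
        entry = {"timestamp": time.time()}
        try:
            # True maps to "pass", False maps to "fail".
            entry["status"] = "pass" if check() else "fail"
        except Exception as exc:
            # A raised exception maps to "error" plus its message.
            entry["status"] = "error"
            entry["error"] = str(exc)
        results[name] = entry

    healthy = all(r["status"] == "pass" for r in results.values())
    return {
        "status": "healthy" if healthy else "unhealthy",
        "timestamp": time.time(),
        "checks": results,
    }


def failing_check():
    """Stand-in for a check whose dependency timed out."""
    raise TimeoutError("Connection timeout")


report = run_checks({"database": failing_check, "cache": lambda: True})
```

Here `report["status"]` is "unhealthy" because the database check raised, even though the cache check passed.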
2. Readiness check endpoint
Endpoint: /health/ready (customizable)
Method: GET
Response format: JSON
Example response (ready)
{
  "status": "ready",
  "timestamp": 1711526400.123,
  "checks": {
    "migrations": {
      "status": "pass",
      "timestamp": 1711526400.123
    },
    "config": {
      "status": "pass",
      "timestamp": 1711526400.123
    }
  }
}
Example response (not ready)
{
  "status": "not_ready",
  "timestamp": 1711526400.123,
  "checks": {
    "migrations": {
      "status": "fail",
      "timestamp": 1711526400.123
    }
  }
}
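A deployment script or test harness may want to wait for the readiness endpoint before sending traffic. The sketch below polls until the JSON body reports "ready". It uses only the standard library; `fetch_status` and `wait_until_ready` are hypothetical helper names, and the code assumes a not-ready reply may arrive with a non-2xx status that still carries the JSON report.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=2.0):
    """Return the parsed JSON body of a health endpoint, or None if unavailable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        # A non-2xx reply (for example 503) may still carry the JSON report.
        try:
            return json.loads(exc.read().decode("utf-8"))
        except ValueError:
            return None
    except (urllib.error.URLError, OSError, ValueError):
        return None


def wait_until_ready(fetch, attempts=10, delay=0.5):
    """Call `fetch()` until it reports status "ready" or attempts run out."""
    for _ in range(attempts):
        payload = fetch()
        if isinstance(payload, dict) and payload.get("status") == "ready":
            return True
        time.sleep(delay)
    return False
```

A typical call would be `wait_until_ready(lambda: fetch_status("http://localhost:9090/health/ready"))`. Passing the fetch function in as an argument keeps the polling loop easy to test without a running server.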
Usage
Basic usage
from litefs import Litefs
from litefs.middleware import HealthCheck
app = Litefs(webroot='./site')
app.add_middleware(HealthCheck, path='/health', ready_path='/health/ready')
app.run()
Adding health checks
import shutil


# `db` and `cache` stand in for your application's own client objects.
def check_database():
    """Check the database connection."""
    try:
        db.connect()
        return True
    except Exception:
        return False


def check_cache():
    """Check the cache service."""
    return cache.is_connected()


def check_disk_space():
    """Check free disk space."""
    total, used, free = shutil.disk_usage('.')
    return free > 1024 * 1024 * 1024  # require more than 1 GiB free


app.add_health_check('database', check_database)
app.add_health_check('cache', check_cache)
app.add_health_check('disk_space', check_disk_space)
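Adding readiness checks
# `migration_status` and `config` stand in for your application's own objects.
def check_migrations():
    """Check that database migrations have completed."""
    return migration_status.is_complete()


def check_config():
    """Check that the configuration has been loaded."""
    return config.is_loaded()


app.add_ready_check('migrations', check_migrations)
app.add_ready_check('config', check_config)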
Custom endpoint paths
app.add_middleware(
    HealthCheck,
    path='/status',
    ready_path='/status/ready'
)
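Check function specification
Health check functions
def health_check_function() -> bool:
    """
    Health check function.

    Returns:
        bool: True if healthy, False if unhealthy.
    """
    ...
Readiness check functions
def ready_check_function() -> bool:
    """
    Readiness check function.

    Returns:
        bool: True if ready, False if not ready.
    """
    ...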
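Because check functions take no arguments, any parameters (a path, a threshold) have to be bound before the function is registered. One way to do that, sketched here under the assumption that any zero-argument callable returning a bool is accepted, is a small factory that returns a closure. `make_disk_space_check` is a hypothetical name, not a Litefs helper.

```python
import shutil


def make_disk_space_check(path, min_free_bytes):
    """Return a zero-argument check bound to `path` and `min_free_bytes`."""
    def check():
        # disk_usage returns a named tuple with total, used, and free bytes.
        return shutil.disk_usage(path).free >= min_free_bytes
    return check


# Bind the parameters once, then register the resulting callable.
check_root_disk = make_disk_space_check(".", 1024 ** 3)
# app.add_health_check('disk_space_root', check_root_disk)
```

The same factory can produce one check per mount point, each registered under its own name.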
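Exception handling
If a check function raises an exception, that check's status is set to error and the error message is included in the response.
def check_database():
    """Check the database connection."""
    # No try/except here: an exception raised by db.connect() propagates to
    # the middleware, which records the check as "error" with its message.
    db.connect()
    return True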
Common check examples
Database check
def check_database():
    """Check the database connection."""
    try:
        import sqlite3
        conn = sqlite3.connect('database.db')
        conn.execute('SELECT 1')
        conn.close()
        return True
    except Exception:
        return False
Redis check
def check_redis():
    """Check the Redis connection (requires the third-party redis package)."""
    try:
        import redis
        r = redis.Redis(host='localhost', port=6379)
        r.ping()
        return True
    except Exception:
        return False
Disk space check
def check_disk_space():
    """Check free disk space."""
    import shutil
    total, used, free = shutil.disk_usage('.')
    free_gb = free / (1024 ** 3)
    return free_gb > 1.0  # require more than 1 GiB free
Memory check
def check_memory():
    """Check available memory (requires the third-party psutil package)."""
    import psutil
    mem = psutil.virtual_memory()
    return mem.available > 1024 * 1024 * 1024  # require more than 1 GiB available
External API check
def check_external_api():
    """Check an external API (requires the third-party requests package)."""
    try:
        import requests
        response = requests.get('https://api.example.com/health', timeout=5)
        return response.status_code == 200
    except Exception:
        return False
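When a dependency has no client library installed, or a full request is more than the check needs, a plain TCP connect can serve as a coarse reachability test. The sketch below uses only the standard library; `tcp_port_open` is a hypothetical helper, and the Redis host and port are taken from the example above. It confirms only that something accepts connections on the port, not that the service behind it is working.

```python
import functools
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# functools.partial binds the arguments, producing a zero-argument check.
check_redis_port = functools.partial(tcp_port_open, "localhost", 6379)
# app.add_health_check('redis_port', check_redis_port)
```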
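Integration examples
Integrating with Kubernetes
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: litefs
spec:
  replicas: 3
  # apps/v1 requires a selector that matches the pod template labels;
  # the `app: litefs` label is an assumed value.
  selector:
    matchLabels:
      app: litefs
  template:
    metadata:
      labels:
        app: litefs
    spec:
      containers:
        - name: litefs
          image: litefs:latest
          ports:
            - containerPort: 9090
          livenessProbe:
            httpGet:
              path: /health
              port: 9090
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 9090
            initialDelaySeconds: 5
            periodSeconds: 5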
Integrating with Docker Compose
# docker-compose.yml
version: '3.8'
services:
  litefs:
    image: litefs:latest
    ports:
      - "9090:9090"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9090/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
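The curl-based test above runs inside the container, so it only works if the image includes curl. Where it does not, a small Python probe can play the same role. This is a sketch under assumptions: `probe` and `main` are hypothetical names, and the default URL reuses the port and path from the Compose example.

```python
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return 0 if `url` answers with a 2xx status, otherwise 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 0 if 200 <= resp.status < 300 else 1
    except (urllib.error.URLError, OSError, ValueError):
        # HTTPError (4xx/5xx) is a URLError subclass, so it lands here too.
        return 1


def main(argv):
    """Map command-line arguments to a process exit code."""
    target = argv[1] if len(argv) > 1 else "http://localhost:9090/health"
    return probe(target)


# To use this as a script, end the file with:
#     import sys; sys.exit(main(sys.argv))
```

Saved under an assumed name such as healthcheck.py, the script could replace the curl command in the healthcheck test entry, for example `["CMD", "python", "healthcheck.py"]`.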
Integrating with a load balancer
# nginx.conf
upstream litefs {
    server 127.0.0.1:9090;
}
server {
    listen 80;
    server_name example.com;
    location /health {
        proxy_pass http://litefs/health;
        access_log off;
    }
    location / {
        proxy_pass http://litefs;
    }
}
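Best practices
1. Keep check functions fast
A health check function should finish within a few seconds so that the probe calling it does not time out.
def check_database():
    """Good practice: set a timeout."""
    try:
        import sqlite3
        # timeout bounds how long SQLite waits on a locked database.
        conn = sqlite3.connect('database.db', timeout=5)
        conn.execute('SELECT 1')
        conn.close()
        return True
    except Exception:
        return False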
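Not every dependency exposes its own timeout parameter. In that case, one option is to run the check in a worker thread and stop waiting after a deadline. The sketch below uses `concurrent.futures` from the standard library; `with_timeout` is a hypothetical wrapper, and the pool size of 4 is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Shared pool so each probe does not create a new thread pool.
_executor = ThreadPoolExecutor(max_workers=4)


def with_timeout(check, seconds):
    """Wrap `check` so it reports False if no result arrives within `seconds`."""
    def wrapped():
        future = _executor.submit(check)
        try:
            # Exceptions raised by `check` are re-raised here unchanged.
            return bool(future.result(timeout=seconds))
        except FutureTimeout:
            # The worker thread keeps running; only the wait is abandoned.
            return False
    return wrapped
```

A check would then be registered as, for instance, `app.add_health_check('database', with_timeout(check_database, 3))`. A check that hangs forever still occupies a worker thread, so this wrapper limits response time rather than resource use.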
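2. Keep check functions idempotent
A check function should not change any state, so calling it repeatedly against the same system state returns the same result.
def check_database():
    """Good practice: do not modify state."""
    try:
        import sqlite3
        conn = sqlite3.connect('database.db', timeout=5)
        conn.execute('SELECT 1')  # read-only operation
        conn.close()
        return True
    except Exception:
        return False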
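3. Distinguish health checks from readiness checks
Health check: verifies that the service is running normally.
Readiness check: verifies that the service is ready to handle requests.
app.add_health_check('database', check_database_connection)
app.add_ready_check('migrations', check_database_migrations)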
4. Use meaningful check names
Descriptive names make problems easier to track down.
app.add_health_check('database_primary', check_primary_db)
app.add_health_check('database_replica', check_replica_db)
app.add_health_check('cache_redis', check_redis_cache)
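Troubleshooting
Health check returns 503
Verify that the check functions are implemented correctly
Check whether a check function is raising an exception
Confirm that the dependent services are running
Look for error messages in the logs
Readiness check returns 503
Verify that the readiness check functions are implemented correctly
Confirm that initialization has finished
Confirm that the configuration loaded correctly
Look for error messages in the logs
Check times out
Optimize the check functions to reduce execution time
Set reasonable timeouts on external calls
Consider running checks asynchronously
Cache check results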
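The last suggestion, caching check results, can be sketched as a wrapper that reuses the previous result for a short time. `cached_check` is a hypothetical helper, not part of Litefs. The `clock` parameter exists so the behavior can be tested without sleeping, and the sketch is not thread-safe.

```python
import time


def cached_check(check, ttl_seconds, clock=time.monotonic):
    """Wrap `check` so its result is reused for `ttl_seconds`."""
    state = {"expires": None, "value": None}

    def wrapped():
        now = clock()
        if state["expires"] is None or now >= state["expires"]:
            # Cache miss or expired entry: run the real check.
            # If it raises, nothing is cached and the exception propagates.
            state["value"] = check()
            state["expires"] = now + ttl_seconds
        return state["value"]

    return wrapped
```

With a probe interval of a few seconds, a TTL slightly shorter than the interval keeps results fresh while shielding an expensive dependency from bursts of probes. The trade-off is that a failure may be reported up to `ttl_seconds` late.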
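Testing
The health check feature ships with unit tests:
python tests/unit/test_health_check.py
Test coverage:
✅ Default initialization
✅ Initialization with custom paths
✅ Adding a health check
✅ Adding a readiness check
✅ Requests to paths other than the health check endpoints
✅ Requests using methods other than GET
✅ All checks pass
✅ Some checks fail
✅ A check raises an exception
✅ Response when no checks are registered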
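Your own check functions can be unit tested in the same spirit. The sketch below is not taken from the project's test file. It tests the disk space check shown earlier by patching `shutil.disk_usage` with `unittest.mock`, so the outcome does not depend on the machine running the tests. `DiskSpaceCheckTest`, `Usage`, and `GIB` are names introduced for illustration.

```python
import shutil
import unittest
from collections import namedtuple
from unittest import mock

# Mirrors the (total, used, free) shape returned by shutil.disk_usage.
Usage = namedtuple("Usage", "total used free")
GIB = 1024 ** 3


def check_disk_space():
    """Check under test: require more than 1 GiB free."""
    total, used, free = shutil.disk_usage('.')
    return free > GIB


class DiskSpaceCheckTest(unittest.TestCase):
    def test_passes_with_enough_space(self):
        fake = Usage(total=10 * GIB, used=5 * GIB, free=5 * GIB)
        with mock.patch("shutil.disk_usage", return_value=fake):
            self.assertTrue(check_disk_space())

    def test_fails_when_space_is_low(self):
        fake = Usage(total=10 * GIB, used=10 * GIB - 1, free=1)
        with mock.patch("shutil.disk_usage", return_value=fake):
            self.assertFalse(check_disk_space())
```

Running `python -m unittest` against the file that contains this class executes both tests.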
Example code
A complete health check example:
#!/usr/bin/env python
# coding: utf-8
import sys
import os

sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../src'))

from litefs import Litefs
from litefs.middleware import (
    CORSMiddleware,
    LoggingMiddleware,
    SecurityMiddleware,
    HealthCheck,
)


def check_database():
    """Check the database connection."""
    try:
        import sqlite3
        conn = sqlite3.connect('database.db', timeout=5)
        conn.execute('SELECT 1')
        conn.close()
        return True
    except Exception:
        return False


def check_cache():
    """Check the cache service (placeholder that always passes)."""
    return True


def check_disk_space():
    """Check free disk space."""
    import shutil
    total, used, free = shutil.disk_usage('.')
    return free > 1024 * 1024 * 1024  # require more than 1 GiB free


def check_external_api():
    """Check the external API (placeholder that always passes)."""
    return True


def check_migrations():
    """Check database migrations (placeholder that always passes)."""
    return True


def main():
    """Start the server."""
    app = Litefs(webroot='./examples/basic/site', debug=True)

    app.add_middleware(LoggingMiddleware)
    app.add_middleware(SecurityMiddleware)
    app.add_middleware(CORSMiddleware)
    app.add_middleware(HealthCheck, path='/health', ready_path='/health/ready')

    app.add_health_check('database', check_database)
    app.add_health_check('cache', check_cache)
    app.add_health_check('disk_space', check_disk_space)
    app.add_health_check('external_api', check_external_api)
    app.add_ready_check('migrations', check_migrations)

    print("Starting Litefs server with health checks...")
    print("Health check endpoint: http://localhost:9090/health")
    print("Ready check endpoint: http://localhost:9090/health/ready")
    app.run()


if __name__ == '__main__':
    main()