IT Tutorial FG107: AI Model Deployment and Management

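1. Model Deployment Overview

AI model deployment is the process of putting a trained model to work in a real production environment. It covers packaging, deployment, monitoring, and ongoing management of the model.

Production recommendation from Fengge: test the model thoroughly before deploying it, including functional, performance, and security testing, so that it runs reliably in production.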
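
The functional and performance checks recommended above can be sketched as a small harness. This is a minimal sketch rather than a full test suite: `run_predeploy_checks`, the stand-in `stub_predict`, and the latency budget are hypothetical names and values chosen for illustration, and security testing is out of scope here.

```python
import time
from typing import Callable, List, Sequence, Tuple


def run_predeploy_checks(
    predict_fn: Callable[[Sequence[float]], int],
    cases: List[Tuple[Sequence[float], int]],
    max_latency_s: float,
) -> List[str]:
    """Return a list of failure messages; an empty list means every check passed."""
    failures = []
    for features, expected in cases:
        start = time.perf_counter()
        got = predict_fn(features)
        elapsed = time.perf_counter() - start
        # Functional check: the prediction matches the expected label
        if got != expected:
            failures.append(f"functional: {features} -> {got}, expected {expected}")
        # Performance check: a single prediction stays within the latency budget
        if elapsed > max_latency_s:
            failures.append(f"performance: {features} took {elapsed:.3f}s")
    return failures


# Hypothetical stand-in for a real model's predict call
def stub_predict(features: Sequence[float]) -> int:
    return 0 if features[2] < 2.5 else 1


cases = [([5.1, 3.5, 1.4, 0.2], 0), ([6.2, 2.9, 4.3, 1.3], 1)]
print(run_predeploy_checks(stub_predict, cases, max_latency_s=0.5))
```

In practice `predict_fn` would wrap the real model or an HTTP call to the deployed service, and the cases would come from a held-out test set.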

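2. Model Deployment Methods

Common model deployment methods include:

Local deployment: deploy the model directly on a server
Containerized deployment: deploy the model with a container technology such as Docker
Cloud platform deployment: deploy the model through a cloud provider's AI services
Edge deployment: deploy the model on edge devices

2.1 Local Deployment

Local deployment is the most direct approach and suits small-scale applications.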

# Train and save a model
$ python
>>> import joblib
>>> from sklearn.datasets import load_iris
>>> from sklearn.ensemble import RandomForestClassifier
>>>
>>> # Load the data and train the model
>>> X, y = load_iris(return_X_y=True)
>>> model = RandomForestClassifier()
>>> model.fit(X, y)
>>>
>>> # Save the model
>>> joblib.dump(model, 'iris_model.joblib')
['iris_model.joblib']
>>> exit()

# Serve the model as an API
$ pip install flask

# Create the API service file
$ vi app.py
from flask import Flask, request, jsonify
import joblib
import numpy as np

app = Flask(__name__)
# Load the model once at startup rather than on every request
model = joblib.load('iris_model.joblib')

@app.route('/predict', methods=['POST'])
def predict():
    # Expects a JSON body such as {"features": [5.1, 3.5, 1.4, 0.2]}
    data = request.get_json()
    features = np.array(data['features']).reshape(1, -1)
    prediction = model.predict(features)
    return jsonify({'prediction': int(prediction[0])})

if __name__ == '__main__':
    # Listen on all interfaces so other hosts can reach the service
    app.run(host='0.0.0.0', port=5000)

# Start the API service
$ python app.py
 * Serving Flask app 'app' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)

# Test the API
$ curl -X POST http://fgedudb:5000/predict -H "Content-Type: application/json" -d '{"features": [5.1, 3.5, 1.4, 0.2]}'
{"prediction": 0}

$ curl -X POST http://fgedudb:5000/predict -H "Content-Type: application/json" -d '{"features": [6.2, 2.9, 4.3, 1.3]}'
{"prediction": 1}
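
The handler in app.py trusts the request body as-is, so a malformed payload surfaces as a server error. A minimal validation sketch follows; `parse_features` and `N_FEATURES` are hypothetical names that do not appear in the original service, and the rules shown (exactly four finite numbers) are an assumption based on the iris examples above.

```python
import math
from typing import Any, List

N_FEATURES = 4  # assumption: the iris model takes four measurements


def parse_features(payload: Any) -> List[float]:
    """Validate a /predict request body and return the feature vector."""
    if not isinstance(payload, dict) or "features" not in payload:
        raise ValueError("body must be a JSON object with a 'features' key")
    features = payload["features"]
    if not isinstance(features, list) or len(features) != N_FEATURES:
        raise ValueError(f"'features' must be a list of {N_FEATURES} numbers")
    values = []
    for item in features:
        # bool is a subclass of int, so reject it explicitly
        if isinstance(item, bool) or not isinstance(item, (int, float)):
            raise ValueError("every feature must be a number")
        try:
            value = float(item)
        except OverflowError:
            raise ValueError("feature is too large") from None
        if not math.isfinite(value):
            raise ValueError("features must be finite")
        values.append(value)
    return values
```

One way to use it is to call `parse_features(request.get_json())` inside the handler, catch `ValueError`, and return an HTTP 400 response with the message.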

3. Containerized Deployment

Containerized deployment uses Docker and offers a consistent runtime environment, portability, and isolation.

# Create the Dockerfile
$ vi Dockerfile
FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY app.py .
COPY iris_model.joblib .

EXPOSE 5000

CMD ["python", "app.py"]

# Create requirements.txt
$ vi requirements.txt
Flask
scikit-learn
joblib
numpy

# Build the Docker image (sample output; step count and image IDs will differ on your machine)
$ docker build -t iris-model-api .
Sending build context to Docker daemon 15.36kB
Step 1/7 : FROM python:3.9-slim
 ---> a9042894e594
Step 2/7 : WORKDIR /app
 ---> Using cache
 ---> 7f8e7d9a6c7c
Step 3/7 : COPY requirements.txt .
 ---> Using cache
 ---> 8a3b2c3d4e5f
Step 4/7 : RUN pip install --no-cache-dir -r requirements.txt
 ---> Using cache
 ---> 9a8b7c6d5e4f
Step 5/7 : COPY app.py .
 ---> Using cache
 ---> 8a7b6c5d4e3f
Step 6/7 : COPY iris_model.joblib .
 ---> Using cache
 ---> 7a6b5c4d3e2f
Step 7/7 : EXPOSE 5000
 ---> Using cache
 ---> 6a5b4c3d2e1f
Successfully built 6a5b4c3d2e1f
Successfully tagged iris-model-api:latest

# Run the Docker container (docker prints the full container ID; truncated here)
$ docker run -d -p 5000:5000 --name iris-api iris-model-api
4a3b2c1d0e9f...

# Check the container status
$ docker ps
CONTAINER ID   IMAGE            COMMAND           CREATED         STATUS         PORTS                    NAMES
4a3b2c1d0e9f   iris-model-api   "python app.py"   5 seconds ago   Up 4 seconds   0.0.0.0:5000->5000/tcp   iris-api
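
A container that shows as "Up" may still need a moment before the service inside it accepts requests. A minimal readiness-polling sketch follows, using only the standard library; `wait_until_ready` and `port_answers` are hypothetical helper names, and the timeout values are arbitrary.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool], timeout_s: float = 30.0, interval_s: float = 1.0
) -> bool:
    """Poll check() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def port_answers(url: str) -> bool:
    """Return True if the server sends any HTTP response, even an error status."""
    try:
        urllib.request.urlopen(url, timeout=2)
        return True
    except urllib.error.HTTPError:
        return True  # the server responded, so it is up
    except OSError:
        return False  # connection refused, DNS failure, or timeout
```

For the container above, a call such as `wait_until_ready(lambda: port_answers("http://fgedudb:5000/"))` would return once Flask starts answering, assuming the host name used in the earlier curl examples.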

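4. Cloud Platform Deployment

Cloud platform deployment relies on a cloud provider's AI services, which simplifies the deployment process and provides elastic scaling.

4.1 AWS SageMaker Deployment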

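# Install the SageMaker SDK
$ pip install sagemaker

# Package the model as a tar.gz archive (the format SageMaker expects for model_data) and upload it to S3
$ tar -czf model.tar.gz iris_model.joblib
$ aws s3 cp model.tar.gz s3://my-model-bucket/model.tar.gz

# Create the SageMaker model
$ python
>>> import sagemaker
>>> from sagemaker.sklearn.model import SKLearnModel
>>>
>>> sagemaker_session = sagemaker.Session()
>>> # get_execution_role() works inside SageMaker; elsewhere, pass an IAM role ARN instead
>>> role = sagemaker.get_execution_role()
>>>
>>> # framework_version should match the scikit-learn version used to train the model
>>> model = SKLearnModel(
...     model_data='s3://my-model-bucket/model.tar.gz',
...     role=role,
...     entry_point='inference.py',
...     framework_version='0.23-1'
... )
>>>
>>> # Deploy the model
>>> predictor = model.deploy(
...     initial_instance_count=1,
...     instance_type='ml.t2.medium'
... )
>>>
>>> # Test the model
>>> predictor.predict([[5.1, 3.5, 1.4, 0.2]])
array([0])
>>>
>>> # Clean up resources
>>> predictor.delete_endpoint()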
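
The `inference.py` entry point used above is not shown in the original. A minimal sketch follows, assuming the SageMaker scikit-learn container's convention of calling a `model_fn(model_dir)` hook to load the model, and assuming the uploaded model artifact contains a file named `iris_model.joblib`. Request parsing, prediction, and response formatting are left to the container's default handlers.

```python
# inference.py: minimal entry point sketch for the SageMaker scikit-learn container
import os

import joblib

# Assumption: the model artifact unpacks to a file with this name
MODEL_FILE = "iris_model.joblib"


def model_fn(model_dir):
    """Load and return the model from the directory where the artifact was unpacked."""
    return joblib.load(os.path.join(model_dir, MODEL_FILE))
```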

4.2 Azure Machine Learning Deployment

# Install the Azure ML SDK
$ pip install azureml-core

# Deploy the model
$ python
>>> from azureml.core import Environment, Workspace, Model
>>> from azureml.core.webservice import AciWebservice
>>> from azureml.core.model import InferenceConfig
>>>
>>> # Load the workspace
>>> ws = Workspace.from_config()
>>>
>>> # Register the model
>>> model = Model.register(
...     workspace=ws,
...     model_path='iris_model.joblib',
...     model_name='iris-model',
...     description='Iris classification model'
... )
>>>
>>> # Define the runtime environment (assumes a conda spec file named environment.yml, which the original listing does not show)
>>> env = Environment.from_conda_specification(name='iris-env', file_path='environment.yml')
>>>
>>> # Create the inference configuration
>>> inference_config = InferenceConfig(
...     entry_script='score.py',
...     environment=env
... )
>>>
>>> # Deploy to ACI
>>> deployment_config = AciWebservice.deploy_configuration(
...     cpu_cores=1,
...     memory_gb=1
... )
>>>
>>> service = Model.deploy(
...     workspace=ws,
...     name='iris-classification-service',
...     models=[model],
...     inference_config=inference_config,
...     deployment_config=deployment_config
... )
>>>
>>> service.wait_for_deployment(show_output=True)

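Tip from Fengge: cloud deployments can use different service types depending on your needs, such as serverless computing, container services, or dedicated AI services. Choose the one that best balances performance and cost.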

5. Model Management

Model management covers model version control, model registration, and model metadata management.
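
To make "model metadata" concrete, here is a minimal sketch of a metadata record that could be stored next to each model version. `ModelRecord` and its fields are hypothetical and not tied to any particular registry; the metric value is a placeholder, not a measured result.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class ModelRecord:
    """Metadata describing one registered version of a model."""
    name: str
    version: int
    framework: str
    metrics: Dict[str, float] = field(default_factory=dict)
    stage: str = "staging"  # hypothetical lifecycle label

    def to_json(self) -> str:
        # Sorted keys keep the output stable, which helps when diffing versions
        return json.dumps(asdict(self), sort_keys=True)


# Placeholder metric value for illustration only
record = ModelRecord("iris-classifier", 1, "scikit-learn", {"accuracy": 0.0})
print(record.to_json())
```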

5.1 Model Registration and Version Control

# Use MLflow for model management
$ pip install mlflow

# Register the model
$ python
>>> import mlflow
>>> import mlflow.sklearn
>>> from sklearn.datasets import load_iris
>>> from sklearn.ensemble import RandomForestClassifier
>>>
>>> # Enable automatic logging
>>> mlflow.autolog()
>>>
>>> # Train the model inside an explicit run so that its run ID is available afterwards
>>> X, y = load_iris(return_X_y=True)
>>> with mlflow.start_run() as run:
...     model = RandomForestClassifier()
...     model.fit(X, y)
...
>>> # Register the model that autolog saved under the "model" artifact path
>>> model_uri = "runs:/{run_id}/model".format(run_id=run.info.run_id)
>>> mlflow.register_model(model_uri, "iris-classifier")

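6. Model Monitoring

Model monitoring is key to keeping a model working correctly in production. It includes performance monitoring, data drift monitoring, and model degradation monitoring.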
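
Model degradation monitoring can be sketched as a rolling accuracy check over recent predictions whose true labels have since become known. This is a minimal illustration, not a production monitor: `RollingAccuracy`, the window size, and the threshold are hypothetical.

```python
from collections import deque


class RollingAccuracy:
    """Track accuracy over the most recent `window` labeled predictions."""

    def __init__(self, window: int, threshold: float):
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, predicted, actual) -> None:
        self.outcomes.append(predicted == actual)

    def accuracy(self) -> float:
        if not self.outcomes:
            raise ValueError("no outcomes recorded yet")
        return sum(self.outcomes) / len(self.outcomes)

    def degraded(self) -> bool:
        # Judge only once the window is full, to avoid noisy early alerts
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.threshold
```

A service would call `record()` whenever ground truth arrives for an earlier prediction and raise an alert when `degraded()` returns True.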

6.1 Performance Monitoring

# Monitor model performance with Prometheus and Grafana
$ pip install prometheus-client requests

# Create the monitoring script
$ vi monitor.py
import time
import requests
from prometheus_client import start_http_server, Counter, Gauge

# Define metrics: a gauge for the latest latency, counters for running totals
REQUEST_TIME = Gauge('request_processing_seconds', 'Time spent processing request')
REQUEST_COUNT = Counter('request_count', 'Number of requests')
ERROR_COUNT = Counter('error_count', 'Number of errors')

# Expose the metrics on port 8000 for Prometheus to scrape
start_http_server(8000)

while True:
    try:
        start_time = time.time()
        response = requests.post(
            'http://fgedudb:5000/predict',
            json={'features': [5.1, 3.5, 1.4, 0.2]},
            timeout=5
        )
        REQUEST_TIME.set(time.time() - start_time)
        REQUEST_COUNT.inc()
        if response.status_code != 200:
            ERROR_COUNT.inc()
    except requests.RequestException:
        # Connection failures and timeouts count as errors
        ERROR_COUNT.inc()
    time.sleep(10)

# Run the monitoring script
$ python monitor.py

6.2 Data Drift Monitoring

# Monitor data drift with Evidently AI
# Note: the script below uses Evidently's legacy Dashboard API; newer releases replaced it, so an older evidently version may be required
$ pip install evidently

# Create the data drift monitoring script
$ vi data_drift_monitor.py
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from evidently.dashboard import Dashboard
from evidently.pipeline.column_mapping import ColumnMapping
from evidently.dashboard.tabs import DataDriftTab

# Load the reference data
iris = load_iris()
reference_data = pd.DataFrame(
    data=np.c_[iris['data'], iris['target']],
    columns=iris['feature_names'] + ['target']
)

# Simulate production data (with some drift added)
production_data = reference_data.copy()
production_data['sepal length (cm)'] *= 1.1  # introduce data drift

# Create the column mapping
column_mapping = ColumnMapping(
    target='target',
    numerical_features=iris['feature_names']
)

# Build the dashboard
dashboard = Dashboard(tabs=[DataDriftTab()])
dashboard.calculate(reference_data, production_data, column_mapping=column_mapping)
dashboard.save('data_drift_report.html')

# Run the monitoring script
$ python data_drift_monitor.py
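
To show what a drift score measures, here is a minimal pure-Python sketch of the population stability index (PSI) for a single numeric feature. PSI is a different technique from whatever statistical tests the report above runs, and is named here only as one common way to quantify drift; the function name `psi`, the bin count, and the small floor for empty bins are illustrative choices.

```python
import math
from typing import List, Sequence


def psi(reference: Sequence[float], production: Sequence[float], bins: int = 10) -> float:
    """Population stability index of `production` measured against `reference`."""
    if not reference or not production:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference values are all identical")
    width = (hi - lo) / bins

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the reference range land in the edge bins
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins
        return [max(c / len(values), 1e-6) for c in counts]

    ref_p = proportions(reference)
    prod_p = proportions(production)
    return sum((p - r) * math.log(p / r) for r, p in zip(ref_p, prod_p))
```

Identical samples score 0, and the score grows as the production distribution moves away from the reference. The alert threshold is a judgment call that depends on the feature and the application.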

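7. Model Updates and Version Control

Updating a model is an important way to maintain its performance, and it requires a sound version control and update process.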

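# Use DVC for model version control (run these commands inside a Git repository)
$ pip install dvc

# Initialize DVC
$ dvc init

# Add the model file to DVC
$ dvc add iris_model.joblib

# Commit the changes
$ git add .
$ git commit -m "Add model version 1"

# Train a new model
$ python train_new_model.py

# Update the model (dvc add records the new hash in iris_model.joblib.dvc)
$ dvc add iris_model.joblib

# Commit the changes
$ git add .
$ git commit -m "Update model to version 2"

# View the model's version history (DVC has no log command; the history lives in Git)
$ git log --oneline -- iris_model.joblib.dvc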
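
DVC identifies each version of a tracked file by a content hash recorded in its `.dvc` file, which is why committing that small file to Git is enough to pin a model version. The idea can be sketched with the standard library. This illustrates content addressing in general and is not DVC's implementation; `content_hash` is a hypothetical helper, and SHA-256 is chosen here only for illustration.

```python
import hashlib
from pathlib import Path


def content_hash(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two model files with the same bytes produce the same digest, and retraining that changes even one byte produces a different one.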

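8. Best Practices

Production recommendations from Fengge:

Use containers to keep environments consistent
Establish a sound model version control system
Implement continuous integration and continuous deployment (CI/CD)
Build a model monitoring system to detect and resolve problems promptly
Evaluate model performance regularly and update the model when necessary
Secure the model deployment, including access control and data encryption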
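
As one small instance of access control, a prediction endpoint can require an API key and compare it in constant time. This is a minimal sketch under stated assumptions: the `X-API-Key` header name and the `is_authorized` helper are hypothetical, and a real deployment would also need TLS and proper key management.

```python
import hmac
from typing import Mapping


def is_authorized(headers: Mapping[str, str], expected_key: str) -> bool:
    """Check the X-API-Key header against the expected key in constant time."""
    if not expected_key:
        # Refuse everything if no key is configured, rather than accepting empty keys
        return False
    supplied = headers.get("X-API-Key", "")
    return hmac.compare_digest(supplied.encode(), expected_key.encode())
```

A request handler could call this before running a prediction and return HTTP 401 when it yields False.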

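Tip from Fengge: model deployment is a critical stage of any AI project and calls for balancing performance, reliability, security, and maintainability.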


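Author: www.itpux.com

Compiled and published by Fengge Tutorials (风哥教程) for learning and testing purposes only. When reposting, please credit the source: http://www.fgedu.net.cn/10327.html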