Part01-Fundamental Concepts and Theory
1.1 Analyzing Application Performance Bottlenecks in K8s
Application performance bottlenecks in a K8s cluster mainly come from the following areas:
- Insufficient CPU resources
- Insufficient memory resources
- Storage I/O bottlenecks
- Network latency
- Inefficient application code
- Misconfiguration
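A quick first pass at locating which of these bottlenecks is in play can be made with the built-in kubectl views (the namespace and pod names below are placeholders; `kubectl top` requires the metrics-server addon):
$ kubectl top nodes
$ kubectl top pods -n <namespace> --sort-by=cpu
$ kubectl describe pod <pod-name> -n <namespace>
$ kubectl get events -n <namespace> --sort-by=.lastTimestamp
An `OOMKilled` entry in the pod's last state points to memory limits, while `FailedScheduling` events usually indicate CPU/memory requests that no node can satisfy.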
1.2 Application Performance Monitoring Metrics
Commonly used application performance monitoring metrics include:
- Response time
- Throughput
- Error rate
- CPU utilization
- Memory utilization
- Disk I/O
- Network traffic
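With a Prometheus stack in place (Part03 deploys kube-prometheus-stack), the first few metrics map naturally onto PromQL. The queries below are illustrative and assume the application exposes the common `http_request_duration_seconds` histogram and `http_requests_total` counter:
# 95th-percentile response time over the last 5 minutes
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
# Throughput (requests per second)
sum(rate(http_requests_total[5m]))
# Error rate (share of 5xx responses)
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))
# Per-pod CPU usage in cores (cAdvisor metric)
sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod)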
1.3 Common Application Performance Problems
Common application performance problems in large-scale K8s clusters:
- Excessive application response times
- Frequent application crashes or restarts
- Excessively high resource utilization
- Service unavailability
- Slow database queries
- Network connection timeouts
Part02-Production Environment Planning and Recommendations
2.1 Application Architecture Design
Feng Ge's recommendations for production application architecture design:
- Adopt a microservice architecture to improve system scalability
- Implement service degradation and circuit breaking to improve system reliability
- Use caching to reduce database pressure
- Implement load balancing to spread traffic
- Use asynchronous processing to improve system throughput
2.2 Resource Configuration Planning
Feng Ge's recommendations for resource configuration planning:
- Set reasonable resource requests and limits based on each application's characteristics
- Use the Horizontal Pod Autoscaler for automatic scaling
- Configure a Pod Disruption Budget to reduce application disruption
- Use node affinity to schedule applications onto suitable nodes
- Reserve sufficient resource headroom for traffic peaks
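As a minimal sketch of the autoscaling and disruption-budget recommendations (the workload name webapp and the thresholds are illustrative, not taken from a real system):
$ kubectl apply -f - << EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: webapp-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: webapp
EOF
The PDB keeps at least two replicas up during voluntary disruptions such as node drains, while the HPA scales between 3 and 10 replicas around a 50% average CPU target.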
2.3 Performance Testing Strategy
Feng Ge's recommendations for performance testing strategy:
- Draw up a detailed performance test plan
- Use professional performance testing tools such as JMeter or Gatling
- Simulate realistic user scenarios and traffic patterns
- Run both stress tests and load tests
- Analyze the results to identify performance bottlenecks
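As a concrete starting point, a containerized load test can be run directly inside the cluster, as the case studies later in this document do with Locust (the service host and locustfile path are placeholders):
$ kubectl run load-test --rm -i --tty --image=locustio/locust -- \
    locust -f /locustfile.py --headless -u 500 -r 50 --run-time 10m \
    --host=http://<service>.<namespace>.svc.cluster.local
Running the generator in-cluster keeps the test traffic on the cluster network, so the measured latencies exclude external network variability.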
Part03-Production Environment Implementation
3.1 Application Performance Analysis Tools
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install prometheus prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace
# Deploy Grafana for visualization (requires the grafana chart repository)
$ helm repo add grafana https://grafana.github.io/helm-charts
$ helm install grafana grafana/grafana --namespace monitoring --set service.type=LoadBalancer
# Deploy the Application Insights agent
$ kubectl apply -f - << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: application-insights
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: application-insights
  template:
    metadata:
      labels:
        app: application-insights
    spec:
      containers:
      - name: application-insights
        image: microsoft/applicationinsights-agent:latest
        ports:
        - containerPort: 8080
        env:
        - name: APPLICATIONINSIGHTS_CONNECTION_STRING
          value: "InstrumentationKey=your-instrumentation-key"
EOF
Output:
NAME: prometheus
LAST DEPLOYED: Wed Apr 3 12:00:00 2026
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
TEST SUITE: None
NAME: grafana
LAST DEPLOYED: Wed Apr 3 12:05:00 2026
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
TEST SUITE: None
deployment.apps/application-insights created
3.2 Application Performance Tuning
$ kubectl apply -f - << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: webapp
        image: nginx:latest
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "1000m"
            memory: "1Gi"
        ports:
        - containerPort: 80
EOF
# Configure the HPA
$ kubectl autoscale deployment webapp --cpu-percent=50 --min=3 --max=10
# Optimize the application configuration
$ kubectl apply -f - << EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: webapp-config
  namespace: default
data:
  nginx.conf: |
    events {
      worker_connections 1024;
    }
    http {
      keepalive_timeout 65;
      gzip on;
      server {
        listen 80;
        location / {
          root /usr/share/nginx/html;
          index index.html;
        }
      }
    }
EOF
# Mount the ConfigMap (the default strategic merge patch merges the containers list by name)
$ kubectl patch deployment webapp -n default -p '{"spec":{"template":{"spec":{"volumes":[{"name":"config","configMap":{"name":"webapp-config"}}],"containers":[{"name":"webapp","volumeMounts":[{"name":"config","mountPath":"/etc/nginx/nginx.conf","subPath":"nginx.conf"}]}]}}}}'
Output:
deployment.apps/webapp created
horizontalpodautoscaler.autoscaling/webapp created
configmap/webapp-config created
deployment.apps/webapp patched
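After applying the tuning, the rollout and the autoscaler's current state can be verified with:
$ kubectl rollout status deployment/webapp -n default
$ kubectl get hpa webapp -n default
The HPA output shows current versus target CPU utilization and the current replica count, which confirms the `--cpu-percent=50` target is being tracked.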
3.3 Performance Monitoring Deployment
$ helm install application-monitoring prometheus-community/prometheus --namespace monitoring --set alertmanager.enabled=true --set pushgateway.enabled=true
# Configure application performance alerts
$ kubectl apply -f - << EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: application-alerts
  namespace: monitoring
spec:
  groups:
  - name: application
    rules:
    - alert: HighResponseTime
      expr: avg_over_time(http_request_duration_seconds[5m]) > 1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High response time"
        description: "Application response time has exceeded 1 second"
    - alert: HighErrorRate
      expr: rate(http_requests_total{status=~"5.."}[5m]) / rate(http_requests_total[5m]) > 0.05
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High error rate"
        description: "Application error rate has exceeded 5%"
EOF
# Deploy the application performance dashboards
$ helm install application-dashboard grafana/grafana --namespace monitoring --set service.type=LoadBalancer --set dashboardProviders.file.dashboards.application-dashboard.enabled=true
Output:
NAME: application-monitoring
LAST DEPLOYED: Wed Apr 3 12:30:00 2026
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
TEST SUITE: None
prometheusrule.monitoring.coreos.com/application-alerts created
NAME: application-dashboard
LAST DEPLOYED: Wed Apr 3 12:35:00 2026
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
TEST SUITE: None
Part04-Production Cases and Hands-On Walkthroughs
4.1 Web Application Performance Optimization Case
A web application performance optimization case from an e-commerce platform:
# 1. Analyze resource usage and application errors
$ kubectl top pods -n e-commerce
$ kubectl logs -n e-commerce deployment/webapp | grep -i error
# 2. Optimize the application configuration
$ kubectl apply -f - << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: e-commerce
spec:
  replicas: 5
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: webapp
        image: nginx:latest
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"
          limits:
            cpu: "2"
            memory: "2Gi"
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 20
EOF
# 3. Configure caching
$ kubectl apply -f - << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: e-commerce
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:latest
        ports:
        - containerPort: 6379
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "2Gi"
---
apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: e-commerce
spec:
  selector:
    app: redis
  ports:
  - port: 6379
    targetPort: 6379
EOF
# 4. Test application performance
$ kubectl run load-test --rm -i --tty --image=locustio/locust -- locust -f /locustfile.py --host=http://webapp.e-commerce.svc.cluster.local
Output:
webapp-567890abc-12345 500m 512Mi
webapp-567890abc-67890 450m 480Mi
webapp-567890abc-abcde 520m 530Mi
2026-04-03 12:40:00.000 [INFO] Starting load test
2026-04-03 12:40:05.000 [INFO] Users: 100, RPS: 1000, Average response time: 50ms
2026-04-03 12:40:10.000 [INFO] Users: 500, RPS: 5000, Average response time: 80ms
2026-04-03 12:40:15.000 [INFO] Users: 1000, RPS: 10000, Average response time: 120ms
4.2 Database Application Performance Optimization Case
A database performance optimization case from a financial system:
# 1. Analyze resource usage and slow queries
$ kubectl top pods -n finance
$ kubectl exec -n finance deployment/mysql -- mysql -u root -p -e "SHOW GLOBAL STATUS LIKE 'Slow_queries';"
# 2. Optimize the database configuration
$ kubectl apply -f - << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
  namespace: finance
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "password"
        - name: MYSQL_DATABASE
          value: "fgedudb"
        resources:
          requests:
            cpu: "2"
            memory: "4Gi"
          limits:
            cpu: "4"
            memory: "8Gi"
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: mysql-config
          mountPath: /etc/mysql/conf.d
        - name: mysql-data
          mountPath: /var/lib/mysql
      volumes:
      - name: mysql-config
        configMap:
          name: mysql-config
      - name: mysql-data
        persistentVolumeClaim:
          claimName: mysql-pvc
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-config
  namespace: finance
data:
  my.cnf: |
    [mysqld]
    innodb_buffer_pool_size = 4G
    innodb_log_file_size = 1G
    innodb_flush_log_at_trx_commit = 2
    innodb_file_per_table = 1
    max_connections = 1000
    # query_cache_type/query_cache_size were removed in MySQL 8.0;
    # setting them would prevent mysqld from starting
    slow_query_log = 1
    slow_query_log_file = /var/lib/mysql/slow-query.log
    long_query_time = 1
EOF
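Once the Deployment rolls out, it is worth confirming that the tuned settings actually took effect inside the pod (using the root credentials defined above):
$ kubectl exec -n finance deployment/mysql -- mysql -u root -p -e "SHOW VARIABLES LIKE 'innodb_buffer_pool_size';"
$ kubectl exec -n finance deployment/mysql -- mysql -u root -p -e "SHOW VARIABLES LIKE 'slow_query_log%';"
A typo in my.cnf can silently leave defaults in place or block startup, so this check catches configuration mistakes before any performance testing.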
# 3. Create database indexes
$ kubectl exec -n finance deployment/mysql -- mysql -u root -p -e "USE fgedudb; CREATE INDEX idx_user_id ON fgedu_users(user_id);"
# 4. Test database performance
$ kubectl run mysql-bench --rm -i --tty --image=percona/percona-toolkit -- pt-query-digest /var/lib/mysql/slow-query.log
Output:
mysql-567890abc-12345 1500m 3500Mi
+-------------------+-----------+------+------+--------+--------+----------+----------+---------+----------+
| Host              | # Queries | QPS  | Slow | % Slow | Errors | Warnings | Bytes    | Bytes   | Duration |
+-------------------+-----------+------+------+--------+--------+----------+----------+---------+----------+
| 10.244.1.10:54321 | 10000     | 1000 | 10   | 0.10%  | 0      | 0        | 10.00 MB | 1.00 KB | 10.00 s  |
+-------------------+-----------+------+------+--------+--------+----------+----------+---------+----------+
Query ID: 1234567890
Time range: 2026-04-03 12:50:00 to 2026-04-03 12:55:00
Unique SQL: SELECT * FROM fgedu_users WHERE user_id = ?
Count: 1000, Exec time: 1.00 s, Lock time: 0.00 s, Rows sent: 1, Rows examine: 1000
# Index created successfully
Query OK, 0 rows affected (0.10 sec)
# Performance test results
# Average query time: 0.10ms
# 95th percentile: 0.50ms
# 99th percentile: 1.00ms
4.3 Microservice Application Performance Optimization Case
A microservice performance optimization case from a technology company:
# 1. Analyze resource usage and gateway errors
$ kubectl top pods -n microservices
$ kubectl logs -n microservices deployment/api-gateway | grep -i error
# 2. Optimize the microservice configuration
$ kubectl apply -f - << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-gateway
  namespace: microservices
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-gateway
  template:
    metadata:
      labels:
        app: api-gateway
    spec:
      containers:
      - name: api-gateway
        image: nginx:latest
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"
          limits:
            cpu: "2"
            memory: "2Gi"
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 20
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
  namespace: microservices
spec:
  replicas: 2
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
      - name: user-service
        image: node:latest
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "1"
            memory: "1Gi"
        ports:
        - containerPort: 3000
EOF
# 3. Configure the service mesh
$ helm repo add istio https://istio-release.storage.googleapis.com/charts
$ helm install istio-base istio/base --namespace istio-system --create-namespace
$ helm install istiod istio/istiod --namespace istio-system --set meshConfig.accessLogFile=/dev/stdout
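Note that installing istiod alone does not put workloads on the mesh; sidecar injection has to be enabled for the namespace, and existing pods must be restarted to pick up the proxy:
$ kubectl label namespace microservices istio-injection=enabled
$ kubectl rollout restart deployment -n microservices
After the restart, each pod should show an extra istio-proxy container in `kubectl get pods -n microservices` (READY 2/2).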
# 4. Test microservice performance
$ kubectl run load-test --rm -i --tty --image=locustio/locust -- locust -f /locustfile.py --host=http://api-gateway.microservices.svc.cluster.local
Output:
api-gateway-567890abc-12345 800m 900Mi
api-gateway-567890abc-67890 750m 850Mi
api-gateway-567890abc-abcde 820m 920Mi
user-service-567890abc-12345 450m 480Mi
user-service-567890abc-67890 420m 450Mi
NAME: istio-base
LAST DEPLOYED: Wed Apr 3 13:00:00 2026
NAMESPACE: istio-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NAME: istiod
LAST DEPLOYED: Wed Apr 3 13:05:00 2026
NAMESPACE: istio-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
2026-04-03 13:10:00.000 [INFO] Starting load test
2026-04-03 13:10:05.000 [INFO] Users: 100, RPS: 500, Average response time: 30ms
2026-04-03 13:10:10.000 [INFO] Users: 500, RPS: 2500, Average response time: 50ms
2026-04-03 13:10:15.000 [INFO] Users: 1000, RPS: 5000, Average response time: 80ms
Part05-Feng Ge's Experience Summary and Sharing
Drawing on hands-on experience optimizing application performance in large-scale K8s clusters, Feng Ge summarizes the following key recommendations:
- Performance analysis comes first: use professional analysis tools to locate the bottlenecks, then make targeted optimizations.
- Configure resources sensibly: set resource requests and limits that match each application's profile, avoiding both starvation and waste.
- Optimize the application architecture: adopt microservices and implement service degradation and circuit breaking to improve system reliability.
- Use caching wisely: reduce database pressure and improve application response times with a sound caching strategy.
- Build a monitoring system: establish comprehensive application performance monitoring to detect and resolve problems promptly.
- Autoscale: use the Horizontal Pod Autoscaler to handle traffic fluctuations automatically.
- Tune the database: configure database parameters appropriately, create suitable indexes, and optimize SQL queries.
- Test performance regularly: run periodic performance tests to catch regressions early and keep improving application performance.
This article was compiled and published by Feng Ge Tutorials for learning and testing purposes only. Please credit the source when reposting: http://www.fgedu.net.cn/10327.html
