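Content Outline
1. Cloud Resource Optimization Overview
Cloud resource optimization is the process of configuring and managing cloud resources so that utilization rises and cost falls while service quality is maintained. As enterprises move more workloads to the cloud, resource optimization has become a core part of cloud management.
The goals of cloud resource optimization include:
- Improving resource utilization
- Reducing cloud service costs
- Improving system performance
- Ensuring service reliability
- Enabling elastic scaling of resources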
2. Compute Resource Optimization
2.1 Choosing an Instance Type
# General purpose instance: suits most workloads
$ aws ec2 run-instances \
    --image-id ami-0c55b159cbfafe1f0 \
    --instance-type t3.medium \
    --count 1 \
    --key-name my-key-pair \
    --security-group-ids sg-12345678 \
    --subnet-id subnet-12345678
# Compute optimized instance: suits CPU-intensive workloads
$ aws ec2 run-instances \
    --image-id ami-0c55b159cbfafe1f0 \
    --instance-type c5.large \
    --count 1 \
    --key-name my-key-pair \
    --security-group-ids sg-12345678 \
    --subnet-id subnet-12345678
# Memory optimized instance: suits memory-intensive workloads
$ aws ec2 run-instances \
    --image-id ami-0c55b159cbfafe1f0 \
    --instance-type r5.large \
    --count 1 \
    --key-name my-key-pair \
    --security-group-ids sg-12345678 \
    --subnet-id subnet-12345678
# Storage optimized instance: suits storage-intensive workloads
$ aws ec2 run-instances \
    --image-id ami-0c55b159cbfafe1f0 \
    --instance-type i3.large \
    --count 1 \
    --key-name my-key-pair \
    --security-group-ids sg-12345678 \
    --subnet-id subnet-12345678
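The mapping from workload type to instance family used in the commands above can be written down as a small decision function. This is only a sketch: the function name, its parameters, and the 70% cut-offs are assumptions made for illustration, not AWS guidance, and a real sizing decision should be based on measured metrics.

```python
def suggest_instance_family(cpu_util: float, mem_util: float,
                            storage_heavy: bool = False) -> str:
    """Return an illustrative EC2 family for a workload profile.

    cpu_util and mem_util are average utilization percentages (0-100).
    The 70% thresholds are assumptions for this sketch.
    """
    if storage_heavy:
        return "i3"  # storage optimized
    if cpu_util >= 70 and cpu_util > mem_util:
        return "c5"  # compute optimized
    if mem_util >= 70:
        return "r5"  # memory optimized
    return "t3"      # general purpose


if __name__ == "__main__":
    for profile in [(85, 40, False), (30, 90, False), (20, 20, False), (20, 20, True)]:
        print(profile, "->", suggest_instance_family(*profile))
```

The returned family names match the instance types launched in this section (t3, c5, r5, i3); the instance size within a family still has to be chosen separately.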
2.2 Auto Scaling
# Auto Scaling group definition
$ cat auto-scaling.json
{
  "AutoScalingGroupName": "my-auto-scaling-group",
  "MinSize": 1,
  "MaxSize": 5,
  "DesiredCapacity": 2,
  "LaunchConfigurationName": "my-launch-config",
  "VPCZoneIdentifier": "subnet-12345678,subnet-87654321",
  "Tags": [
    {
      "ResourceId": "my-auto-scaling-group",
      "ResourceType": "auto-scaling-group",
      "Key": "Name",
      "Value": "my-auto-scaling-group",
      "PropagateAtLaunch": true
    }
  ]
}
# Create the Auto Scaling group
$ aws autoscaling create-auto-scaling-group --cli-input-json file://auto-scaling.json
# Create the scaling policies
$ aws autoscaling put-scaling-policy \
    --auto-scaling-group-name my-auto-scaling-group \
    --policy-name ScaleUpPolicy \
    --scaling-adjustment 1 \
    --adjustment-type ChangeInCapacity \
    --cooldown 300
$ aws autoscaling put-scaling-policy \
    --auto-scaling-group-name my-auto-scaling-group \
    --policy-name ScaleDownPolicy \
    --scaling-adjustment -1 \
    --adjustment-type ChangeInCapacity \
    --cooldown 300
# Create the CloudWatch alarms that trigger the scaling policies
$ aws cloudwatch put-metric-alarm \
    --alarm-name HighCPU \
    --alarm-description "Alarm when CPU exceeds 70%" \
    --metric-name CPUUtilization \
    --namespace AWS/EC2 \
    --statistic Average \
    --period 300 \
    --threshold 70 \
    --comparison-operator GreaterThanThreshold \
    --dimensions Name=AutoScalingGroupName,Value=my-auto-scaling-group \
    --evaluation-periods 2 \
    --alarm-actions arn:aws:autoscaling:us-west-2:123456789012:scalingPolicy:12345678-1234-1234-1234-123456789012:autoScalingGroupName/my-auto-scaling-group:policyName/ScaleUpPolicy
$ aws cloudwatch put-metric-alarm \
    --alarm-name LowCPU \
    --alarm-description "Alarm when CPU is below 30%" \
    --metric-name CPUUtilization \
    --namespace AWS/EC2 \
    --statistic Average \
    --period 300 \
    --threshold 30 \
    --comparison-operator LessThanThreshold \
    --dimensions Name=AutoScalingGroupName,Value=my-auto-scaling-group \
    --evaluation-periods 2 \
    --alarm-actions arn:aws:autoscaling:us-west-2:123456789012:scalingPolicy:12345678-1234-1234-1234-123456789012:autoScalingGroupName/my-auto-scaling-group:policyName/ScaleDownPolicy
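Tip from Fengge: choosing the right instance type and configuring auto scaling are the keys to compute optimization. Adjust both to the workload's characteristics and usage patterns.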
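To see how the scaling policies and alarms in section 2.2 interact, it can help to replay a series of CPU readings offline. The sketch below is a simplified model, not a reproduction of the Auto Scaling service: it assumes one CPU average per 300-second alarm period, treats the cooldown as already elapsed between periods, and resets the breach counter after each scaling action. The function name and its parameters are illustrative.

```python
def simulate_scaling(cpu_samples, desired=2, min_size=1, max_size=5,
                     high=70.0, low=30.0, evaluation_periods=2):
    """Replay per-period CPU averages and return the capacity after each period.

    Defaults mirror the values used in section 2.2 (MinSize 1, MaxSize 5,
    DesiredCapacity 2, thresholds 70/30, two evaluation periods).
    """
    history = []
    high_streak = low_streak = 0
    for cpu in cpu_samples:
        # Count consecutive periods that breach each threshold.
        high_streak = high_streak + 1 if cpu > high else 0
        low_streak = low_streak + 1 if cpu < low else 0
        if high_streak >= evaluation_periods:
            desired = min(max_size, desired + 1)   # ScaleUpPolicy: +1
            high_streak = 0
        elif low_streak >= evaluation_periods:
            desired = max(min_size, desired - 1)   # ScaleDownPolicy: -1
            low_streak = 0
        history.append(desired)
    return history


if __name__ == "__main__":
    print(simulate_scaling([80, 85, 90, 20, 10, 10, 10]))
```

Replaying recorded CPU data this way is a cheap check on whether the thresholds would cause the group to oscillate before the alarms are deployed.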
3. Storage Resource Optimization
3.1 Choosing a Storage Class
# Standard: frequently accessed data
$ aws s3 cp file.txt s3://my-bucket/ --storage-class STANDARD
# Standard-IA: infrequently accessed data
$ aws s3 cp file.txt s3://my-bucket/ --storage-class STANDARD_IA
# One Zone-IA: infrequently accessed data stored in a single Availability Zone
$ aws s3 cp file.txt s3://my-bucket/ --storage-class ONEZONE_IA
# Glacier: archival storage
$ aws s3 cp file.txt s3://my-bucket/ --storage-class GLACIER
# Glacier Deep Archive: long-term archival storage
$ aws s3 cp file.txt s3://my-bucket/ --storage-class DEEP_ARCHIVE
# Intelligent-Tiering: automatic tiering by access pattern
$ aws s3 cp file.txt s3://my-bucket/ --storage-class INTELLIGENT_TIERING
3.2 Storage Lifecycle Management
# Lifecycle rules: move logs to Standard-IA, archives to Glacier, and expire temp files
$ cat lifecycle.json
{
  "Rules": [
    {
      "ID": "Transition to IA",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" }
      ]
    },
    {
      "ID": "Transition to Glacier",
      "Status": "Enabled",
      "Filter": { "Prefix": "archives/" },
      "Transitions": [
        { "Days": 90, "StorageClass": "GLACIER" }
      ]
    },
    {
      "ID": "Expire objects",
      "Status": "Enabled",
      "Filter": { "Prefix": "temp/" },
      "Expiration": { "Days": 7 }
    }
  ]
}
# Apply the lifecycle configuration
$ aws s3api put-bucket-lifecycle-configuration \
    --bucket my-bucket \
    --lifecycle-configuration file://lifecycle.json
# View the lifecycle configuration
$ aws s3api get-bucket-lifecycle-configuration --bucket my-bucket
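Before applying lifecycle rules to a bucket, it can be useful to check which rule would apply to a given object. The sketch below mirrors the three rules from lifecycle.json (prefix plus age in days) in a hypothetical `RULES` list. It is a simplified model: S3 evaluates lifecycle rules asynchronously, so the real transition or expiration can happen later than the day the object first qualifies.

```python
# Hypothetical in-memory copy of the three rules in lifecycle.json.
RULES = [
    {"prefix": "logs/", "days": 30, "action": "STANDARD_IA"},
    {"prefix": "archives/", "days": 90, "action": "GLACIER"},
    {"prefix": "temp/", "days": 7, "action": "EXPIRE"},
]


def lifecycle_action(key: str, age_days: int, rules=RULES):
    """Return the action of the first rule whose prefix and age both match, else None."""
    for rule in rules:
        if key.startswith(rule["prefix"]) and age_days >= rule["days"]:
            return rule["action"]
    return None


if __name__ == "__main__":
    for key, age in [("logs/app.log", 45), ("logs/app.log", 10),
                     ("temp/upload.tmp", 7), ("images/a.png", 365)]:
        print(key, age, "->", lifecycle_action(key, age))
```

Running a sample of real object keys and ages through a check like this shows how much data each rule would touch before the configuration goes live.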
3.3 Storage Optimization Tools
# Enable S3 Storage Lens
$ aws s3control put-storage-lens-configuration \
    --config-id my-storage-lens \
    --account-id 123456789012 \
    --storage-lens-configuration '{"Id":"my-storage-lens","AccountLevel":{"ActivityMetrics":{"IsEnabled":true},"BucketLevel":{"ActivityMetrics":{"IsEnabled":true}}},"IsEnabled":true}'
# View the storage analytics report
# In the AWS console: S3 -> Storage Lens
# Monitor storage usage with CloudWatch
$ aws cloudwatch put-metric-alarm \
    --alarm-name S3BucketSize \
    --alarm-description "Alarm when S3 bucket size exceeds 1TB" \
    --metric-name BucketSizeBytes \
    --namespace AWS/S3 \
    --statistic Average \
    --period 86400 \
    --threshold 1099511627776 \
    --comparison-operator GreaterThanThreshold \
    --dimensions Name=BucketName,Value=my-bucket Name=StorageType,Value=StandardStorage \
    --evaluation-periods 1 \
    --alarm-actions arn:aws:sns:us-west-2:123456789012:MyTopic
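The alarm threshold 1099511627776 is 1 TiB expressed in bytes (1024 to the fourth power), because `BucketSizeBytes` is reported in bytes. A small helper makes such thresholds easier to derive and review; the `to_bytes` name and the unit table are illustrative choices, not part of any AWS tool.

```python
# Binary unit multipliers; BucketSizeBytes thresholds must be given in bytes.
UNITS = {"KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3, "TiB": 1024 ** 4}


def to_bytes(value: float, unit: str) -> int:
    """Convert a size such as (1, "TiB") into a byte count."""
    return int(value * UNITS[unit])


if __name__ == "__main__":
    print(to_bytes(1, "TiB"))    # threshold used in the alarm above
    print(to_bytes(500, "GiB"))  # an alternative, smaller threshold
```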
4. Network Resource Optimization
4.1 VPC Design
# Create the VPC
$ aws ec2 create-vpc \
    --cidr-block 10.0.0.0/16 \
    --tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=my-vpc}]'
# Create the subnets
$ aws ec2 create-subnet \
    --vpc-id vpc-12345678 \
    --cidr-block 10.0.1.0/24 \
    --availability-zone us-west-2a \
    --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=public-subnet-1}]'
$ aws ec2 create-subnet \
    --vpc-id vpc-12345678 \
    --cidr-block 10.0.2.0/24 \
    --availability-zone us-west-2b \
    --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=private-subnet-1}]'
# Create and attach the internet gateway
$ aws ec2 create-internet-gateway \
    --tag-specifications 'ResourceType=internet-gateway,Tags=[{Key=Name,Value=my-igw}]'
$ aws ec2 attach-internet-gateway \
    --vpc-id vpc-12345678 \
    --internet-gateway-id igw-12345678
# Create the route table, add a default route, and associate it with the public subnet
$ aws ec2 create-route-table \
    --vpc-id vpc-12345678 \
    --tag-specifications 'ResourceType=route-table,Tags=[{Key=Name,Value=public-route-table}]'
$ aws ec2 create-route \
    --route-table-id rtb-12345678 \
    --destination-cidr-block 0.0.0.0/0 \
    --gateway-id igw-12345678
$ aws ec2 associate-route-table \
    --route-table-id rtb-12345678 \
    --subnet-id subnet-12345678
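Subnet CIDR blocks must fall inside the VPC range and must not overlap each other. The following sketch checks a planned layout with Python's standard `ipaddress` module before any `create-subnet` call is made; the `check_subnets` function is a hypothetical helper written for this example.

```python
import ipaddress


def check_subnets(vpc_cidr, subnet_cidrs):
    """Return a list of problems found in a planned subnet layout (empty if none)."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    problems = []
    # Every subnet must be contained in the VPC range.
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            problems.append(f"{subnet} is outside {vpc}")
    # No two subnets may share addresses.
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                problems.append(f"{first} overlaps {second}")
    return problems


if __name__ == "__main__":
    # The layout used in section 4.1.
    print(check_subnets("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24"]))
    # A deliberately broken layout.
    print(check_subnets("10.0.0.0/16", ["10.0.1.0/24", "10.0.1.128/25", "10.1.0.0/24"]))
```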
4.2 Network Performance
# Create an elastic network interface
$ aws ec2 create-network-interface \
    --subnet-id subnet-12345678 \
    --description "My network interface" \
    --groups sg-12345678 \
    --tag-specifications 'ResourceType=network-interface,Tags=[{Key=Name,Value=my-eni}]'
# Attach the elastic network interface
$ aws ec2 attach-network-interface \
    --network-interface-id eni-12345678 \
    --instance-id i-12345678 \
    --device-index 1
# Enable enhanced networking (ENA); the instance must be stopped first
$ aws ec2 modify-instance-attribute \
    --instance-id i-12345678 \
    --ena-support
# Use a VPC endpoint
$ aws ec2 create-vpc-endpoint \
    --vpc-id vpc-12345678 \
    --service-name com.amazonaws.us-west-2.s3 \
    --vpc-endpoint-type Gateway \
    --route-table-ids rtb-12345678
# Configure traffic mirroring
$ aws ec2 create-traffic-mirror-filter \
    --description "My traffic mirror filter" \
    --tag-specifications 'ResourceType=traffic-mirror-filter,Tags=[{Key=Name,Value=my-filter}]'
$ aws ec2 create-traffic-mirror-target \
    --network-interface-id eni-12345678 \
    --description "My traffic mirror target" \
    --tag-specifications 'ResourceType=traffic-mirror-target,Tags=[{Key=Name,Value=my-target}]'
$ aws ec2 create-traffic-mirror-session \
    --network-interface-id eni-87654321 \
    --traffic-mirror-target-id tmt-12345678 \
    --traffic-mirror-filter-id tmf-12345678 \
    --session-number 1 \
    --tag-specifications 'ResourceType=traffic-mirror-session,Tags=[{Key=Name,Value=my-session}]'
5. Database Resource Optimization
5.1 Database Instance Optimization
# Create a Multi-AZ MySQL instance (with --multi-az, RDS chooses the Availability Zones)
$ aws rds create-db-instance \
    --db-instance-identifier mydb \
    --allocated-storage 20 \
    --db-instance-class db.t3.small \
    --engine mysql \
    --master-username admin \
    --master-user-password password \
    --vpc-security-group-ids sg-12345678 \
    --backup-retention-period 7 \
    --multi-az
# Enable automatic minor version upgrades
$ aws rds modify-db-instance \
    --db-instance-identifier mydb \
    --auto-minor-version-upgrade \
    --apply-immediately
# Monitor database performance
$ aws cloudwatch put-metric-alarm \
    --alarm-name RDSCPUUtilization \
    --alarm-description "Alarm when RDS CPU exceeds 70%" \
    --metric-name CPUUtilization \
    --namespace AWS/RDS \
    --statistic Average \
    --period 300 \
    --threshold 70 \
    --comparison-operator GreaterThanThreshold \
    --dimensions Name=DBInstanceIdentifier,Value=mydb \
    --evaluation-periods 2 \
    --alarm-actions arn:aws:sns:us-west-2:123456789012:MyTopic
5.2 Database Caching
# Create a Redis cache cluster
$ aws elasticache create-cache-cluster \
    --cache-cluster-id my-cache \
    --engine redis \
    --cache-node-type cache.t3.medium \
    --num-cache-nodes 1 \
    --security-group-ids sg-12345678 \
    --preferred-availability-zone us-west-2a
# Configure Redis parameters
$ aws elasticache modify-cache-cluster \
    --cache-cluster-id my-cache \
    --cache-parameter-group-name default.redis6.x
# Monitor cache performance
$ aws cloudwatch put-metric-alarm \
    --alarm-name ElastiCacheCPUUtilization \
    --alarm-description "Alarm when ElastiCache CPU exceeds 80%" \
    --metric-name CPUUtilization \
    --namespace AWS/ElastiCache \
    --statistic Average \
    --period 300 \
    --threshold 80 \
    --comparison-operator GreaterThanThreshold \
    --dimensions Name=CacheClusterId,Value=my-cache \
    --evaluation-periods 2 \
    --alarm-actions arn:aws:sns:us-west-2:123456789012:MyTopic
5.3 Database Backup Optimization
# Set the automated backup retention period
$ aws rds modify-db-instance \
    --db-instance-identifier mydb \
    --backup-retention-period 7 \
    --apply-immediately
# Create a manual snapshot
$ aws rds create-db-snapshot \
    --db-instance-identifier mydb \
    --db-snapshot-identifier mydb-snapshot
# Copy the snapshot to another Region
$ aws rds copy-db-snapshot \
    --source-db-snapshot-identifier arn:aws:rds:us-west-2:123456789012:snapshot:mydb-snapshot \
    --target-db-snapshot-identifier mydb-snapshot-us-east-1 \
    --source-region us-west-2 \
    --region us-east-1
# Clean up old snapshots
$ aws rds delete-db-snapshot \
    --db-snapshot-identifier mydb-snapshot
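Cleaning up old snapshots first requires deciding which ones are old. The sketch below picks manual snapshots older than a retention window from a list shaped like the `DBSnapshots` entries that `describe-db-snapshots` returns (`DBSnapshotIdentifier`, `SnapshotType`, `SnapshotCreateTime`). The `snapshots_to_delete` helper, the sample data, and the 30-day window are assumptions for illustration; the function only selects identifiers and deletes nothing.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_delete(snapshots, retention_days, now=None):
    """Return identifiers of manual snapshots created before the retention cutoff."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [
        snap["DBSnapshotIdentifier"]
        for snap in snapshots
        # Automated snapshots are governed by --backup-retention-period, so skip them.
        if snap["SnapshotType"] == "manual" and snap["SnapshotCreateTime"] < cutoff
    ]


if __name__ == "__main__":
    now = datetime(2026, 4, 3, tzinfo=timezone.utc)
    sample = [
        {"DBSnapshotIdentifier": "mydb-snapshot", "SnapshotType": "manual",
         "SnapshotCreateTime": now - timedelta(days=45)},
        {"DBSnapshotIdentifier": "mydb-recent", "SnapshotType": "manual",
         "SnapshotCreateTime": now - timedelta(days=3)},
        {"DBSnapshotIdentifier": "rds:mydb-auto", "SnapshotType": "automated",
         "SnapshotCreateTime": now - timedelta(days=45)},
    ]
    print(snapshots_to_delete(sample, retention_days=30, now=now))
```

Each returned identifier could then be passed to `aws rds delete-db-snapshot` after review.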
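6. Cost Optimization
6.1 Reserved Instances
# Find a one-year Standard Reserved Instance offering for t3.medium
$ aws ec2 describe-reserved-instances-offerings \
    --instance-type t3.medium \
    --availability-zone us-west-2a \
    --offering-class standard \
    --min-duration 31536000 \
    --max-duration 31536000
# Purchase the EC2 Reserved Instance (the offering ID is a placeholder; --dry-run only checks permissions)
$ aws ec2 purchase-reserved-instances-offering \
    --reserved-instances-offering-id 12345678-1234-1234-1234-123456789012 \
    --instance-count 1 \
    --dry-run
# Purchase an RDS reserved instance
$ aws rds purchase-reserved-db-instances-offering \
    --reserved-db-instances-offering-id 12345678-1234-1234-1234-123456789012 \
    --db-instance-count 1 \
    --reserved-db-instance-id my-reserved-instance
# View reserved instances
$ aws ec2 describe-reserved-instances
$ aws rds describe-reserved-db-instances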
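Whether a reservation pays off depends on how many hours the instance actually runs. The sketch below computes the break-even point between on-demand and reserved pricing. All prices in the example are made-up numbers chosen for easy arithmetic, not real AWS rates, and the function name is illustrative.

```python
def reserved_break_even_hours(on_demand_hourly, reserved_upfront, reserved_hourly):
    """Return the running hours after which the reservation becomes cheaper.

    Returns None when the reserved hourly rate is not lower than on-demand,
    in which case the upfront payment is never recovered.
    """
    saving_per_hour = on_demand_hourly - reserved_hourly
    if saving_per_hour <= 0:
        return None
    return reserved_upfront / saving_per_hour


if __name__ == "__main__":
    # Hypothetical prices: 0.50 USD/h on demand, 1000 USD upfront plus 0.25 USD/h reserved.
    hours = reserved_break_even_hours(0.50, 1000.0, 0.25)
    print(f"Break-even after {hours:.0f} hours")
    print(f"That is {hours / 8760:.0%} of a one-year term")
```

If the expected running time over the term is below the break-even point, on-demand or Spot capacity is likely the cheaper choice.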
6.2 Cost Allocation Tags
# Activate a tag key as a cost allocation tag
$ aws ce update-cost-allocation-tags-status \
    --cost-allocation-tags-status TagKey=Environment,Status=Active
# Tag resources
$ aws ec2 create-tags \
    --resources i-12345678 \
    --tags Key=Environment,Value=Production Key=Department,Value=Engineering
$ aws s3api put-bucket-tagging \
    --bucket my-bucket \
    --tagging 'TagSet=[{Key=Environment,Value=Production},{Key=Department,Value=Engineering}]'
# View the cost report
# In the AWS console: Billing and Cost Management -> Cost Explorer
6.3 Cost Analysis Tools
# Use Cost Explorer (the End date is exclusive, so 2026-04-01 covers all of March)
$ aws ce get-cost-and-usage \
    --time-period Start=2026-03-01,End=2026-04-01 \
    --granularity MONTHLY \
    --metrics BlendedCost \
    --group-by Type=DIMENSION,Key=SERVICE
# Use Budgets
$ aws budgets create-budget \
    --account-id 123456789012 \
    --budget '{"BudgetName":"MonthlyBudget","BudgetType":"COST","TimeUnit":"MONTHLY","BudgetLimit":{"Amount":"1000","Unit":"USD"}}' \
    --notifications-with-subscribers '[{"Notification":{"NotificationType":"ACTUAL","ComparisonOperator":"GREATER_THAN","Threshold":80,"ThresholdType":"PERCENTAGE"},"Subscribers":[{"SubscriptionType":"EMAIL","Address":"user@fgedu.net.cn"}]}]'
# View budgets
$ aws budgets describe-budgets --account-id 123456789012
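The budget notification above fires when actual spend is greater than 80% of the 1000 USD limit, that is, above 800 USD. The comparison can be sketched as follows; the function is a hypothetical illustration of the rule, not the Budgets service logic itself.

```python
def budget_alert(actual_spend, budget_limit=1000.0, threshold_pct=80.0):
    """Return True when spend exceeds the percentage threshold (GREATER_THAN)."""
    return actual_spend > budget_limit * threshold_pct / 100.0


if __name__ == "__main__":
    for spend in (400.0, 800.0, 850.0):
        print(spend, "->", "ALERT" if budget_alert(spend) else "ok")
```

Because the comparison operator is GREATER_THAN, spend of exactly 800 USD does not trigger the notification in this model.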
7. Monitoring and Analysis
7.1 CloudWatch Monitoring
# Create a dashboard
$ aws cloudwatch put-dashboard \
    --dashboard-name MyDashboard \
    --dashboard-body '{"widgets":[{"type":"metric","x":0,"y":0,"width":12,"height":6,"properties":{"metrics":[["AWS/EC2","CPUUtilization","InstanceId","i-12345678"]],"period":300,"stat":"Average","region":"us-west-2","title":"EC2 CPU Utilization"}},{"type":"metric","x":0,"y":6,"width":12,"height":6,"properties":{"metrics":[["AWS/S3","BucketSizeBytes","BucketName","my-bucket","StorageType","StandardStorage"]],"period":86400,"stat":"Average","region":"us-west-2","title":"S3 Bucket Size"}}]}'
# View the dashboard
# In the AWS console: CloudWatch -> Dashboards
# Enable detailed monitoring
$ aws ec2 monitor-instances --instance-ids i-12345678
# View instance metrics
$ aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value=i-12345678 \
    --start-time 2026-04-03T00:00:00Z \
    --end-time 2026-04-03T12:00:00Z \
    --period 3600 \
    --statistics Average
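The JSON returned by `get-metric-statistics` contains a `Datapoints` list whose entries carry a `Timestamp` and the requested statistic (here `Average`). The sketch below summarizes such a response to find the mean and the peak hour. The sample values are invented for illustration, and `summarize_cpu` is a hypothetical helper.

```python
def summarize_cpu(response):
    """Return mean, peak, and peak time from a get-metric-statistics response."""
    # Datapoints are not guaranteed to arrive in time order, so sort them first.
    points = sorted(response.get("Datapoints", []), key=lambda p: p["Timestamp"])
    if not points:
        return None
    averages = [p["Average"] for p in points]
    peak = max(points, key=lambda p: p["Average"])
    return {
        "mean": sum(averages) / len(averages),
        "peak": peak["Average"],
        "peak_time": peak["Timestamp"],
    }


if __name__ == "__main__":
    # Invented sample in the shape of the CLI's JSON output.
    sample = {
        "Label": "CPUUtilization",
        "Datapoints": [
            {"Timestamp": "2026-04-03T02:00:00Z", "Average": 30.0, "Unit": "Percent"},
            {"Timestamp": "2026-04-03T00:00:00Z", "Average": 10.0, "Unit": "Percent"},
            {"Timestamp": "2026-04-03T01:00:00Z", "Average": 20.0, "Unit": "Percent"},
        ],
    }
    print(summarize_cpu(sample))
```

A low mean combined with a low peak over a representative period suggests the instance is a candidate for a smaller size.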
7.2 Log Analysis
# Create a log group and a log stream
$ aws logs create-log-group --log-group-name my-log-group
$ aws logs create-log-stream \
    --log-group-name my-log-group \
    --log-stream-name my-log-stream
# Push log events
$ aws logs put-log-events \
    --log-group-name my-log-group \
    --log-stream-name my-log-stream \
    --log-events '[{"timestamp":1649090400000,"message":"Error: Connection failed"},{"timestamp":1649090401000,"message":"Info: Service started"}]' \
    --sequence-token 1234567890
# Create a log metric filter
$ aws logs put-metric-filter \
    --log-group-name my-log-group \
    --filter-name ErrorFilter \
    --filter-pattern "Error" \
    --metric-transformations '[{"metricName":"ErrorCount","metricNamespace":"MyApp","metricValue":"1"}]'
# View log events
$ aws logs get-log-events \
    --log-group-name my-log-group \
    --log-stream-name my-log-stream
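The metric filter above adds 1 to `ErrorCount` for every log event that matches the pattern `Error`. As a rough offline approximation, the same count can be computed over events shaped like the `events` list returned by `get-log-events`. The sketch uses a case-sensitive substring test, which is a simplification of CloudWatch Logs filter pattern matching; `count_matching` is a hypothetical helper.

```python
def count_matching(events, term="Error"):
    """Count log events whose message contains the term (case-sensitive)."""
    return sum(1 for event in events if term in event["message"])


if __name__ == "__main__":
    # The two events pushed with put-log-events in this section.
    events = [
        {"timestamp": 1649090400000, "message": "Error: Connection failed"},
        {"timestamp": 1649090401000, "message": "Info: Service started"},
    ]
    print(count_matching(events))
```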
7.3 Resource Usage Analysis
# Use AWS Trusted Advisor
$ aws support describe-trusted-advisor-checks --language en
$ aws support describe-trusted-advisor-check-result \
    --check-id eW7HH0l7J9 \
    --language en
# Use AWS Compute Optimizer
$ aws compute-optimizer get-ec2-instance-recommendations
$ aws compute-optimizer get-auto-scaling-group-recommendations
$ aws compute-optimizer get-lambda-function-recommendations
8. Automated Optimization
8.1 Automation with AWS Lambda
$ cat lambda-function.py
import boto3


def lambda_handler(event, context):
    ec2 = boto3.client('ec2')
    # Find running EC2 instances; those not tagged AlwaysOn=true are stopped
    response = ec2.describe_instances(Filters=[
        {'Name': 'instance-state-name', 'Values': ['running']}
    ])
    for reservation in response['Reservations']:
        for instance in reservation['Instances']:
            instance_id = instance['InstanceId']
            # Check whether a tag marks the instance as one that must keep running
            tags = {tag['Key']: tag['Value'] for tag in instance.get('Tags', [])}
            if tags.get('AlwaysOn') != 'true':
                # Stop the instance
                ec2.stop_instances(InstanceIds=[instance_id])
                print(f'Stopped instance: {instance_id}')
    return {
        'statusCode': 200,
        'body': 'Successfully stopped unused instances'
    }
# Create the Lambda function
$ aws lambda create-function \
    --function-name stop-unused-instances \
    --runtime python3.12 \
    --role arn:aws:iam::123456789012:role/lambda-role \
    --handler lambda-function.lambda_handler \
    --zip-file fileb://lambda-function.zip
# Create a CloudWatch Events rule (runs daily at 18:00 UTC)
$ aws events put-rule \
    --name stop-unused-instances \
    --schedule-expression "cron(0 18 * * ? *)" \
    --state ENABLED
# Add the Lambda function as the rule's target
$ aws events put-targets \
    --rule stop-unused-instances \
    --targets "[{\"Id\":\"1\",\"Arn\":\"arn:aws:lambda:us-west-2:123456789012:function:stop-unused-instances\"}]"
# Allow the rule to invoke the Lambda function
$ aws lambda add-permission \
    --function-name stop-unused-instances \
    --statement-id events-rule \
    --action "lambda:InvokeFunction" \
    --principal events.amazonaws.com \
    --source-arn arn:aws:events:us-west-2:123456789012:rule/stop-unused-instances
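Because the scheduled function stops every running instance that lacks the `AlwaysOn=true` tag, it is worth testing the selection logic without touching AWS. One possible approach, sketched below, is to separate the decision from the API calls: `instances_to_stop` is a hypothetical pure function that takes a dictionary shaped like a `describe_instances` response and returns the instance IDs that would be stopped. The sample data is invented.

```python
def instances_to_stop(describe_response, keep_tag="AlwaysOn"):
    """Return IDs of instances that are not tagged keep_tag=true."""
    ids = []
    for reservation in describe_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            # Untagged instances have no "Tags" key at all, hence the default.
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            if tags.get(keep_tag) != "true":
                ids.append(instance["InstanceId"])
    return ids


if __name__ == "__main__":
    sample = {
        "Reservations": [
            {"Instances": [
                {"InstanceId": "i-11111111",
                 "Tags": [{"Key": "AlwaysOn", "Value": "true"}]},
                {"InstanceId": "i-22222222",
                 "Tags": [{"Key": "Environment", "Value": "Dev"}]},
            ]},
            {"Instances": [{"InstanceId": "i-33333333"}]},
        ]
    }
    print(instances_to_stop(sample))
```

Logging this list on a first dry run, before enabling the schedule, shows exactly which instances the rule would affect.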
8.2 Automation with AWS Systems Manager
$ cat automation-doc.yaml
---
description: "Automatically update EC2 instances"
schemaVersion: "0.3"
assumeRole: "{{ AutomationAssumeRole }}"
parameters:
  AutomationAssumeRole:
    type: String
    description: "(Required) The ARN of the role that allows Automation to perform the actions on your behalf."
    default: ""
  InstanceIds:
    type: StringList
    description: "(Required) The IDs of the instances to update."
    default: []
mainSteps:
  - name: updateInstances
    action: "aws:runCommand"
    inputs:
      DocumentName: "AWS-RunPatchBaseline"
      InstanceIds: "{{ InstanceIds }}"
      Parameters:
        Operation: "Scan"
# Create the Automation document (the content is YAML, so the format must be stated)
$ aws ssm create-document \
    --name UpdateEC2Instances \
    --content file://automation-doc.yaml \
    --document-type Automation \
    --document-format YAML
# Start the automation
$ aws ssm start-automation-execution \
    --document-name UpdateEC2Instances \
    --parameters "AutomationAssumeRole=arn:aws:iam::123456789012:role/ssm-automation-role,InstanceIds=i-12345678"
# Check the automation execution status
$ aws ssm describe-automation-executions \
    --filters Key=ExecutionId,Values=12345678-1234-1234-1234-123456789012
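9. Best Practices
9.1 Compute Best Practices
- Choose the right instance type and size
- Use auto scaling to adjust capacity to load
- Use Reserved Instances and Savings Plans to reduce cost
- Regularly review and terminate unused instances
- Use Spot Instances for temporary workloads
9.2 Storage Best Practices
- Choose the right storage class
- Use lifecycle policies to manage storage automatically
- Compress and archive rarely used data
- Regularly clean up temporary and expired data
- Use storage analytics tools to monitor usage
9.3 Network Best Practices
- Design VPCs and subnets carefully
- Use VPC endpoints to reduce data transfer costs
- Configure appropriate security groups and network ACLs
- Use a content delivery network (CDN) to improve performance
- Monitor network traffic and performance
9.4 Database Best Practices
- Choose the right database instance type
- Enable automated backups and Multi-AZ deployment
- Use caching to reduce database load
- Optimize database queries and indexes
- Monitor database performance and usage
9.5 Cost Management Best Practices
- Set cost budgets and alerts
- Use cost allocation tags to track spending
- Regularly review and optimize resource usage
- Use Reserved Instances and discount plans
- Use cost analysis tools to identify optimization opportunities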
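10. Case Studies
10.1 E-commerce Platform
An e-commerce platform optimized its cloud resources with the following measures:
- Used auto scaling to adjust the number of EC2 instances to traffic
- Deployed a mix of Reserved Instances and Spot Instances
- Used S3 Intelligent-Tiering to manage product images and videos
- Configured the CloudFront CDN to speed up access
- Applied cost allocation tags to track spending by business line
Results:
- Cost fell by 30%
- Performance improved by 40%
- Resource utilization rose from 40% to 70%
10.2 Financial Institution
A financial institution optimized its cloud resources with the following measures:
- Used RDS reserved instances to reduce database cost
- Implemented data lifecycle management, moving cold data to low-cost storage
- Used VPC endpoints to reduce data transfer costs
- Configured detailed monitoring and alerting
- Automated resource management and optimization workflows
Results:
- Cost fell by 25%
- Compliance improved
- System reliability rose to 99.99%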
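Production Recommendations
- Establish a thorough cloud resource optimization strategy
- Analyze resource usage regularly
- Automate resource management
- Monitor and optimize cost
- Keep learning and applying new cloud service features
This article was compiled and published by 风哥教程 (Fengge Tutorials) for learning and testing purposes only. Please credit the source when reposting: http://www.fgedu.net.cn/10327.html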
