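1. Introduction to System Automation
System automation means using scripts, tools, and processes to carry out IT system administration tasks automatically, reducing manual intervention and improving efficiency and reliability.
The main goals of system automation:
- Reduce human error
- Improve efficiency
- Ensure consistency
- Save time and cost
- Enable standardization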
2. Writing Shell Scripts
Shell scripts are the most basic automation tool: they run a series of commands in order.
# vi hello.sh
#!/bin/bash
# Comment
echo "Hello, World!"
# Variables
NAME="World"
echo "Hello, $NAME!"
# Conditionals
if [ "$NAME" == "World" ]; then
    echo "Hello, $NAME!"
else
    echo "Hello, $NAME! Nice to meet you."
fi
# Loops
for i in {1..5}; do
    echo "Iteration $i"
done
# Functions
function greet {
    local name=$1
    echo "Hello, $name!"
}
greet "John"
# Run the script
# chmod +x hello.sh
# ./hello.sh
Hello, World!
Hello, World!
Hello, World!
Iteration 1
Iteration 2
Iteration 3
Iteration 4
Iteration 5
Hello, John!
# vi backup.sh
#!/bin/bash
# Configuration
BACKUP_DIR="/backup"
SOURCE_DIRS="/etc /home /var/www"
DATE=$(date +"%Y-%m-%d")
BACKUP_FILE="$BACKUP_DIR/backup-$DATE.tar.gz"
# Create the backup directory
mkdir -p "$BACKUP_DIR"
# Run the backup
echo "Starting backup..."
# SOURCE_DIRS is deliberately unquoted so that it splits into separate paths
# Check whether the backup succeeded
if tar -czf "$BACKUP_FILE" $SOURCE_DIRS; then
    echo "Backup successful: $BACKUP_FILE"
    # Delete backups older than 7 days
    find "$BACKUP_DIR" -name "backup-*.tar.gz" -mtime +7 -delete
    echo "Old backups cleaned up"
else
    echo "Backup failed"
    exit 1
fi
# Run the script
# chmod +x backup.sh
# ./backup.sh
Starting backup...
Backup successful: /backup/backup-2026-03-30.tar.gz
Old backups cleaned up
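The retention step can also be expressed as a small, testable function. The sketch below is an illustration only: it decides which backups have expired from the date embedded in the file name, whereas the script above relies on the file modification time through find. The function name expired_backups and the keep_days parameter are hypothetical.

```python
from datetime import date, timedelta


def expired_backups(filenames, today, keep_days=7):
    """Return backup file names whose embedded date is older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in filenames:
        # Only consider names shaped like backup-YYYY-MM-DD.tar.gz.
        if not (name.startswith("backup-") and name.endswith(".tar.gz")):
            continue
        stamp = name[len("backup-"):-len(".tar.gz")]
        try:
            made_on = date.fromisoformat(stamp)
        except ValueError:
            continue  # not a recognisable date, so leave the file alone
        if made_on < cutoff:
            expired.append(name)
    return expired


names = ["backup-2026-03-30.tar.gz", "backup-2026-03-20.tar.gz", "notes.txt"]
print(expired_backups(names, today=date(2026, 3, 30)))
```

Keeping the decision separate from the deletion makes it easy to check the rule against sample names before any file is removed.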
3. Writing Python Scripts
Python is a powerful scripting language suited to more complex automation tasks.
# vi hello.py
#!/usr/bin/env python3
# Printing
print("Hello, World!")
# Variables
name = "World"
print(f"Hello, {name}!")
# Conditionals
if name == "World":
    print(f"Hello, {name}!")
else:
    print(f"Hello, {name}! Nice to meet you.")
# Loops
for i in range(1, 6):
    print(f"Iteration {i}")
# Functions
def greet(name):
    print(f"Hello, {name}!")
greet("John")
# Run the script
# python3 hello.py
Hello, World!
Hello, World!
Hello, World!
Iteration 1
Iteration 2
Iteration 3
Iteration 4
Iteration 5
Hello, John!
# vi system_info.py
#!/usr/bin/env python3
import platform
import psutil
def get_system_info():
    info = {}
    # System information
    info['os'] = platform.system()
    # platform.release() returns the kernel release string shown in the sample output
    info['os_version'] = platform.release()
    info['hostname'] = platform.node()
    # CPU information
    info['cpu_count'] = psutil.cpu_count()
    info['cpu_percent'] = psutil.cpu_percent(interval=1)
    # Memory information (MB)
    memory = psutil.virtual_memory()
    info['memory_total'] = memory.total // (1024 * 1024)
    info['memory_used'] = memory.used // (1024 * 1024)
    info['memory_free'] = memory.free // (1024 * 1024)
    # Disk information (GB)
    disk = psutil.disk_usage('/')
    info['disk_total'] = disk.total // (1024 * 1024 * 1024)
    info['disk_used'] = disk.used // (1024 * 1024 * 1024)
    info['disk_free'] = disk.free // (1024 * 1024 * 1024)
    # Network information (MB)
    net_io = psutil.net_io_counters()
    info['bytes_sent'] = net_io.bytes_sent // (1024 * 1024)
    info['bytes_recv'] = net_io.bytes_recv // (1024 * 1024)
    return info
def main():
    info = get_system_info()
    print("System Information:")
    print(f"OS: {info['os']} {info['os_version']}")
    print(f"Hostname: {info['hostname']}")
    print(f"CPU: {info['cpu_count']} cores, {info['cpu_percent']}% usage")
    print(f"Memory: {info['memory_used']} MB used, {info['memory_free']} MB free, {info['memory_total']} MB total")
    print(f"Disk: {info['disk_used']} GB used, {info['disk_free']} GB free, {info['disk_total']} GB total")
    print(f"Network: {info['bytes_sent']} MB sent, {info['bytes_recv']} MB received")
if __name__ == "__main__":
    main()
# Install the dependency
# pip install psutil
# Run the script
# python3 system_info.py
System Information:
OS: Linux 5.4.17-2136.302.7.2.el7uek.x86_64
Hostname: server-1
CPU: 8 cores, 10.5% usage
Memory: 2048 MB used, 6144 MB free, 8192 MB total
Disk: 20 GB used, 80 GB free, 100 GB total
Network: 100 MB sent, 200 MB received
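Collected figures become more useful once they are compared against limits. The following is a minimal sketch, assuming a dictionary with the same cpu_percent, disk_used, and disk_total keys as the script above; the function name find_warnings and the 90% default limits are hypothetical choices, not part of the original script.

```python
def find_warnings(info, cpu_limit=90.0, disk_limit=90.0):
    """Return human-readable warnings for values above the given limits."""
    warnings = []
    if info["cpu_percent"] > cpu_limit:
        warnings.append(f"CPU usage {info['cpu_percent']}% exceeds {cpu_limit}%")
    # Disk usage as a percentage of total capacity.
    disk_percent = info["disk_used"] / info["disk_total"] * 100
    if disk_percent > disk_limit:
        warnings.append(f"Disk usage {disk_percent:.1f}% exceeds {disk_limit}%")
    return warnings


# Values taken from the sample output above.
sample = {"cpu_percent": 10.5, "disk_used": 20, "disk_total": 100}
print(find_warnings(sample))
```

With the sample values nothing is over its limit, so the list of warnings is empty.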
4. Automation with Ansible
Ansible is a configuration management and automation tool for managing IT infrastructure.
# yum install -y ansible
# Create the Ansible configuration file
# vi /etc/ansible/ansible.cfg
[defaults]
inventory = /etc/ansible/hosts
remote_user = root
# Create the host inventory
# vi /etc/ansible/hosts
[webservers]
web1 ansible_host=192.168.1.10
web2 ansible_host=192.168.1.11
[databases]
db1 ansible_host=192.168.1.20
# Test connectivity
# ansible all -m ping
web1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
web2 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
db1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
# Create an Ansible playbook
# vi install-nginx.yml
---
- name: Install and configure Nginx
  hosts: webservers
  become: yes
  tasks:
    - name: Install Nginx
      yum:
        name: nginx
        state: present
    - name: Start Nginx service
      service:
        name: nginx
        state: started
        enabled: yes
    - name: Create index.html
      copy:
        content: "Welcome to {{ inventory_hostname }}\n"
        dest: /usr/share/nginx/html/index.html
    - name: Allow HTTP traffic
      firewalld:
        service: http
        state: enabled
        permanent: yes
        immediate: yes
# Run the playbook
# ansible-playbook install-nginx.yml
PLAY [Install and configure Nginx] ********************************************************************
TASK [Gathering Facts] ********************************************************************************
ok: [web1]
ok: [web2]
TASK [Install Nginx] **********************************************************************************
ok: [web1]
ok: [web2]
TASK [Start Nginx service] ****************************************************************************
ok: [web1]
ok: [web2]
TASK [Create index.html] ******************************************************************************
ok: [web1]
ok: [web2]
TASK [Allow HTTP traffic] ****************************************************************************
ok: [web1]
ok: [web2]
PLAY RECAP ******************************************************************************************
web1 : ok=5 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
web2 : ok=5 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
5. Infrastructure as Code with Terraform
Terraform is an infrastructure-as-code tool for automating the creation and management of cloud infrastructure.
# wget https://releases.hashicorp.com/terraform/1.1.7/terraform_1.1.7_linux_amd64.zip
# unzip terraform_1.1.7_linux_amd64.zip
# mv terraform /usr/local/bin/
# Verify the installation
# terraform --version
Terraform v1.1.7
# Create the Terraform configuration file
# vi main.tf
provider "aws" {
  region = "us-west-2"
}
resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"
  # Attach the security group defined below so the HTTP rule applies to this instance
  vpc_security_group_ids = [aws_security_group.web.id]
  tags = {
    Name = "WebServer"
  }
}
resource "aws_security_group" "web" {
  name        = "web-security-group"
  description = "Allow HTTP traffic"
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
# Initialize Terraform
# terraform init
Initializing the backend...
Initializing provider plugins...
- Finding latest version of hashicorp/aws...
- Installing hashicorp/aws v3.74.3...
- Installed hashicorp/aws v3.74.3 (signed by HashiCorp)
Terraform has been successfully initialized!
# Preview the deployment
# terraform plan
# Apply the deployment
# terraform apply
# Destroy the resources
# terraform destroy
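6. CI/CD Automation with Jenkins
Jenkins is a continuous integration and continuous deployment (CI/CD) tool for automating software builds, tests, and deployments.
# wget -O /etc/yum.repos.d/jenkins.repo https://pkg.jenkins.io/redhat-stable/jenkins.repo
# rpm --import https://pkg.jenkins.io/redhat-stable/jenkins.io.key
# Note: recent Jenkins releases no longer run on Java 8; install the Java version your Jenkins release requires
# yum install -y jenkins java-1.8.0-openjdk-devel
# Start Jenkins
# systemctl start jenkins
# systemctl enable jenkins
# Access Jenkins
# Open a browser and go to http://fgedudb:8080
# View the initial admin password
# cat /var/lib/jenkins/secrets/initialAdminPassword
# Install plugins
# 1. Log in to Jenkins
# 2. Click "Install suggested plugins"
# Create a Pipeline job
# 1. Click "New Item"
# 2. Enter a job name and select "Pipeline"
# 3. Click "OK"
# 4. In the "Pipeline" section, select "Pipeline script"
# 5. Enter the Pipeline script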
# Example Pipeline script
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                echo 'Building...'
                sh 'mkdir -p output'
                sh 'echo "Hello, World!" > output/index.html'
            }
        }
        stage('Test') {
            steps {
                echo 'Testing...'
                sh 'ls -la output'
            }
        }
        stage('Deploy') {
            steps {
                echo 'Deploying...'
                sh 'cp -r output /var/www/html/'
            }
        }
    }
}
# Save and build
# Click "Save"
# Click "Build Now"
7. Git Version Control and Automation
Git is a version control system for tracking code changes; combined with CI/CD tools, it enables automation.
# yum install -y git
# Configure Git
# git config --global user.name "Your Name"
# git config --global user.email "your.email@fgedu.net.cn"
# Create a Git repository
# mkdir project
# cd project
# git init
# Create a file
# echo "# Project" > README.md
# git add README.md
# git commit -m "Initial commit"
# Create a branch
# git checkout -b feature-branch
# Modify the file
# echo "Feature content" >> README.md
# git add README.md
# git commit -m "Add feature"
# Merge the branch
# git checkout master
# git merge feature-branch
# Push to the remote repository
# git remote add origin https://github.com/username/project.git
# git push -u origin master
# Automation with Git hooks
# Create a pre-commit hook
# vi .git/hooks/pre-commit
#!/bin/sh
# Run the code check (only if pylint is installed)
if command -v pylint > /dev/null; then
    echo "Running pylint..."
    pylint *.py
    if [ $? -ne 0 ]; then
        echo "Pylint failed, commit aborted"
        exit 1
    fi
fi
echo "Pre-commit hook passed"
exit 0
# Make the hook executable
# chmod +x .git/hooks/pre-commit
8. Monitoring Automation
Monitoring automation watches system state automatically so that problems are detected and resolved promptly.
# Create the service discovery configuration
# vi prometheus.yml
scrape_configs:
  - job_name: 'node'
    file_sd_configs:
      - files: ['/etc/prometheus/targets.json']
# Create the targets file
# vi /etc/prometheus/targets.json
[
  {
    "targets": ["192.168.1.10:9100"],
    "labels": {
      "env": "production",
      "role": "web"
    }
  },
  {
    "targets": ["192.168.1.11:9100"],
    "labels": {
      "env": "production",
      "role": "web"
    }
  },
  {
    "targets": ["192.168.1.20:9100"],
    "labels": {
      "env": "production",
      "role": "db"
    }
  }
]
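A targets file like this one can be generated from a host list rather than edited by hand. The sketch below is one possible approach, not part of the original setup: the function name build_targets and the (address, role) input tuples are hypothetical, and it groups hosts that share a role into a single entry, which is an equivalent but more compact layout than one entry per host.

```python
import json


def build_targets(hosts, port=9100, env="production"):
    """Group hosts by role into the list-of-objects layout read by file_sd_configs."""
    by_role = {}
    for address, role in hosts:
        by_role.setdefault(role, []).append(f"{address}:{port}")
    return [
        {"targets": targets, "labels": {"env": env, "role": role}}
        for role, targets in by_role.items()
    ]


# Same addresses and roles as the targets file above.
hosts = [("192.168.1.10", "web"), ("192.168.1.11", "web"), ("192.168.1.20", "db")]
print(json.dumps(build_targets(hosts), indent=2))
```

Writing the printed JSON to /etc/prometheus/targets.json would give Prometheus the same three targets with the same labels.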
# Create the alerting rules
# (the rules file must be listed under rule_files in prometheus.yml to be loaded)
# vi /etc/prometheus/alert.rules
groups:
  - name: system-alerts
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU Usage on {{ $labels.instance }}"
          description: "CPU usage is above 80% for 5 minutes"
# Restart Prometheus
# systemctl restart prometheus
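The HighCPUUsage rule derives usage from idle time: the per-core idle rate is a fraction between 0 and 1, so usage is 100 minus the average idle fraction scaled to a percentage. The following sketch only reproduces that arithmetic on made-up numbers; the function name cpu_usage_percent and the sample rates are hypothetical, and it does not query Prometheus.

```python
def cpu_usage_percent(idle_rates):
    """idle_rates: per-core fraction of time spent idle (0.0 to 1.0)."""
    average_idle = sum(idle_rates) / len(idle_rates)
    return 100 - average_idle * 100


# Four cores that are idle 10%, 20%, 15%, and 15% of the time.
usage = cpu_usage_percent([0.10, 0.20, 0.15, 0.15])
print(round(usage, 1), usage > 80)
```

An average idle fraction of 0.15 corresponds to roughly 85% usage, which is above the 80% threshold; the alert would fire only if that stayed true for the full 5 minutes set by the for clause.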
# Create an auto-remediation script
# vi auto-fix.sh
#!/bin/bash
# Check CPU usage (100 minus the idle percentage reported by top)
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | sed "s/.*, *\([0-9.]*\)%* id.*/\1/" | awk '{print 100 - $1}')
if (( $(echo "$CPU_USAGE > 90" | bc -l) )); then
    echo "High CPU usage detected: $CPU_USAGE%"
    # Find the process using the most CPU
    TOP_PROCESS=$(ps aux --sort=-%cpu | head -2 | tail -1)
    echo "Top process: $TOP_PROCESS"
    # Send an alert
    echo "High CPU usage alert: $CPU_USAGE%" | mail -s "CPU Alert" admin@fgedu.net.cn
fi
# Add to crontab
# crontab -e
*/5 * * * * /path/to/auto-fix.sh
9. Security Automation
Security automation runs security checks automatically and responds to security events.
# vi security-scan.sh
#!/bin/bash
# Check for system updates
echo "Checking for system updates..."
yum check-update
# Check the firewall status
echo "Checking firewall status..."
systemctl status firewalld
# Check the SELinux status
echo "Checking SELinux status..."
getenforce
# Check user accounts (list every account with UID 0)
echo "Checking user accounts..."
awk -F: '$3 == 0 {print $1}' /etc/passwd
# Check the SSH configuration
echo "Checking SSH configuration..."
grep -E "PermitRootLogin|PasswordAuthentication" /etc/ssh/sshd_config
# Check open ports
echo "Checking open ports..."
netstat -tuln
# Check disk usage
echo "Checking disk usage..."
df -h
# Run the script
# chmod +x security-scan.sh
# ./security-scan.sh
# Create a security incident response script
# vi security-response.sh
#!/bin/bash
# Detect abnormal logins
echo "Checking for failed login attempts..."
FAILED_LOGINS=$(grep "Failed password" /var/log/secure | tail -10)
if [ -n "$FAILED_LOGINS" ]; then
    echo "Failed login attempts detected:"
    echo "$FAILED_LOGINS"
    # Extract the IP addresses
    IP_ADDRESSES=$(echo "$FAILED_LOGINS" | grep -oP '\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b' | sort | uniq)
    # Block suspicious IPs, skipping any that already have a DROP rule
    for IP in $IP_ADDRESSES; do
        echo "Blocking IP: $IP"
        iptables -C INPUT -s "$IP" -j DROP 2>/dev/null || iptables -A INPUT -s "$IP" -j DROP
    done
    # Save the iptables rules
    service iptables save
    # Send an alert
    echo "Security alert: Failed login attempts detected" | mail -s "Security Alert" security@fgedu.net.cn
fi
# Add to crontab
# crontab -e
*/10 * * * * /path/to/security-response.sh
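The IP extraction step can be tested offline before any firewall rule is touched. The sketch below is a variation rather than a port of the script above: it counts failures per address and reports only those at or above a threshold, which the shell script does not do. The function name suspicious_ips, the threshold of 3, and the sample log lines are all hypothetical.

```python
import re
from collections import Counter

IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def suspicious_ips(log_lines, threshold=3):
    """Return IPs seen in at least `threshold` 'Failed password' lines."""
    counts = Counter()
    for line in log_lines:
        if "Failed password" not in line:
            continue
        match = IP_PATTERN.search(line)
        if match:
            counts[match.group()] += 1
    return sorted(ip for ip, seen in counts.items() if seen >= threshold)


# Made-up log lines for illustration only.
sample = [
    "sshd[101]: Failed password for root from 203.0.113.5 port 4000 ssh2",
    "sshd[102]: Failed password for root from 203.0.113.5 port 4001 ssh2",
    "sshd[103]: Failed password for admin from 203.0.113.5 port 4002 ssh2",
    "sshd[104]: Failed password for root from 198.51.100.7 port 5000 ssh2",
    "sshd[105]: Accepted password for alice from 192.0.2.9 port 6000 ssh2",
]
print(suspicious_ips(sample))
```

A threshold reduces the chance of blocking a legitimate user who mistyped a password once; setting it to 1 reproduces the behavior of blocking every address that appears.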
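10. Automation Best Practices
The following are some best practices for system automation.
1. Start small: automate simple tasks first, then expand step by step
2. Version control: manage automation scripts with a version control system such as Git
3. Documentation: write detailed documentation for automation scripts and processes
4. Testing: test automation scripts before running them in production
5. Error handling: add appropriate error handling and logging to scripts
6. Modular design: break complex automation tasks into modules
7. Security: make sure automation scripts and tools are themselves secure
8. Monitoring: monitor the execution status of automation tasks
9. Regular review: review and update automation scripts periodically
10. Continuous improvement: keep refining automation processes based on feedback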
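Practice 5 (error handling and logging) can be sketched as a small wrapper that runs one command, logs the outcome, and reports success or failure to the caller. This is a minimal illustration under stated assumptions: the function name run_step and the step names are hypothetical, and the commands invoke the Python interpreter itself so that the example does not depend on any particular system tool.

```python
import logging
import subprocess
import sys

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")


def run_step(name, command):
    """Run one command, log the result, and return True on success."""
    logging.info("starting step: %s", name)
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        logging.error("step %s failed with exit code %d: %s",
                      name, result.returncode, result.stderr.strip())
        return False
    logging.info("step %s finished", name)
    return True


# One step that succeeds and one that exits with a non-zero status.
ok = run_step("hello", [sys.executable, "-c", "print('hello')"])
failed = run_step("broken", [sys.executable, "-c", "import sys; sys.exit(3)"])
print(ok, failed)
```

Returning a boolean instead of raising lets the calling script decide whether a failed step should stop the whole run or only be recorded.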
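This article was compiled and published by 风哥教程 (Fengge Tutorials) for study and testing purposes only. Please credit the source when reposting: http://www.fgedu.net.cn/10327.html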
