
Azkaban Installation and Configuration: Detailed Workflow Setup, Upgrade, and Migration Procedure

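1. Azkaban Overview and Environment Planning

Azkaban is an open-source workflow scheduler from LinkedIn, used to manage and schedule jobs across the Hadoop ecosystem. It provides an easy-to-use web interface for defining, scheduling, and monitoring workflows visually.

1.1 Azkaban Version Notes

The current mainline releases of Azkaban are the 3.x series, and this tutorial uses Azkaban 3.91.0 throughout. Compared with earlier releases, 3.x improves performance and stability, supports more job types, and allows more flexible workflow definitions.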

# Check the Azkaban version (it is part of the installation directory name; do not pass a version flag to the shutdown script)
$ ls -d /data/azkaban/azkaban-web-server-*
/data/azkaban/azkaban-web-server-3.91.0

# Check the Java version
$ java -version
openjdk version "1.8.0_402"
OpenJDK Runtime Environment (build 1.8.0_402-b06)
OpenJDK 64-Bit Server VM (build 25.402-b06, mixed mode)

# Check the MySQL version
$ mysql --version
mysql Ver 8.0.33 for Linux on x86_64 (MySQL Community Server - GPL)

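1.2 Environment Planning

The installation environment is planned as follows:

Azkaban servers:
azkaban01.fgedu.net.cn (192.168.1.51) - web server + executor server
azkaban02.fgedu.net.cn (192.168.1.52) - executor server (optional)

Azkaban version: 3.91.0
Java version: OpenJDK 1.8.0
MySQL version: 8.0.33
Installation directory: /data/azkaban
Web server port: 8443
Executor server port: 12321

Database:
MySQL: 192.168.1.51:3306/azkaban

Storage:
Workflow storage: /data/azkaban/projects
Log storage: /data/azkaban/logs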

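2. Hardware Requirements

As a workflow scheduler, Azkaban itself needs relatively modest hardware, but sizing must account for the number and complexity of the jobs it schedules.

2.1 Physical Host Requirements

# Check memory size
# free -h
total used free shared buff/cache available
Mem: 16G 4.2G 10G 256M 1.8G 11G
Swap: 8G 0B 8G

# Check disk space
# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 50G 12G 39G 24% /
/dev/sdb1 500G 50G 451G 10% /data
/dev/sdc1 200G 20G 181G 10% /backup

# Check the number of CPU cores
# nproc
8

# Check the system architecture
# uname -m
x86_64

Production recommendations: 8 GB of memory is the minimum and is suitable only for test environments; production hosts should have 16 GB or more. Plan disk space around the number of workflows and the log volume, with at least 200 GB recommended. Eight or more CPU cores are recommended to support concurrent job scheduling.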
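The recommendations above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of Azkaban: the function names `check_minimum` and `preflight` are hypothetical, and the thresholds (16 GB memory, 8 cores, 200 GB data disk) are simply the production figures quoted above.

```shell
#!/bin/bash
# Hypothetical pre-flight helpers; thresholds follow the production
# recommendations in this section and are assumptions, not Azkaban limits.

# check_minimum LABEL ACTUAL REQUIRED
# Prints a PASS/FAIL line and returns non-zero when ACTUAL < REQUIRED.
check_minimum() {
    local label="$1" actual="$2" required="$3"
    if [ "$actual" -ge "$required" ]; then
        echo "PASS ${label}: ${actual} >= ${required}"
    else
        echo "FAIL ${label}: ${actual} < ${required}"
        return 1
    fi
}

# preflight MEM_GB CPU_CORES DATA_DISK_GB
# Runs all three checks and returns 0 only if every one passes.
preflight() {
    local rc=0
    check_minimum "memory (GB)" "$1" 16 || rc=1
    check_minimum "CPU cores" "$2" 8 || rc=1
    check_minimum "data disk (GB)" "$3" 200 || rc=1
    return "$rc"
}

# Example with live values on a Linux host (uncomment to run).
# Note: "free -g" rounds down, so a 16 GB machine may report 15.
# preflight "$(free -g | awk '/^Mem:/ {print $2}')" "$(nproc)" \
#           "$(df -BG --output=size /data | tail -n 1 | tr -dc '0-9')"
```

Passing the measured values in as arguments keeps the comparison logic separate from how the numbers are collected, so the same check can be reused for virtual or cloud hosts.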

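2.2 vSphere Virtual Host Requirements

Virtual machine configuration:
- vCPU: 8 cores
- Memory: 16 GB
- Disk: 50 GB system disk + 500 GB data disk
- Network: VMXNET3 adapter, gigabit network
- Storage: SSD storage is recommended for better I/O performance

Resource pool configuration:
- CPU reservation: 4 GHz
- Memory reservation: 8 GB
- Memory limit: 16 GB
- CPU shares: Normal
- Memory shares: Normal

2.3 Cloud Host Requirements

Cloud instance specification (Alibaba Cloud / Tencent Cloud / Huawei Cloud):
- Instance type: ecs.g6.2xlarge or equivalent
- vCPU: 8 cores
- Memory: 32 GB
- System disk: 100 GB ultra cloud disk
- Data disk: 500 GB SSD cloud disk
- Network bandwidth: 10 Mbps or more

Storage configuration:
- OSS object storage: stores workflow definitions and configuration
- NAS file storage: shares configuration files
- Cloud disk snapshots: periodic backups of configuration and data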

3. Operating System Preparation

Before installing Azkaban, the operating system needs some configuration and tuning.

3.1 Operating System Version Check

# Check the operating system version
# cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="8.9"
ID="ol"
PRETTY_NAME="Oracle Linux Server 8.9"

# Check the kernel version
# uname -r
5.4.17-2136.302.7.2.el8uek.x86_64

# Check the SELinux status
# getenforce
Disabled

# Check the firewall status
# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
3.2 Kernel Parameter Tuning

# Edit the sysctl.conf file
# vi /etc/sysctl.conf

# Add the following kernel parameters
fs.file-max = 6815744
kernel.sem = 250 32000 100 128
kernel.shmmni = 4096
kernel.shmall = 4294967296
kernel.shmmax = 68719476736
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_max_syn_backlog = 8192
net.core.somaxconn = 1024
vm.swappiness = 10
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5

# Apply the kernel parameters
# sysctl -p

# Verify the setting
# sysctl -a | grep fs.file-max
fs.file-max = 6815744
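
Checking one parameter by hand does not scale to the whole list. The sketch below compares the value configured in a sysctl-style file with a live value; `expected_sysctl_value` and `sysctl_matches` are hypothetical helper names, and the comparison is only meant for single-valued keys (multi-field keys such as kernel.sem are printed tab-separated by sysctl and would need normalizing first).

```shell
#!/bin/bash
# Hypothetical helpers for comparing /etc/sysctl.conf with live kernel values.

# expected_sysctl_value FILE KEY
# Prints the value assigned to KEY in a "key = value" file, trimmed of
# surrounding whitespace. Comment lines are skipped; the last match wins.
expected_sysctl_value() {
    local file="$1" key="$2"
    awk -F'=' -v k="$key" '
        /^[ \t]*#/ { next }
        {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (name == k) {
                val = $2
                gsub(/^[ \t]+|[ \t]+$/, "", val)
                print val
            }
        }' "$file" | tail -n 1
}

# sysctl_matches FILE KEY ACTUAL
# Returns 0 when ACTUAL equals the value configured for KEY in FILE.
sysctl_matches() {
    [ "$(expected_sysctl_value "$1" "$2")" = "$3" ]
}

# Example on a live host (uncomment to run):
# for key in fs.file-max net.core.somaxconn vm.swappiness; do
#     if sysctl_matches /etc/sysctl.conf "$key" "$(sysctl -n "$key")"; then
#         echo "OK   $key"
#     else
#         echo "DIFF $key"
#     fi
# done
```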

3.3 User Resource Limits

# Configure user resource limits
# vi /etc/security/limits.conf

# Add the following entries
* soft nproc 65535
* hard nproc 65535
* soft nofile 65535
* hard nofile 65535
* soft stack 10240
* hard stack 32768
azkaban soft memlock unlimited
azkaban hard memlock unlimited

# Verify the settings (log in again first, because the limits apply to new sessions)
# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 63499
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 65535
virtual memory (kbytes, -v) unlimited

3.4 Java Installation

# Install OpenJDK 1.8
# yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel

# Configure the Java environment variables
# vi /etc/profile.d/java.sh

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

# Load the environment variables
# source /etc/profile.d/java.sh

# Verify the Java installation
# java -version
openjdk version "1.8.0_402"
OpenJDK Runtime Environment (build 1.8.0_402-b06)
OpenJDK 64-Bit Server VM (build 25.402-b06, mixed mode)

# Verify JAVA_HOME
# echo $JAVA_HOME
/usr/lib/jvm/java-1.8.0-openjdk

3.5 MySQL Database Preparation

Azkaban stores workflow definitions and execution state in a MySQL database.

# Install MySQL (on Oracle Linux 8 the client package is named mysql)
# yum install -y mysql-server mysql

# Start the MySQL service
# systemctl start mysqld
# systemctl enable mysqld

# Log in to MySQL
# mysql -u root -p

# Create the Azkaban database and user
CREATE DATABASE azkaban CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE USER 'azkaban'@'localhost' IDENTIFIED BY 'azkaban123';
CREATE USER 'azkaban'@'%' IDENTIFIED BY 'azkaban123';
GRANT ALL PRIVILEGES ON azkaban.* TO 'azkaban'@'localhost';
GRANT ALL PRIVILEGES ON azkaban.* TO 'azkaban'@'%';
FLUSH PRIVILEGES;
EXIT;

# Verify the database connection
# mysql -u azkaban -pazkaban123 -e "SELECT VERSION();"
+-----------+
| VERSION() |
+-----------+
| 8.0.33    |
+-----------+
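
Passing the password with -p on the command line exposes it in the shell history and the process list. One common alternative is a client option file read through the mysql client's `--defaults-extra-file` option. The sketch below assumes a hypothetical file path and helper name (`write_mysql_client_cnf`); the user name and password are the example values from this tutorial.

```shell
#!/bin/bash
# Hypothetical helper: store MySQL client credentials in an option file
# that only its owner can read.

# write_mysql_client_cnf FILE USER PASSWORD
write_mysql_client_cnf() {
    local file="$1" user="$2" password="$3"
    (
        umask 077   # a newly created file gets mode 600
        printf '[client]\nuser=%s\npassword="%s"\n' "$user" "$password" > "$file"
    )
    chmod 600 "$file"   # also tighten a file that already existed
}

# Usage (the path /root/.azkaban-my.cnf is an assumption):
# write_mysql_client_cnf /root/.azkaban-my.cnf azkaban 'azkaban123'
# mysql --defaults-extra-file=/root/.azkaban-my.cnf -e "SELECT VERSION();"
```

Note that `--defaults-extra-file` has to be the first option on the mysql command line.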

4. Azkaban Installation and Configuration

With the environment prepared, Azkaban can now be installed.

4.1 Download the Azkaban Package

# Create the installation directories
# mkdir -p /data/azkaban
# mkdir -p /data/azkaban/logs
# mkdir -p /data/azkaban/projects

# Download Azkaban
# cd /tmp
# wget https://github.com/azkaban/azkaban/archive/refs/tags/3.91.0.tar.gz

# Unpack the source
# tar -xzf 3.91.0.tar.gz
# cd azkaban-3.91.0

# Build Azkaban
# ./gradlew build -x test

# Check the build output
# ls -la azkaban-web-server/build/distributions/
total 81920
-rw-r--r-- 1 root root 41943040 Apr 5 10:00 azkaban-web-server-3.91.0.tar.gz

# ls -la azkaban-exec-server/build/distributions/
total 73728
-rw-r--r-- 1 root root 37748736 Apr 5 10:00 azkaban-exec-server-3.91.0.tar.gz

# ls -la azkaban-solo-server/build/distributions/
total 73728
-rw-r--r-- 1 root root 37748736 Apr 5 10:00 azkaban-solo-server-3.91.0.tar.gz

4.2 Install the Azkaban Web Server

# Unpack the web server
# cd /tmp/azkaban-3.91.0
# tar -xzf azkaban-web-server/build/distributions/azkaban-web-server-3.91.0.tar.gz -C /data/azkaban/

# Unpack the executor server
# tar -xzf azkaban-exec-server/build/distributions/azkaban-exec-server-3.91.0.tar.gz -C /data/azkaban/

# Inspect the installation directory
# ls -la /data/azkaban/
total 16
drwxr-xr-x 4 root root 4096 Apr 5 10:00 .
drwxr-xr-x 3 root root 4096 Apr 5 10:00 ..
drwxr-xr-x 6 root root 4096 Apr 5 10:00 azkaban-exec-server-3.91.0
drwxr-xr-x 6 root root 4096 Apr 5 10:00 azkaban-web-server-3.91.0
drwxr-xr-x 2 root root 4096 Apr 5 10:00 logs
drwxr-xr-x 2 root root 4096 Apr 5 10:00 projects

4.3 Configure the Azkaban Database

# Initialize the Azkaban database
# (Azkaban 3.x builds normally package the combined schema script in the azkaban-db distribution; adjust the path if sql/ is not present here)
# cd /data/azkaban/azkaban-exec-server-3.91.0
# mysql -u azkaban -pazkaban123 azkaban < sql/create-all-sql-0.1.0-SNAPSHOT.sql

# Verify the database tables
# mysql -u azkaban -pazkaban123 -e "USE azkaban; SHOW TABLES;" | wc -l
24

# Inspect a table structure
# mysql -u azkaban -pazkaban123 -e "USE azkaban; DESCRIBE execution_flows;"
+-------------------+--------------+------+-----+---------+----------------+
| Field             | Type         | Null | Key | Default | Extra          |
+-------------------+--------------+------+-----+---------+----------------+
| exec_id           | int(11)      | NO   | PRI | NULL    | auto_increment |
| flow_id           | varchar(255) | NO   | MUL | NULL    |                |
| project_id        | int(11)      | NO   | MUL | NULL    |                |
| version           | int(11)      | NO   |     | NULL    |                |
| submit_time       | bigint(20)   | NO   |     | NULL    |                |
| start_time        | bigint(20)   | YES  |     | NULL    |                |
| end_time          | bigint(20)   | YES  |     | NULL    |                |
| status            | varchar(50)  | NO   |     | NULL    |                |
| inputs            | text         | YES  |     | NULL    |                |
| outputs           | text         | YES  |     | NULL    |                |
| execution_options | text         | YES  |     | NULL    |                |
| submit_user       | varchar(255) | NO   |     | NULL    |                |
| update_time       | bigint(20)   | NO   |     | NULL    |                |
+-------------------+--------------+------+-----+---------+----------------+

4.4 Configure the Azkaban Executor Server

# Edit the executor server configuration
# vi /data/azkaban/azkaban-exec-server-3.91.0/conf/azkaban.properties

# Basic settings
executor.port=12321
executor.maxThreads=50
executor.flow.threads=30
executor.queue.size=1000
executor.process.threads=30
executor.metric.reporter.enabled=true

# Database settings
database.type=mysql
mysql.port=3306
mysql.host=192.168.1.51
mysql.database=azkaban
mysql.user=azkaban
mysql.password=azkaban123
mysql.numconnections=100

# Executor settings
executor.global.properties=/data/azkaban/azkaban-exec-server-3.91.0/conf/global.properties
executor.jobtype.plugin.dir=/data/azkaban/azkaban-exec-server-3.91.0/plugins/jobtypes

# Log settings
azkaban.executor.log.dir=/data/azkaban/logs
executor.logging.logdir=/data/azkaban/logs

# Security settings
executor.useHttps=false

# Verify the configuration (comments and blank lines filtered out)
# cat /data/azkaban/azkaban-exec-server-3.91.0/conf/azkaban.properties | grep -v "^#" | grep -v "^$"
executor.port=12321
executor.maxThreads=50
executor.flow.threads=30
executor.queue.size=1000
executor.process.threads=30
executor.metric.reporter.enabled=true
database.type=mysql
mysql.port=3306
mysql.host=192.168.1.51
mysql.database=azkaban
mysql.user=azkaban
mysql.password=azkaban123
mysql.numconnections=100
executor.global.properties=/data/azkaban/azkaban-exec-server-3.91.0/conf/global.properties
executor.jobtype.plugin.dir=/data/azkaban/azkaban-exec-server-3.91.0/plugins/jobtypes
azkaban.executor.log.dir=/data/azkaban/logs
executor.logging.logdir=/data/azkaban/logs
executor.useHttps=false

4.5 Configure the Azkaban Web Server

# Edit the web server configuration
# vi /data/azkaban/azkaban-web-server-3.91.0/conf/azkaban.properties

# Basic settings
web.resource.dir=/data/azkaban/azkaban-web-server-3.91.0/web
web.log.dir=/data/azkaban/logs
web.port=8443
web.useHttps=true
web.ssl.port=8443
web.keystore.file=/data/azkaban/azkaban-web-server-3.91.0/keystore
web.keystore.password=azkaban

# Database settings
database.type=mysql
mysql.port=3306
mysql.host=192.168.1.51
mysql.database=azkaban
mysql.user=azkaban
mysql.password=azkaban123
mysql.numconnections=100

# User management
user.manager.class=azkaban.user.XmlUserManager
user.manager.xml.file=/data/azkaban/azkaban-web-server-3.91.0/conf/azkaban-users.xml

# Executor server connection
executor.service.host=localhost
executor.service.port=12321
executor.service.ssl.enabled=false

# Verify the configuration (comments and blank lines filtered out)
# cat /data/azkaban/azkaban-web-server-3.91.0/conf/azkaban.properties | grep -v "^#" | grep -v "^$"
web.resource.dir=/data/azkaban/azkaban-web-server-3.91.0/web
web.log.dir=/data/azkaban/logs
web.port=8443
web.useHttps=true
web.ssl.port=8443
web.keystore.file=/data/azkaban/azkaban-web-server-3.91.0/keystore
web.keystore.password=azkaban
database.type=mysql
mysql.port=3306
mysql.host=192.168.1.51
mysql.database=azkaban
mysql.user=azkaban
mysql.password=azkaban123
mysql.numconnections=100
user.manager.class=azkaban.user.XmlUserManager
user.manager.xml.file=/data/azkaban/azkaban-web-server-3.91.0/conf/azkaban-users.xml
executor.service.host=localhost
executor.service.port=12321
executor.service.ssl.enabled=false
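
The web server and the executor server each have their own azkaban.properties, so values such as the executor port and the database host must agree between the two files. The following sketch reads a single key from a properties file so the two can be compared in a script; `get_prop` is a hypothetical helper, and it only handles plain `key=value` lines as used in this tutorial (no line continuations or escapes).

```shell
#!/bin/bash
# Hypothetical helper for reading one key from a key=value properties file.

# get_prop FILE KEY
# Prints the value of KEY, ignoring comment lines. If KEY appears more
# than once the last occurrence wins. Returns non-zero if KEY is absent.
get_prop() {
    local file="$1" key="$2"
    awk -v k="$key" '
        /^[ \t]*#/ { next }
        index($0, k "=") == 1 { val = substr($0, length(k) + 2); found = 1 }
        END { if (found) print val; else exit 1 }
    ' "$file"
}

# Usage: confirm that the web server points at the port the executor listens on.
# web=/data/azkaban/azkaban-web-server-3.91.0/conf/azkaban.properties
# exe=/data/azkaban/azkaban-exec-server-3.91.0/conf/azkaban.properties
# [ "$(get_prop "$web" executor.service.port)" = "$(get_prop "$exe" executor.port)" ] \
#     && echo "executor port matches" || echo "executor port MISMATCH"
```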

4.6 Generate the SSL Certificate

# Generate the SSL certificate
# cd /data/azkaban/azkaban-web-server-3.91.0
# keytool -keystore keystore -alias jetty -genkey -keyalg RSA

# Answer the prompts
Enter keystore password: azkaban
Re-enter new password: azkaban
What is your first and last name?
  [Unknown]: azkaban
What is the name of your organizational unit?
  [Unknown]: fgedu
What is the name of your organization?
  [Unknown]: fgedu
What is the name of your City or Locality?
  [Unknown]: Beijing
What is the name of your State or Province?
  [Unknown]: Beijing
What is the two-letter country code for this unit?
  [Unknown]: CN
Is CN=azkaban, OU=fgedu, O=fgedu, L=Beijing, ST=Beijing, C=CN correct?
  [no]: yes
Enter key password for <jetty>
        (RETURN if same as keystore password):

# Verify that the keystore was created
# ls -la keystore
-rw-r--r-- 1 root root 2253 Apr 5 10:00 keystore
4.7 Configure User Management

# Edit the user configuration
# vi /data/azkaban/azkaban-web-server-3.91.0/conf/azkaban-users.xml

<azkaban-users>
  <user groups="admin" password="admin" roles="admin" username="admin"/>
  <user password="azkaban" roles="reader" username="azkaban"/>
  <user password="metrics" roles="metrics" username="metrics"/>
  <role name="admin" permissions="ADMIN"/>
  <role name="reader" permissions="READ"/>
  <role name="metrics" permissions="METRICS"/>
</azkaban-users>

# Verify the user configuration
# cat /data/azkaban/azkaban-web-server-3.91.0/conf/azkaban-users.xml
<azkaban-users>
  <user groups="admin" password="admin" roles="admin" username="admin"/>
  <user password="azkaban" roles="reader" username="azkaban"/>
  <user password="metrics" roles="metrics" username="metrics"/>
  <role name="admin" permissions="ADMIN"/>
  <role name="reader" permissions="READ"/>
  <role name="metrics" permissions="METRICS"/>
</azkaban-users>

4.8 Start the Azkaban Services

# Start the executor server
# cd /data/azkaban/azkaban-exec-server-3.91.0
# ./bin/start-exec.sh

# Sample output:
Starting exec server on port 12321...
Exec server started successfully on port 12321

# Activate the executor server
# curl -G "http://localhost:12321/executor?action=activate"
{"status":"success"}

# Start the web server
# cd /data/azkaban/azkaban-web-server-3.91.0
# ./bin/start-web.sh

# Sample output:
Starting web server on port 8443...
Web server started successfully on port 8443

# Check the service processes
# ps -ef | grep azkaban
root 12345 1 0 10:00 ? 00:00:00 java -Xms512M -Xmx512M -Dlog4j.configuration=file:/data/azkaban/azkaban-exec-server-3.91.0/conf/log4j.properties -cp /data/azkaban/azkaban-exec-server-3.91.0/lib/*: azkaban.execapp.AzkabanExecutorServer
root 12346 1 0 10:00 ? 00:00:00 java -Xms512M -Xmx512M -Dlog4j.configuration=file:/data/azkaban/azkaban-web-server-3.91.0/conf/log4j.properties -cp /data/azkaban/azkaban-web-server-3.91.0/lib/*: azkaban.webapp.AzkabanWebServer

# View the log
# tail -n 100 /data/azkaban/logs/azkaban-webserver.log
24/04/05 10:00:00 INFO [AzkabanWebServer] Starting Azkaban Web Server...
24/04/05 10:00:00 INFO [AzkabanWebServer] Azkaban Web Server started on port 8443
24/04/05 10:00:00 INFO [AzkabanWebServer] Web server running on https://localhost:8443

Tip from Fengge: after the Azkaban services start, allow a few minutes for everything to finish initializing before using the web console.
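
Rather than waiting a fixed number of minutes, a start-up script can poll until the console answers. The sketch below is a generic retry loop; `retry` is a hypothetical helper name, and the attempt count and delay in the usage line are arbitrary assumptions to be tuned for the host.

```shell
#!/bin/bash
# Hypothetical helper: retry a command until it succeeds.

# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY_SECONDS between
# failed attempts. Returns 0 on the first success, 1 if all attempts fail.
retry() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage: wait up to about two minutes for the web console to respond.
# -k skips certificate verification (self-signed keystore), -f makes curl
# fail on HTTP error statuses, -s and -o /dev/null silence the output.
# retry 24 5 curl -k -s -f -o /dev/null https://localhost:8443 \
#     && echo "web console is up" || echo "web console did not come up"
```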

5. Azkaban Configuration Tuning

Some configuration tuning is needed to improve Azkaban's performance and stability.

5.1 Memory Settings

# Edit the executor server memory settings
# vi /data/azkaban/azkaban-exec-server-3.91.0/bin/start-exec.sh

# Adjust the JVM memory parameters
# (Java 8 has no PermGen and ignores -XX:MaxPermSize, so -XX:MaxMetaspaceSize is used instead)
JAVA_OPTS="-Xms2G -Xmx4G -XX:MaxMetaspaceSize=512m -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data/azkaban/logs"

# Edit the web server memory settings
# vi /data/azkaban/azkaban-web-server-3.91.0/bin/start-web.sh

# Adjust the JVM memory parameters
JAVA_OPTS="-Xms2G -Xmx4G -XX:MaxMetaspaceSize=512m -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data/azkaban/logs"

# Verify the settings
# grep JAVA_OPTS /data/azkaban/azkaban-exec-server-3.91.0/bin/start-exec.sh
JAVA_OPTS="-Xms2G -Xmx4G -XX:MaxMetaspaceSize=512m -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data/azkaban/logs"

5.2 Executor Settings

# Edit the executor server configuration
# vi /data/azkaban/azkaban-exec-server-3.91.0/conf/azkaban.properties

# Tune the executor parameters
executor.maxThreads=100
executor.flow.threads=50
executor.queue.size=2000
executor.process.threads=50
executor.flow.threadpool.core.size=20
executor.flow.threadpool.max.size=100
executor.flow.threadpool.keepalive.seconds=60
executor.cleanup.retry.interval.millis=30000
executor.cleanup.retry.count=5
executor.flow.lru.size=1000

5.3 Database Connection Pool Settings

# Edit the executor server configuration
# vi /data/azkaban/azkaban-exec-server-3.91.0/conf/azkaban.properties

# Tune the database connection pool
mysql.numconnections=200
mysql.pool.maxActive=200
mysql.pool.maxIdle=50
mysql.pool.minIdle=20
mysql.pool.maxWait=30000
mysql.pool.validationQuery=SELECT 1
mysql.pool.testOnBorrow=true
mysql.pool.testWhileIdle=true
mysql.pool.timeBetweenEvictionRunsMillis=30000

# Edit the web server configuration
# vi /data/azkaban/azkaban-web-server-3.91.0/conf/azkaban.properties

# Tune the database connection pool
mysql.numconnections=200
mysql.pool.maxActive=200
mysql.pool.maxIdle=50
mysql.pool.minIdle=20
mysql.pool.maxWait=30000
mysql.pool.validationQuery=SELECT 1
mysql.pool.testOnBorrow=true
mysql.pool.testWhileIdle=true
mysql.pool.timeBetweenEvictionRunsMillis=30000

5.4 Logging Settings

# Edit the executor server log4j configuration
# vi /data/azkaban/azkaban-exec-server-3.91.0/conf/log4j.properties

# Set the log level
log4j.rootLogger=INFO, file, console

# File appender
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=/data/azkaban/logs/azkaban-executor.log
log4j.appender.file.MaxFileSize=100MB
log4j.appender.file.MaxBackupIndex=10
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

# Console appender
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

# Edit the web server log4j configuration
# vi /data/azkaban/azkaban-web-server-3.91.0/conf/log4j.properties

# Set the log level
log4j.rootLogger=INFO, file, console

# File appender
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=/data/azkaban/logs/azkaban-webserver.log
log4j.appender.file.MaxFileSize=100MB
log4j.appender.file.MaxBackupIndex=10
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

# Console appender
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

6. Azkaban Web Console Configuration

Azkaban provides a web console for managing and monitoring workflows. This section describes how to configure and use it.

6.1 Access the Web Console

# Start the executor server
# cd /data/azkaban/azkaban-exec-server-3.91.0
# ./bin/start-exec.sh

# Activate the executor server
# curl -G "http://localhost:12321/executor?action=activate"
{"status":"success"}

# Start the web server
# cd /data/azkaban/azkaban-web-server-3.91.0
# ./bin/start-web.sh

# Access the web console
# Open a browser and go to https://azkaban01.fgedu.net.cn:8443

# Verify access to the web console
# curl -k -I https://localhost:8443
HTTP/1.1 200 OK
Date: Fri, 05 Apr 2024 10:00:00 GMT
Content-Type: text/html;charset=UTF-8
Content-Length: 12345
Server: Jetty(9.4.43.v20210629)

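6.2 Web Console Features

Main features of the web console:
1. Project management: create, upload, and manage projects
2. Workflow management: create, edit, execute, and monitor workflows
3. Execution history: view workflow execution history and logs
4. Scheduling: set up timed execution of workflows
5. User management: manage users and permissions
6. System management: view system status and configuration

Web console login:
- User name: admin
- Password: admin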

6.3 Secure the Web Console

# Edit the user configuration (replace the default passwords)
# vi /data/azkaban/azkaban-web-server-3.91.0/conf/azkaban-users.xml

<azkaban-users>
  <user groups="admin" password="admin123" roles="admin" username="admin"/>
  <user groups="users" password="user123" roles="reader" username="user"/>
  <user groups="developers" password="dev123" roles="reader,writer" username="dev"/>
  <role name="admin" permissions="ADMIN"/>
  <role name="reader" permissions="READ"/>
  <role name="writer" permissions="WRITE"/>
  <role name="metrics" permissions="METRICS"/>
</azkaban-users>

# Configure HTTPS
# vi /data/azkaban/azkaban-web-server-3.91.0/conf/azkaban.properties

web.useHttps=true
web.ssl.port=8443
web.keystore.file=/data/azkaban/azkaban-web-server-3.91.0/keystore
web.keystore.password=azkaban

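7. Azkaban Workflows in Practice

This section walks through creating and running an Azkaban workflow. Note that the official Azkaban documentation describes Flow 2.0 projects as YAML-based (a flow20.project file plus .flow files), so check the XML layout used below against the format your Azkaban version accepts.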

7.1 Create the Workflow Definition

# Create the workflow directory
# mkdir -p /data/azkaban/projects/wordcount

# Create flow20.xml
# vi /data/azkaban/projects/wordcount/flow20.xml

<?xml version="1.0" encoding="UTF-8"?>
<azkaban-flow xmlns="http://azkaban.github.io/azkaban-flow/2.0">
  <name>WordCount Flow</name>
  <nodes>
    <node name="prepare-input" type="command">
      <config>
        <property name="command">hdfs dfs -mkdir -p /user/azkaban/wordcount/input</property>
      </config>
      <edges>
        <edge to="copy-input" />
      </edges>
    </node>
    <node name="copy-input" type="command">
      <config>
        <property name="command">hdfs dfs -copyFromLocal -f /data/azkaban/projects/wordcount/input.txt /user/azkaban/wordcount/input/</property>
      </config>
      <edges>
        <edge to="wordcount" />
      </edges>
    </node>
    <node name="wordcount" type="command">
      <config>
        <property name="command">hadoop jar /data/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar wordcount /user/azkaban/wordcount/input /user/azkaban/wordcount/output</property>
      </config>
      <edges>
        <edge to="check-output" />
      </edges>
    </node>
    <node name="check-output" type="command">
      <config>
        <property name="command">hdfs dfs -cat /user/azkaban/wordcount/output/part-r-00000</property>
      </config>
    </node>
  </nodes>
</azkaban-flow>

# Create the input file
# echo "hello world hello azkaban" > /data/azkaban/projects/wordcount/input.txt

# Create the project properties file
# vi /data/azkaban/projects/wordcount/project.properties

name=WordCount
version=1.0
description=WordCount Example

# Package the project
# cd /data/azkaban/projects/wordcount
# zip -r wordcount.zip .

# Verify the archive
# ls -la wordcount.zip
-rw-r--r-- 1 root root 1024 Apr 5 10:00 wordcount.zip

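7.2 Upload and Run the Workflow

# Upload the project to Azkaban
# Open the web console: https://azkaban01.fgedu.net.cn:8443
# Log in with user name admin and password admin
# Click "Projects" -> "Create Project"
# Project name: WordCount
# Project description: WordCount Example
# Click "Create"
# Click "Upload" -> select wordcount.zip -> click "Upload"

# Run the workflow
# Click the project name "WordCount"
# Click the workflow name "WordCount Flow"
# Click "Execute Flow"
# Click "Submit"

# Monitor the workflow execution
# Click "Executions" -> check the execution status
# Click the execution ID -> view the detailed logs

# Sample output: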
24/04/05 10:00:00 INFO [JobRunnerThread] Starting job prepare-input
24/04/05 10:00:01 INFO [JobRunnerThread] Job prepare-input succeeded
24/04/05 10:00:01 INFO [JobRunnerThread] Starting job copy-input
24/04/05 10:00:02 INFO [JobRunnerThread] Job copy-input succeeded
24/04/05 10:00:02 INFO [JobRunnerThread] Starting job wordcount
24/04/05 10:00:30 INFO [JobRunnerThread] Job wordcount succeeded
24/04/05 10:00:30 INFO [JobRunnerThread] Starting job check-output
24/04/05 10:00:31 INFO [JobRunnerThread] Job check-output succeeded
24/04/05 10:00:31 INFO [FlowRunner] Flow WordCount Flow completed successfully

# Verify the output (the reducer writes keys in sorted order)
# hdfs dfs -cat /user/azkaban/wordcount/output/part-r-00000
azkaban 1
hello 2
world 1

7.3 Create a Scheduled Workflow

# Edit the workflow definition
# vi /data/azkaban/projects/wordcount/flow20.xml

<?xml version="1.0" encoding="UTF-8"?>
<azkaban-flow xmlns="http://azkaban.github.io/azkaban-flow/2.0">
  <name>WordCount Flow</name>
  <config>
    <property name="schedule">0 0 * * *</property>
  </config>
  <nodes>

  </nodes>
</azkaban-flow>

# Repackage the project
# cd /data/azkaban/projects/wordcount
# zip -r wordcount.zip .

# Upload to Azkaban
# Open the web console -> project WordCount -> Upload -> select wordcount.zip -> Upload

# Set the schedule
# Click the workflow name "WordCount Flow"
# Click "Schedule"
# Set the schedule time: every day at 00:00
# Click "Schedule"

# Review the schedule
# Click "Schedules" -> view the scheduled flows
# Click the schedule ID -> view the details

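8. Azkaban Performance Tuning

In production, Azkaban should be tuned to improve workflow scheduling throughput.

8.1 Executor Tuning

# Edit the executor server configuration
# vi /data/azkaban/azkaban-exec-server-3.91.0/conf/azkaban.properties

# Tune the executor parameters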
executor.maxThreads=200
executor.flow.threads=100
executor.queue.size=5000
executor.process.threads=100
executor.flow.threadpool.core.size=50
executor.flow.threadpool.max.size=200
executor.flow.threadpool.keepalive.seconds=120
executor.cleanup.retry.interval.millis=60000
executor.cleanup.retry.count=3
executor.flow.lru.size=2000

8.2 Database Tuning

# MySQL tuning
# vi /etc/my.cnf

[mysqld]
binlog_format = ROW
innodb_buffer_pool_size = 2G
innodb_log_file_size = 512M
innodb_flush_log_at_trx_commit = 2
innodb_max_dirty_pages_pct = 75
innodb_file_per_table = 1
max_connections = 500
# query_cache_size and query_cache_type are not set: the query cache was removed in MySQL 8.0, and mysqld will not start with these options present

# Restart the MySQL service
# systemctl restart mysqld

# Optimize the Azkaban database tables
# mysql -u azkaban -pazkaban123 -e "USE azkaban; OPTIMIZE TABLE execution_flows; OPTIMIZE TABLE execution_jobs; OPTIMIZE TABLE project_flows;"

# Check table sizes
# mysql -u azkaban -pazkaban123 -e "USE azkaban; SELECT table_name, round(((data_length + index_length) / 1024 / 1024), 2) as 'Size (MB)' FROM information_schema.tables WHERE table_schema = 'azkaban' ORDER BY (data_length + index_length) DESC LIMIT 10;"

8.3 Memory Tuning

# Edit the executor server memory settings
# vi /data/azkaban/azkaban-exec-server-3.91.0/bin/start-exec.sh

# Adjust the JVM parameters to the server's memory
# (Java 8 ignores -XX:MaxPermSize, so -XX:MaxMetaspaceSize is used instead)
export JAVA_OPTS="-Xms4G -Xmx8G -XX:MaxMetaspaceSize=1024m -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data/azkaban/logs"

# Edit the web server memory settings
# vi /data/azkaban/azkaban-web-server-3.91.0/bin/start-web.sh

# Adjust the JVM parameters to the server's memory
export JAVA_OPTS="-Xms4G -Xmx8G -XX:MaxMetaspaceSize=1024m -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data/azkaban/logs"

# Restart the executor server
# cd /data/azkaban/azkaban-exec-server-3.91.0
# ./bin/shutdown-exec.sh
# ./bin/start-exec.sh

# Activate the executor server
# curl -G "http://localhost:12321/executor?action=activate"

# Restart the web server
# cd /data/azkaban/azkaban-web-server-3.91.0
# ./bin/shutdown-web.sh
# ./bin/start-web.sh

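8.4 Workflow Tuning

# Workflow tuning recommendations:
1. Reduce workflow complexity and avoid excessive node counts
2. Run independent tasks as parallel nodes
3. Set sensible task timeouts
4. Process data incrementally rather than in full
5. Tune task execution parameters
6. Schedule runs outside peak hours
7. Use job type plugins to improve execution efficiency
8. Purge historical execution data regularly

# Purge historical execution data
# (end_time holds epoch milliseconds, hence the * 1000; end_time > 0 skips flows that have not finished)
# mysql -u azkaban -pazkaban123 -e "USE azkaban; DELETE FROM execution_flows WHERE end_time > 0 AND end_time < UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 30 DAY)) * 1000;"

# Check system status
# curl -G "http://localhost:12321/executor?action=getStats"
{"stats":{"runningFlows":0,"queuedFlows":0,"completedFlows":1000,"failedFlows":100}}

Production recommendations: size the executor parameters and JVM memory to the server hardware and the number of workflows. Purge historical execution data regularly so that oversized tables do not degrade performance. Design workflows sensibly and remove unnecessary nodes.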
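
When purging old executions from a script, the retention cutoff has to be in the same unit as the end_time column. The sketch below assumes end_time holds epoch milliseconds (Azkaban records times with Java's millisecond clock) and only builds the SQL text; `cutoff_epoch_ms` and `build_cleanup_sql` are hypothetical helper names, and nothing is executed against the database.

```shell
#!/bin/bash
# Hypothetical helpers for building a retention-based cleanup statement.
# Assumption: execution_flows.end_time is stored in epoch milliseconds.

# cutoff_epoch_ms DAYS [NOW_SECONDS]
# Prints the epoch-millisecond timestamp DAYS days before NOW_SECONDS
# (NOW_SECONDS defaults to the current time).
cutoff_epoch_ms() {
    local days="$1" now="${2:-$(date +%s)}"
    echo $(( (now - days * 86400) * 1000 ))
}

# build_cleanup_sql DAYS [NOW_SECONDS]
# Prints a DELETE statement for flows that finished before the cutoff.
# "end_time > 0" leaves unfinished flows untouched.
build_cleanup_sql() {
    printf 'DELETE FROM execution_flows WHERE end_time > 0 AND end_time < %s;\n' \
        "$(cutoff_epoch_ms "$@")"
}

# Usage: review the statement first, then pipe it to mysql.
# build_cleanup_sql 30
# build_cleanup_sql 30 | mysql -u azkaban -p azkaban
```

Printing the statement before running it makes it easy to review the cutoff, and a SELECT COUNT(*) with the same WHERE clause is a cheap dry run.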

9. Azkaban Upgrade and Migration

This section covers upgrading Azkaban to a new version and migrating its data.

9.1 Azkaban Version Upgrade

# Back up the current Azkaban configuration
# cp -r /data/azkaban/azkaban-web-server-3.91.0/conf /backup/azkaban_web_conf_$(date +%Y%m%d)
# cp -r /data/azkaban/azkaban-exec-server-3.91.0/conf /backup/azkaban_exec_conf_$(date +%Y%m%d)

# Back up the Azkaban database
# mysqldump -u azkaban -pazkaban123 azkaban > /backup/azkaban_db_$(date +%Y%m%d).sql

# Stop the running Azkaban services
# cd /data/azkaban/azkaban-exec-server-3.91.0
# ./bin/shutdown-exec.sh

# cd /data/azkaban/azkaban-web-server-3.91.0
# ./bin/shutdown-web.sh

# Download the new Azkaban release (substitute the target version for 3.91.0 in the commands below)
# cd /tmp
# wget https://github.com/azkaban/azkaban/archive/refs/tags/3.91.0.tar.gz

# Unpack the new release
# tar -xzf 3.91.0.tar.gz
# cd azkaban-3.91.0

# Build Azkaban
# ./gradlew build -x test

# Unpack the new executor server
# tar -xzf azkaban-exec-server/build/distributions/azkaban-exec-server-3.91.0.tar.gz -C /data/azkaban/

# Unpack the new web server
# tar -xzf azkaban-web-server/build/distributions/azkaban-web-server-3.91.0.tar.gz -C /data/azkaban/

# Restore the configuration files
# cp -r /backup/azkaban_exec_conf_$(date +%Y%m%d)/* /data/azkaban/azkaban-exec-server-3.91.0/conf/
# cp -r /backup/azkaban_web_conf_$(date +%Y%m%d)/* /data/azkaban/azkaban-web-server-3.91.0/conf/

# Copy the SSL keystore from the old web server directory into the new one
# (source and target differ only when the version number changes; with identical paths cp reports that they are the same file and the step can be skipped)
# cp /data/azkaban/azkaban-web-server-3.91.0/keystore /data/azkaban/azkaban-web-server-3.91.0/

# Start the executor server
# cd /data/azkaban/azkaban-exec-server-3.91.0
# ./bin/start-exec.sh

# Activate the executor server
# curl -G "http://localhost:12321/executor?action=activate"

# Start the web server
# cd /data/azkaban/azkaban-web-server-3.91.0
# ./bin/start-web.sh

# Verify the upgrade
# curl -k -I https://localhost:8443
HTTP/1.1 200 OK

# Verify the service processes
# ps -ef | grep azkaban
9.2 Azkaban Configuration Migration

# Export the Azkaban configuration
# cp -r /data/azkaban/azkaban-web-server-3.91.0/conf /backup/azkaban_web_conf_export
# cp -r /data/azkaban/azkaban-exec-server-3.91.0/conf /backup/azkaban_exec_conf_export

# Export the project files
# cp -r /data/azkaban/projects /backup/azkaban_projects_export

# On the new server, import the configuration (transfer the export directories to it first)
# The trailing /. copies the directory contents instead of nesting the export directory inside conf
# cp -r /backup/azkaban_web_conf_export/. /data/azkaban/azkaban-web-server-3.91.0/conf/
# cp -r /backup/azkaban_exec_conf_export/. /data/azkaban/azkaban-exec-server-3.91.0/conf/

# Import the project files
# cp -r /backup/azkaban_projects_export/. /data/azkaban/projects/

# Verify the configuration
# curl -k -I https://localhost:8443
HTTP/1.1 200 OK

# Verify the projects
# ls -la /data/azkaban/projects/

10. Azkaban Backup and Recovery

This section covers backing up and restoring Azkaban.

10.1 Azkaban Configuration Backup

# Back up the Azkaban configuration
# tar -czf /backup/azkaban_config_$(date +%Y%m%d).tar.gz -C /data azkaban/azkaban-web-server-3.91.0/conf azkaban/azkaban-exec-server-3.91.0/conf

# Back up the project files
# tar -czf /backup/azkaban_projects_$(date +%Y%m%d).tar.gz -C /data azkaban/projects

# Back up the SSL keystore
# cp /data/azkaban/azkaban-web-server-3.91.0/keystore /backup/azkaban_keystore_$(date +%Y%m%d)

# Back up the logs
# tar -czf /backup/azkaban_logs_$(date +%Y%m%d).tar.gz -C /data azkaban/logs

10.2 Azkaban Database Backup

# Back up the Azkaban database
# mysqldump -u azkaban -pazkaban123 azkaban > /backup/azkaban_db_$(date +%Y%m%d).sql

# Compress the backup file
# gzip /backup/azkaban_db_$(date +%Y%m%d).sql

# Verify the backup file
# ls -la /backup/azkaban_db_$(date +%Y%m%d).sql.gz
-rw-r--r-- 1 root root 1234567 Apr 5 10:00 /backup/azkaban_db_20240405.sql.gz

# Scheduled backup script
# mkdir -p /data/azkaban/scripts
# vi /data/azkaban/scripts/backup_azkaban.sh

#!/bin/bash
BACKUP_DIR="/backup/azkaban_backups/$(date +%Y%m%d)"
AZKABAN_HOME="/data/azkaban"

# Create the backup directory
mkdir -p "$BACKUP_DIR"

# Back up the configuration (separate targets so the two conf directories do not overwrite each other)
cp -r "$AZKABAN_HOME/azkaban-web-server-3.91.0/conf" "$BACKUP_DIR/web_conf"
cp -r "$AZKABAN_HOME/azkaban-exec-server-3.91.0/conf" "$BACKUP_DIR/exec_conf"

# Back up the database
mysqldump -u azkaban -pazkaban123 azkaban > "$BACKUP_DIR/azkaban_db.sql"
gzip "$BACKUP_DIR/azkaban_db.sql"

# Back up the projects
cp -r "$AZKABAN_HOME/projects" "$BACKUP_DIR/"

# Back up the SSL keystore
cp "$AZKABAN_HOME/azkaban-web-server-3.91.0/keystore" "$BACKUP_DIR/"

# Back up the logs
cp -r "$AZKABAN_HOME/logs" "$BACKUP_DIR/"

echo "Azkaban backup completed successfully: $BACKUP_DIR"

# Make the script executable
# chmod +x /data/azkaban/scripts/backup_azkaban.sh

# Add the cron job
# crontab -e
0 2 * * * /data/azkaban/scripts/backup_azkaban.sh
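
A backup that cannot be read back is of little use, so a scheduled job can also check that the compressed dump is intact. The following sketch uses gzip's built-in integrity test (`gzip -t`); `verify_gzip_backup` is a hypothetical helper name, and the check only proves that the archive is readable, not that the SQL inside it is complete.

```shell
#!/bin/bash
# Hypothetical helper: sanity-check a gzip-compressed backup file.

# verify_gzip_backup FILE
# Returns 0 only if FILE exists, is non-empty, and passes "gzip -t".
verify_gzip_backup() {
    local file="$1"
    [ -s "$file" ] && gzip -t "$file" 2>/dev/null
}

# Usage, for example at the end of a backup script:
# dump="/backup/azkaban_backups/$(date +%Y%m%d)/azkaban_db.sql.gz"
# if verify_gzip_backup "$dump"; then
#     echo "backup verified: $dump"
# else
#     echo "backup is missing or corrupt: $dump" >&2
# fi
```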

10.3 Azkaban Recovery

# Stop the Azkaban services
# cd /data/azkaban/azkaban-exec-server-3.91.0
# ./bin/shutdown-exec.sh

# cd /data/azkaban/azkaban-web-server-3.91.0
# ./bin/shutdown-web.sh

# Restore the database (the dump was compressed with gzip)
# gunzip -c /backup/azkaban_db_20240405.sql.gz | mysql -u azkaban -pazkaban123 azkaban

# Restore the configuration (the archive holds both conf directories relative to /data)
# tar -xzf /backup/azkaban_config_20240405.tar.gz -C /data

# Restore the projects
# tar -xzf /backup/azkaban_projects_20240405.tar.gz -C /data

# Restore the SSL keystore
# cp /backup/azkaban_keystore_20240405 /data/azkaban/azkaban-web-server-3.91.0/keystore

# Start the executor server
# cd /data/azkaban/azkaban-exec-server-3.91.0
# ./bin/start-exec.sh

# Activate the executor server
# curl -G "http://localhost:12321/executor?action=activate"

# Start the web server
# cd /data/azkaban/azkaban-web-server-3.91.0
# ./bin/start-web.sh

# Verify the recovery
# curl -k -I https://localhost:8443
HTTP/1.1 200 OK

# Verify the projects
# ls -la /data/azkaban/projects/

10.4 Azkaban Monitoring Script

# Create the Azkaban monitoring script
# vi /data/azkaban/scripts/azkaban_monitor.sh

#!/bin/bash
AZKABAN_HOME="/data/azkaban"
LOG_FILE="/data/azkaban/logs/azkaban_monitor.log"
ALERT_EMAIL="admin@fgedu.net.cn"

# Check that the executor answers its statistics endpoint
check_executor_status() {
    echo "$(date): Checking executor status..." >> "$LOG_FILE"
    status=$(curl -s "http://localhost:12321/executor?action=getStats")
    if echo "$status" | grep -q "runningFlows"; then
        echo "$(date): Executor status: OK" >> "$LOG_FILE"
    else
        echo "$(date): Executor status: FAILED" >> "$LOG_FILE"
        echo "Azkaban executor failed: $status" | mail -s "Azkaban Alert" "$ALERT_EMAIL"
    fi
}

# Check that the web console returns HTTP 200
check_web_status() {
    echo "$(date): Checking web server status..." >> "$LOG_FILE"
    status=$(curl -k -s -o /dev/null -w "%{http_code}" https://localhost:8443)
    if [ "$status" -eq 200 ]; then
        echo "$(date): Web server status: OK" >> "$LOG_FILE"
    else
        echo "$(date): Web server status: FAILED" >> "$LOG_FILE"
        echo "Azkaban web server failed with status: $status" | mail -s "Azkaban Alert" "$ALERT_EMAIL"
    fi
}

# Check that the Azkaban database accepts connections
check_database_connection() {
    echo "$(date): Checking database connection..." >> "$LOG_FILE"
    if mysql -u azkaban -pazkaban123 -e "SELECT 1" > /dev/null 2>&1; then
        echo "$(date): Database connection OK" >> "$LOG_FILE"
    else
        echo "$(date): Database connection FAILED" >> "$LOG_FILE"
        echo "Azkaban database connection failed" | mail -s "Azkaban Alert" "$ALERT_EMAIL"
    fi
}

# Alert when more than 100 flows are running at once
check_running_flows() {
    echo "$(date): Checking running flows..." >> "$LOG_FILE"
    status=$(curl -s "http://localhost:12321/executor?action=getStats")
    running_flows=$(echo "$status" | grep -o '"runningFlows":[0-9]*' | cut -d: -f2)
    # Default to 0 so the numeric comparison below cannot fail on an empty value
    running_flows=${running_flows:-0}
    echo "$(date): Running flows: $running_flows" >> "$LOG_FILE"
    if [ "$running_flows" -gt 100 ]; then
        echo "$(date): Too many running flows: $running_flows" >> "$LOG_FILE"
        echo "Azkaban has too many running flows: $running_flows" | mail -s "Azkaban Alert" "$ALERT_EMAIL"
    fi
}

main() {
    check_executor_status
    check_web_status
    check_database_connection
    check_running_flows
}

main

# Make the script executable
# chmod +x /data/azkaban/scripts/azkaban_monitor.sh

# Add the cron job
# crontab -e
*/15 * * * * /data/azkaban/scripts/azkaban_monitor.sh

Production recommendations: back up the Azkaban configuration and database regularly, ideally with one full backup per day. Run the monitoring script every 15 minutes so that problems are detected and handled promptly. Always stop the Azkaban services before a restore to avoid inconsistent data.
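
The running-flow threshold check can be isolated into small functions that are easy to test without a live server. The sketch below assumes the JSON shape shown in this tutorial's getStats sample output (a "runningFlows" field with an integer value); `extract_running_flows` and `too_many_flows` are hypothetical names, and a proper JSON parser such as jq would be more robust where one is available.

```shell
#!/bin/bash
# Hypothetical helpers for the running-flow alert threshold.
# Assumption: the statistics JSON contains "runningFlows":<integer>.

# extract_running_flows JSON
# Prints the integer after "runningFlows": or 0 when the field is absent,
# so callers can always do a numeric comparison.
extract_running_flows() {
    local n
    n=$(printf '%s\n' "$1" | sed -n 's/.*"runningFlows":[ ]*\([0-9][0-9]*\).*/\1/p')
    echo "${n:-0}"
}

# too_many_flows JSON LIMIT
# Returns 0 when the running-flow count is strictly greater than LIMIT.
too_many_flows() {
    [ "$(extract_running_flows "$1")" -gt "$2" ]
}

# Usage with a live executor (uncomment to run):
# stats=$(curl -s "http://localhost:12321/executor?action=getStats")
# too_many_flows "$stats" 100 && echo "alert: $(extract_running_flows "$stats") flows running"
```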

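With the steps above, the installation and configuration, performance tuning, upgrade and migration, and backup and recovery of Azkaban are complete. As a workflow scheduler, Azkaban coordinates and manages the execution of many kinds of jobs efficiently and is a key task-scheduling component of a big data platform.

This article was compiled and published by 风哥教程 (Fengge Tutorials) for study and testing purposes only. Please credit the source when republishing: http://www.fgedu.net.cn/10327.html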
