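This article is a hands-on guide to a Hadoop big data security architecture. It covers Kerberos authentication, Ranger/Sentry authorization, and security configuration for HDFS, Hive, and Kafka, drawing on the official Kerberos, Apache Ranger, and Apache Sentry documentation. It is aimed at big data security engineers and operations engineers.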
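Part01-Basic Concepts and Theory
1.1 Big Data Security Overview
Big data security covers four main areas: authentication, authorization, audit, and encryption.
- Authentication: verify who the user is
- Authorization: control what the user is allowed to do
- Audit: record what the user did
- Encryption: protect data in transit and at rest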
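1.2 Big Data Security Architecture
The big data security architecture has four layers:
Authentication layer:
- Kerberos
- LDAP
- SSO
- OAuth
Authorization layer:
- Apache Ranger
- Apache Sentry
- HDFS ACL
- Hive authorization
Audit layer:
- Ranger Audit
- HDFS Audit
- Hive Audit
- Custom auditing
Encryption layer:
- TLS/SSL
- HDFS transparent encryption
- Column-level encryption
- Application-layer encryption
1.3 Kerberos Authentication
Kerberos is a network authentication protocol that uses tickets to prove a user's identity. A client first obtains a ticket-granting ticket (TGT) from the KDC and then uses it to request service tickets, so passwords are never sent to the services themselves.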
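As a small illustration of the ticket model, the sketch below requests a ticket from a keytab only when the credential cache does not already hold a valid one. `ensure_ticket` is a hypothetical helper name, and the keytab path and principal in the usage comment are placeholders; it assumes the MIT Kerberos `klist` and `kinit` clients are installed.

```shell
# Sketch: obtain a Kerberos ticket from a keytab only if none is cached.
# ensure_ticket is a hypothetical helper; arguments are placeholders.
ensure_ticket() (
    keytab=$1
    principal=$2
    # `klist -s` prints nothing and exits 0 only when the credential
    # cache can be read and its ticket has not expired.
    if klist -s; then
        echo "valid ticket already cached"
    else
        kinit -kt "$keytab" "$principal" && echo "ticket obtained for $principal"
    fi
)

# Example call (not executed here):
#   ensure_ticket /etc/security/keytabs/hdfs.keytab hdfs/fgedu-nn@FGEDU.NET.CN
```

Calling the helper at the start of a scheduled job keeps the job from failing when an earlier ticket has expired.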
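Part02-Production Planning and Recommendations
2.1 Security Architecture Planning
Key points for planning the security architecture:
KDC servers:
- Count: 2 (primary and standby)
- Spec: 4 cores, 8 GB RAM
- Component: Kerberos KDC
Ranger servers:
- Count: 2 (high availability)
- Spec: 8 cores, 16 GB RAM
- Component: Ranger Admin
Database servers:
- Count: 1-2
- Component: MySQL/PostgreSQL
- Purpose: Ranger metadata and audit logs
# Security planning
Authentication:
- Kerberos authentication
- LDAP user management
- SSO integration
Authorization:
- Unified authorization through Ranger
- Principle of least privilege
- Regular permission audits
Audit:
- Ranger Audit
- Long-term log retention
- Regular audit reports
Encryption:
- TLS/SSL encryption in transit
- HDFS transparent encryption
- Encryption of sensitive data
2.2 Security Zones
The environment is divided into the following security zones:
- DMZ: externally facing services, API gateway
- Application zone: application services, BI reporting
- Big data zone: the Hadoop cluster
- Management zone: management services, monitoring and alerting
- Database zone: metadata databases, business databases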
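One way to enforce the zone boundaries on the big data hosts is a source-restricted firewall rule per service port. The sketch below only prints firewalld commands (a dry run) and assumes firewalld is the host firewall. The ports are the ones configured later in this article (88 for the KDC, 6080 for Ranger Admin, 8020 for the NameNode, 9092 for Kafka). The subnet `10.0.2.0/24` and the function name `print_zone_rules` are hypothetical examples.

```shell
# Dry-run sketch: print firewalld rich rules that let one source zone
# (given as a CIDR) reach the service ports used in this article.
# Nothing is executed; review the output before running it on a host.
print_zone_rules() (
    src=$1   # CIDR of the zone that is allowed in (hypothetical value below)
    for entry in 88/tcp 88/udp 6080/tcp 8020/tcp 9092/tcp; do
        port=${entry%/*}
        proto=${entry#*/}
        printf "firewall-cmd --permanent --add-rich-rule='rule family=\"ipv4\" source address=\"%s\" port port=\"%s\" protocol=\"%s\" accept'\n" \
            "$src" "$port" "$proto"
    done
    echo "firewall-cmd --reload"
)

print_zone_rules 10.0.2.0/24
```

Printing the commands instead of running them makes it easy to review the rule set per zone before applying it.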
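2.3 Security Standards
Account standards:
User accounts:
- Naming: full name in pinyin, or employee ID
- Password: must meet complexity requirements and be rotated regularly
- Offboarding: remove the account promptly
Service accounts:
- Naming: hdfs, hive, yarn, and so on
- Password: use a complex password
- Rotate regularly
# Permission standards
Principle of least privilege:
- Grant only the permissions that are needed
- Review permissions regularly
- Revoke permissions promptly when no longer needed
# Audit standards
Audit logs:
- Retention: at least 6 months
- Back up regularly
- Audit regularly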
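The retention rule above can be checked with a short script. The sketch below lists audit log files that were last modified more than a given number of days ago, so they can be archived or backed up before cleanup. The function name `list_expired_audit_logs` and the directory in the usage comment are hypothetical, and 180 days is used as an approximation of six months.

```shell
# Sketch: list audit log files older than the retention period.
# list_expired_audit_logs DIR [DAYS] prints matching file paths, one per line.
list_expired_audit_logs() (
    dir=$1
    days=${2:-180}   # roughly six months
    # -mtime +N matches files last modified more than N days ago.
    find "$dir" -type f -mtime +"$days" -print
)

# Example call (hypothetical path, not executed here):
#   list_expired_audit_logs /var/log/ranger/audit 180
```

The function only lists files. Deleting or moving them is left as a separate, deliberate step.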
Part03-Production Implementation Plan
3.1 Kerberos Deployment
3.1.1 Kerberos KDC Deployment
# 1. Install the Kerberos packages
yum install -y krb5-server krb5-libs krb5-workstation
# 2. Configure krb5.conf
cat > /etc/krb5.conf << 'EOF'
[libdefaults]
default_realm = FGEDU.NET.CN
dns_lookup_realm = false
dns_lookup_kdc = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
[realms]
FGEDU.NET.CN = {
kdc = fgedu-kdc01.fgedu.net.cn
kdc = fgedu-kdc02.fgedu.net.cn
admin_server = fgedu-kdc01.fgedu.net.cn
}
[domain_realm]
.fgedu.net.cn = FGEDU.NET.CN
fgedu.net.cn = FGEDU.NET.CN
EOF
# 3. Configure kdc.conf
cat > /var/kerberos/krb5kdc/kdc.conf << 'EOF'
[kdcdefaults]
kdc_ports = 88
kdc_tcp_ports = 88
[realms]
FGEDU.NET.CN = {
acl_file = /var/kerberos/krb5kdc/kadm5.acl
dict_file = /usr/share/dict/words
admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
supported_enctypes = aes256-cts:normal aes128-cts:normal
}
EOF
# 4. Configure kadm5.acl
cat > /var/kerberos/krb5kdc/kadm5.acl << 'EOF'
*/admin@FGEDU.NET.CN *
EOF
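# 5. Create the Kerberos database
kdb5_util create -s -r FGEDU.NET.CN
# Enter the master password when prompted (this walkthrough uses fgedu123)
# 6. Create the administrator principal
kadmin.local -q "addprinc admin/admin"
# Enter the password when prompted (this walkthrough uses fgedu123)
# 7. Start Kerberos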
systemctl start krb5kdc
systemctl start kadmin
systemctl enable krb5kdc
systemctl enable kadmin
# 8. Create the service principals
kadmin.local -q "addprinc -randkey hdfs/fgedu-nn@FGEDU.NET.CN"
kadmin.local -q "addprinc -randkey yarn/fgedu-rm@FGEDU.NET.CN"
kadmin.local -q "addprinc -randkey hive/fgedu-hs2@FGEDU.NET.CN"
kadmin.local -q "addprinc -randkey HTTP/fgedu-nn@FGEDU.NET.CN"
# 9. Export the keytabs
kadmin.local -q "xst -k hdfs.keytab hdfs/fgedu-nn"
kadmin.local -q "xst -k yarn.keytab yarn/fgedu-rm"
kadmin.local -q "xst -k hive.keytab hive/fgedu-hs2"
kadmin.local -q "xst -k HTTP.keytab HTTP/fgedu-nn"
# 10. Verify a keytab
kinit -kt hdfs.keytab hdfs/fgedu-nn@FGEDU.NET.CN
klist
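Steps 8 and 9 repeat the same two commands for every service. When there are many hosts, the commands can be generated from a list. The sketch below is a dry run that only prints `kadmin.local` commands for review; `gen_kadmin_cmds` is a hypothetical helper name, and the keytab directory `/etc/security/keytabs` is taken from the HDFS and Hive configuration later in this article.

```shell
# Dry-run sketch: print addprinc/xst commands for "service host" pairs on stdin.
# gen_kadmin_cmds REALM KEYTAB_DIR
gen_kadmin_cmds() (
    realm=$1
    keytab_dir=$2
    while read -r service host; do
        [ -n "$service" ] || continue          # skip blank lines
        princ="${service}/${host}@${realm}"
        echo "kadmin.local -q \"addprinc -randkey ${princ}\""
        echo "kadmin.local -q \"xst -k ${keytab_dir}/${service}.keytab ${princ}\""
    done
)

gen_kadmin_cmds FGEDU.NET.CN /etc/security/keytabs <<'EOF'
hdfs fgedu-nn
yarn fgedu-rm
hive fgedu-hs2
EOF
```

After the keytabs are created, they should be readable only by the owning service account, for example mode 400 with the matching owner.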
3.2 Ranger Deployment
3.2.1 Ranger Admin Deployment
# 1. Download and unpack Ranger Admin
cd /bigdata/app
wget https://archive.apache.org/dist/ranger/2.4.0/ranger-2.4.0-admin.tar.gz
tar -zxvf ranger-2.4.0-admin.tar.gz
ln -s ranger-2.4.0-admin ranger-admin
# 2. Create the database
mysql -u root -p
CREATE DATABASE ranger DEFAULT CHARACTER SET utf8;
CREATE USER 'ranger'@'%' IDENTIFIED BY 'fgedu123';
GRANT ALL PRIVILEGES ON ranger.* TO 'ranger'@'%';
FLUSH PRIVILEGES;
# 3. Configure install.properties
cd /bigdata/app/ranger-admin
vi install.properties
# Key settings
db_root_user=root
db_root_password=fgedu123
db_host=fgedu-mysql
db_name=ranger
db_user=ranger
db_password=fgedu123
rangerAdmin_host=fgedu-ranger01
rangerAdmin_httpPort=6080
rangerAdmin_httpsPort=6182
# 4. Install Ranger Admin
./setup.sh
# 5. Start Ranger Admin
ranger-admin start
# 6. Open the Ranger Web UI
# http://fgedu-ranger01:6080
# Default username: admin
# Default password: admin (change it right after the first login)
# 7. Install the Ranger HDFS plugin
cd /bigdata/app
wget https://archive.apache.org/dist/ranger/2.4.0/ranger-2.4.0-hdfs-plugin.tar.gz
tar -zxvf ranger-2.4.0-hdfs-plugin.tar.gz
ln -s ranger-2.4.0-hdfs-plugin ranger-hdfs-plugin
cd /bigdata/app/ranger-hdfs-plugin
vi install.properties
# Settings
POLICY_MGR_URL=http://fgedu-ranger01:6080
REPOSITORY_NAME=fgedu_hdfs
XAAUDIT.SOLR.ENABLE=false
XAAUDIT.HDFS.ENABLE=true
XAAUDIT.HDFS.HDFS_DIR=hdfs://fgedu-nn:8020/bigdata/fgdata/ranger/audit
HDFS_HOME=/bigdata/app/hadoop
HDFS_CONF_DIR=/bigdata/app/hadoop/etc/hadoop
./enable-hdfs-plugin.sh
# 8. Restart HDFS
su - hdfs
hdfs --daemon stop namenode
hdfs --daemon start namenode
hdfs --daemon stop datanode
hdfs --daemon start datanode
# 9. Create the HDFS service in Ranger
# Ranger Web UI
# Access Manager -> Resource Based Policies -> Add New Service
# Service Name: fgedu_hdfs
# Username: hdfs
# NameNode URL: hdfs://fgedu-nn:8020
3.3 Sentry Deployment
3.3.1 Sentry Deployment
# 1. Download and unpack Sentry
cd /bigdata/app
wget https://archive.apache.org/dist/sentry/2.1.0/apache-sentry-2.1.0-bin.tar.gz
tar -zxvf apache-sentry-2.1.0-bin.tar.gz
ln -s apache-sentry-2.1.0-bin sentry
# 2. Create the database
mysql -u root -p
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8;
CREATE USER 'sentry'@'%' IDENTIFIED BY 'fgedu123';
GRANT ALL PRIVILEGES ON sentry.* TO 'sentry'@'%';
FLUSH PRIVILEGES;
# 3. Configure sentry-site.xml
cat > /bigdata/app/sentry/conf/sentry-site.xml << 'EOF'
<?xml version="1.0"?>
<configuration>
<property>
<name>sentry.service.client.server.rpc-address</name>
<value>fgedu-sentry01</value>
</property>
<property>
<name>sentry.service.client.server.rpc-port</name>
<value>8038</value>
</property>
<property>
<name>sentry.store.jdbc.url</name>
<value>jdbc:mysql://fgedu-mysql:3306/sentry</value>
</property>
<property>
<name>sentry.store.jdbc.driver</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>sentry.store.jdbc.user</name>
<value>sentry</value>
</property>
<property>
<name>sentry.store.jdbc.password</name>
<value>fgedu123</value>
</property>
<property>
<name>sentry.verify.schema.metrics</name>
<value>false</value>
</property>
</configuration>
EOF
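# 4. Initialize the Sentry database schema
sentry --command schema-tool --conffile /bigdata/app/sentry/conf/sentry-site.xml --dbType mysql --initSchema
# 5. Start the Sentry service
sentry --command service --conffile /bigdata/app/sentry/conf/sentry-site.xml
# 6. Configure Hive to use Sentry
# Add the following to hive-site.xml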
<property>
<name>hive.security.authorization.task.factory</name>
<value>org.apache.sentry.binding.hive.SentryHiveAuthorizationTaskFactoryImpl</value>
</property>
<property>
<name>hive.security.authorization.manager</name>
<value>org.apache.sentry.binding.hive.conf.HiveAuthzConf</value>
</property>
<property>
<name>hive.sentry.conf.url</name>
<value>file:///bigdata/app/sentry/conf/sentry-site.xml</value>
</property>
# 7. Restart Hive
hive --service hiveserver2 stop
hive --service hiveserver2 start
# 8. Grant privileges with Sentry
# Run the following in Hive
CREATE ROLE fgedu_analyst;
GRANT ROLE fgedu_analyst TO GROUP fgedu;
GRANT SELECT ON DATABASE fgedu_db TO ROLE fgedu_analyst;
Part04-Production Cases and Walkthroughs
4.1 HDFS Security Configuration
4.1.1 HDFS Kerberos Authentication
# 1. Configure hdfs-site.xml
<property>
<name>dfs.permissions.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.encryption.key.provider.uri</name>
<value>kms://http@fgedu-kms:9600/kms</value>
</property>
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.kerberos.principal</name>
<value>hdfs/_HOST@FGEDU.NET.CN</value>
</property>
<property>
<name>dfs.namenode.keytab.file</name>
<value>/etc/security/keytabs/hdfs.keytab</value>
</property>
<property>
<name>dfs.datanode.kerberos.principal</name>
<value>hdfs/_HOST@FGEDU.NET.CN</value>
</property>
<property>
<name>dfs.datanode.keytab.file</name>
<value>/etc/security/keytabs/hdfs.keytab</value>
</property>
# 2. Configure core-site.xml
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
# 3. Configure HDFS ACLs
hdfs dfs -setfacl -m user:fgedu:rwx /bigdata/fgdata
hdfs dfs -setfacl -m group:fgedu:r-x /bigdata/fgdata
hdfs dfs -getfacl /bigdata/fgdata
# 4. HDFS transparent encryption
hadoop key create fgedu_key -size 256
hdfs crypto -createZone -keyName fgedu_key -path /bigdata/fgdata/encrypted
hdfs crypto -listZones
# 5. Ranger HDFS authorization
# In the Ranger Web UI
# Create a policy
# Resource Path: /bigdata/fgdata/*
# User: fgedu
# Permissions: Read, Write, Execute
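The same policy can also be described as JSON and submitted through Ranger's public REST API instead of the Web UI. The sketch below only builds and prints the JSON body. The function name `ranger_hdfs_policy_json` and the policy name `fgdata_rwx` are hypothetical, and the recursive path `/bigdata/fgdata` is assumed to be equivalent to the `/bigdata/fgdata/*` resource above. The field names should be verified against the Ranger version in use.

```shell
# Sketch: build the JSON body for an HDFS policy granting read/write/execute.
# ranger_hdfs_policy_json SERVICE POLICY_NAME PATH USER
ranger_hdfs_policy_json() (
    service=$1; name=$2; path=$3; user=$4
    cat <<EOF
{
  "service": "$service",
  "name": "$name",
  "isEnabled": true,
  "resources": {
    "path": { "values": ["$path"], "isRecursive": true }
  },
  "policyItems": [
    {
      "users": ["$user"],
      "accesses": [
        { "type": "read", "isAllowed": true },
        { "type": "write", "isAllowed": true },
        { "type": "execute", "isAllowed": true }
      ]
    }
  ]
}
EOF
)

ranger_hdfs_policy_json fgedu_hdfs fgdata_rwx /bigdata/fgdata fgedu

# To submit it (not run here), pipe the JSON to the public v2 policy endpoint:
#   ranger_hdfs_policy_json fgedu_hdfs fgdata_rwx /bigdata/fgdata fgedu |
#     curl -u admin -H 'Content-Type: application/json' -X POST -d @- \
#       http://fgedu-ranger01:6080/service/public/v2/api/policy
```

Keeping policies as generated JSON makes them easy to review and to store in version control alongside the cluster configuration.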
4.2 Hive Security Configuration
4.2.1 Hive Security Configuration
# 1. Configure hive-site.xml
<property>
<name>hive.security.authorization.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.security.authenticator.manager</name>
<value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>KERBEROS</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>hive/_HOST@FGEDU.NET.CN</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.keytab</name>
<value>/etc/security/keytabs/hive.keytab</value>
</property>
# 2. Ranger Hive authorization
# In the Ranger Web UI
# Create a policy
# Database: fgedu_db
# Table: *
# Column: *
# User: fgedu_analyst
# Permissions: Select, Create, Drop
# 3. Hive column-level permissions
# Configure in Ranger
# Column: user_id, name, phone
# Only the listed columns can be accessed
# 4. Hive row-level filtering
# Configure a Row Filter in Ranger
# WHERE city = '北京'
# 5. Hive data masking
# Configure a Data Mask in Ranger
# phone: xxx-xxxx-xxxx
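To make the masking result concrete, the sketch below reproduces the `xxx-xxxx-xxxx` output format on a sample value. It is only an illustration of what a masked phone number looks like to the querying user, not how Ranger implements masking. The function name `mask_digits` and the sample number are made up.

```shell
# Illustration only: replace every digit with "x" while keeping separators,
# which yields the xxx-xxxx-xxxx shape shown in the mask setting above.
mask_digits() (
    printf '%s\n' "$1" | sed 's/[0-9]/x/g'
)

mask_digits 138-1234-5678   # prints xxx-xxxx-xxxx
```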
4.3 Kafka Security Configuration
4.3.1 Kafka Security Configuration
# 1. Configure server.properties
listeners=SASL_PLAINTEXT://:9092
security.inter.broker.protocol=SASL_PLAINTEXT
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.enabled.mechanisms=PLAIN
allow.everyone.if.no.acl.found=false
# 2. Configure the JAAS file
cat > /bigdata/app/kafka/config/kafka_server_jaas.conf << 'EOF'
KafkaServer {
org.apache.kafka.common.security.plain.PlainLoginModule required
username="kafka"
password="fgedu123"
user_kafka="fgedu123"
user_fgedu="fgedu123";
};
EOF
# 3. Set KAFKA_OPTS
export KAFKA_OPTS="-Djava.security.auth.login.config=/bigdata/app/kafka/config/kafka_server_jaas.conf"
# 4. Start Kafka
kafka-server-start.sh -daemon /bigdata/app/kafka/config/server.properties
# 5. Create client.properties (used by the client commands below)
cat > client.properties << 'EOF'
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="fgedu" password="fgedu123";
EOF
# 6. Create the topic
kafka-topics.sh --create --topic fgedu_user_events --partitions 3 --replication-factor 2 --bootstrap-server fgedu-kafka01:9092 --command-config client.properties
# 7. Configure ACLs
kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=fgedu-zk:2181 --add --allow-principal User:fgedu --operation Read --operation Write --topic fgedu_user_events
# 8. List ACLs
kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=fgedu-zk:2181 --list --topic fgedu_user_events
# 9. Producer test
kafka-console-producer.sh --broker-list fgedu-kafka01:9092 --topic fgedu_user_events --producer.config client.properties
# 10. Consumer test
kafka-console-consumer.sh --bootstrap-server fgedu-kafka01:9092 --topic fgedu_user_events --from-beginning --consumer.config client.properties
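When ACLs are enforced, a consumer typically needs Read permission on its consumer group as well as on the topic. The topic ACL above covers only the topic, so the consumer test may be rejected until a group ACL is added. The sketch below is a dry run that prints both commands. `print_consumer_acl_cmds` is a hypothetical helper name and `fgedu_group` is a hypothetical group id, which would also have to be passed to the console consumer with `--group`.

```shell
# Dry-run sketch: print the kafka-acls.sh commands a consumer needs
# (Read on the topic and Read on the consumer group).
# print_consumer_acl_cmds USER TOPIC GROUP ZK_CONNECT
print_consumer_acl_cmds() (
    user=$1; topic=$2; group=$3; zk=$4
    base="kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=${zk} --add --allow-principal User:${user} --operation Read"
    echo "${base} --topic ${topic}"
    echo "${base} --group ${group}"
)

print_consumer_acl_cmds fgedu fgedu_user_events fgedu_group fgedu-zk:2181
```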
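Part05-Lessons Learned and Tips from 风哥
5.1 Security Best Practices
Security best practices:
- Least privilege: grant only the permissions that are needed
- Regular audits: review permissions and operation logs on a schedule
- Password management: use complex passwords and rotate them regularly
- Encryption in transit: use TLS/SSL for network traffic
- Encryption at rest: store sensitive data encrypted
- Monitoring and alerting: raise alerts on security events promptly
5.2 Troubleshooting Common Issues
# Common issue 1: Kerberos authentication fails
- Check the keytab file permissions
- Check time synchronization
- Check that the principal is correct
- Regenerate the keytab
- Check the logs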
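The first two checks in the list above can be scripted. The sketch below tests whether a keytab is readable only by its owner and whether two timestamps are within the allowed clock skew. Both function names are hypothetical, `stat -c` assumes GNU coreutils, and 300 seconds is the MIT Kerberos default for `clockskew`. How the KDC's time is obtained is left to the caller.

```shell
# Sketch: two quick checks for Kerberos authentication failures.

# keytab_mode_ok FILE: succeeds only if the file mode is 400 or 600.
keytab_mode_ok() (
    mode=$(stat -c '%a' "$1") || exit 2   # GNU stat; exit 2 if the file is unreadable
    case $mode in
        400|600) exit 0 ;;
        *)       exit 1 ;;
    esac
)

# clock_skew_ok LOCAL_EPOCH KDC_EPOCH [MAX_SECONDS]: succeeds if the two
# epoch timestamps differ by at most MAX_SECONDS (default 300).
clock_skew_ok() (
    local_ts=$1; kdc_ts=$2; max=${3:-300}
    diff=$((local_ts - kdc_ts))
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    [ "$diff" -le "$max" ]
)

# Example calls (not executed here):
#   keytab_mode_ok /etc/security/keytabs/hdfs.keytab || echo "keytab permissions too open"
#   clock_skew_ok "$(date +%s)" "$kdc_epoch" || echo "clock skew exceeds 300s"
```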
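# Common issue 2: Permission denied
- Check the Ranger/Sentry policies
- Check the HDFS permissions
- Check the user's groups
- Reconfigure the permissions
- Check the audit logs
# Common issue 3: SSL/TLS configuration fails
- Check the certificate
- Check the key
- Check the configuration
- Check the logs
- Regenerate the certificate
# Common issue 4: Audit logs are missing
- Check the audit configuration
- Check the storage path
- Check the permissions
- Check log rotation
- Check the service logs
# Common issue 5: User cannot log in
- Check the authentication configuration
- Check the user account
- Check the password
- Check the account status
- Check the authentication logs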
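5.3 Security Checklist
- [ ] Kerberos is running normally
- [ ] Ranger/Sentry is running normally
- [ ] HDFS permissions are configured correctly
- [ ] Hive permissions are configured correctly
- [ ] Kafka permissions are configured correctly
- [ ] Audit logging is working
- [ ] SSL/TLS is configured correctly
- [ ] Passwords meet the complexity requirements
- [ ] Permissions are audited regularly
- [ ] Logs are backed up regularly
- [ ] Monitoring and alerting are working
- [ ] Firewall rules are configured correctly
# Daily inspection items
1. Check the status of the authentication services
2. Check the status of the authorization services
3. Check the audit logs
4. Check the permission configuration
5. Check the SSL/TLS certificates
6. Check security alerts
7. Check the firewall rules
8. Check account status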
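This article was compiled and published by 风哥教程 and is intended for learning and testing only. Please credit the source when reposting: http://www.fgedu.net.cn/10327.html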
