1. 首页 > 软件安装教程 > 正文

Presto/Trino安装配置-Presto/Trino大数据查询安装配置_升级迁移详细过程

1. Presto/Trino概述与环境规划

Trino(原PrestoSQL)是一款开源的分布式SQL查询引擎,专为大规模数据分析设计。它支持跨数据源的联邦查询,可以查询Hive、MySQL、PostgreSQL、Kafka等多种数据源。更多学习教程www.fgedu.net.cn

1.1 Trino版本说明

Trino目前主要版本为430,本教程以Trino 430为例进行详细讲解。

# 查看Trino版本
$ /data/trino/bin/trino –version
Trino 430

# 查看Java版本
$ java -version
openjdk version “17.0.8” 2023-07-18 LTS
OpenJDK Runtime Environment (build 17.0.8+9-LTS)
OpenJDK 64-Bit Server VM (build 17.0.8+9-LTS, mixed mode, sharing)

1.2 环境规划

本次安装环境规划如下:

Coordinator节点:
主机名:trino-coordinator.fgedu.net.cn
IP地址:192.168.1.51
端口:8080

Worker节点1:
主机名:trino-worker01.fgedu.net.cn
IP地址:192.168.1.52

Worker节点2:
主机名:trino-worker02.fgedu.net.cn
IP地址:192.168.1.53

Worker节点3:
主机名:trino-worker03.fgedu.net.cn
IP地址:192.168.1.54

Trino版本:430
Java版本:OpenJDK 17
安装目录:/data/trino
数据目录:/data/trino/data

1.3 Trino核心特性

主要特点:
1. 高性能:基于内存的分布式计算
2. 多数据源:支持Hive、MySQL、PostgreSQL等
3. 标准SQL:支持ANSI SQL标准
4. 联邦查询:跨数据源联合查询
5. 可扩展:支持水平扩展
6. 安全性:支持Kerberos、LDAP认证
7. 易用性:支持JDBC/ODBC连接

架构组件:
– Coordinator:协调节点,负责查询解析、优化、调度
– Worker:工作节点,负责实际计算
– Connector:连接器,连接各种数据源

2. 硬件环境要求与检查

在安装Trino之前,需要对服务器硬件环境进行全面检查。学习交流加群风哥微信: itpux-com

2.1 最低硬件要求

Coordinator节点最低配置:
CPU:4核心
内存:16GB
磁盘:50GB

Worker节点最低配置:
CPU:8核心
内存:32GB
磁盘:100GB

推荐配置(生产环境):
Coordinator:
CPU:8核心以上
内存:32GB以上
磁盘:100GB SSD

Worker:
CPU:16核心以上
内存:64GB以上
磁盘:500GB SSD

2.2 系统环境检查

# 检查操作系统版本
# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.8 (Ootpa)

# 检查内核版本
# uname -r
4.18.0-477.27.1.el8_8.x86_64

# 检查内存信息
# free -h
total used free shared buff/cache available
Mem: 62Gi 2.0Gi 58Gi 256Mi 2.0Gi 59Gi
Swap: 31Gi 0B 31Gi

# 检查磁盘空间
# df -h /data
文件系统 容量 已用 可用 已用% 挂载点
/dev/mapper/vg_data-lv_data 500G 50G 450G 10% /data

2.3 Java环境安装

# 安装OpenJDK 17
# yum install -y java-17-openjdk java-17-openjdk-devel

# 输出示例:
Installed:
java-17-openjdk-17.0.8.0.7-1.el8.x86_64
java-17-openjdk-devel-17.0.8.0.7-1.el8.x86_64

Complete!

# 配置JAVA_HOME
# vi /etc/profile.d/java.sh

export JAVA_HOME=/usr/lib/jvm/java-17-openjdk
export PATH=$JAVA_HOME/bin:$PATH

# 使配置生效
# source /etc/profile.d/java.sh

# 验证Java版本
$ java -version
openjdk version “17.0.8” 2023-07-18 LTS
OpenJDK Runtime Environment (build 17.0.8+9-LTS)
OpenJDK 64-Bit Server VM (build 17.0.8+9-LTS, mixed mode, sharing)

# 配置系统参数
# vi /etc/sysctl.d/99-trino.conf

vm.swappiness = 1
vm.max_map_count = 262144
fs.file-max = 2097152

# 使配置生效
# sysctl -p /etc/sysctl.d/99-trino.conf

3. Trino安装步骤

本节详细介绍Trino的安装过程。学习交流加群风哥QQ113257174

3.1 创建用户和目录

# 创建trino用户
# useradd -r -s /bin/bash trino

# 创建目录
# mkdir -p /data/trino
# mkdir -p /data/trino/data
# mkdir -p /data/trino/etc
# mkdir -p /data/trino/etc/catalog
# mkdir -p /var/log/trino

# 设置权限
# chown -R trino:trino /data/trino
# chown -R trino:trino /var/log/trino

3.2 下载安装Trino

# 下载Trino
# cd /data
# wget https://repo1.maven.org/maven2/io/trino/trino-server/430/trino-server-430.tar.gz

# 解压安装
# tar -xzf trino-server-430.tar.gz
# mv trino-server-430 trino
# chown -R trino:trino /data/trino

# 验证安装
$ ls -la /data/trino/

# 输出示例:
total 12
drwxr-xr-x. 6 trino trino 4096 Apr 4 10:00 .
drwxr-xr-x. 3 root root 22 Apr 4 10:00 ..
drwxr-xr-x. 2 trino trino 4096 Apr 4 10:00 bin
drwxr-xr-x. 2 trino trino 4096 Apr 4 10:00 lib
drwxr-xr-x. 2 trino trino 4096 Apr 4 10:00 plugin

# 下载Trino CLI
# wget https://repo1.maven.org/maven2/io/trino/trino-cli/430/trino-cli-430-executable.jar
# mv trino-cli-430-executable.jar /data/trino/bin/trino
# chmod +x /data/trino/bin/trino
# chown trino:trino /data/trino/bin/trino

3.3 配置Coordinator节点

# 配置config.properties
# vi /data/trino/etc/config.properties

coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
discovery.uri=http://192.168.1.51:8080
query.max-memory-per-node=8GB
query.max-memory=50GB
query.max-total-memory-per-node=10GB

# 配置jvm.config
# vi /data/trino/etc/jvm.config

-server
-Xmx16G
-XX:InitialRAMPercentage=80
-XX:MaxRAMPercentage=80
-XX:G1HeapRegionSize=32M
-XX:+ExplicitGCInvokesConcurrent
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:-OmitStackTraceInFastThrow
-XX:ReservedCodeCacheSize=512M
-XX:PerMethodRecompilationCutoff=10000
-XX:PerBytecodeRecompilationCutoff=10000
-Djdk.attach.allowAttachSelf=true
-Djdk.nio.maxCachedBufferSize=2000000

# 配置node.properties
# vi /data/trino/etc/node.properties

node.environment=fgedu_trino
node.id=trino-coordinator-01
node.data-dir=/data/trino/data

# 配置log.properties
# vi /data/trino/etc/log.properties

io.trino=INFO

# 配置jmx连接器
# vi /data/trino/etc/catalog/jmx.properties

connector.name=jmx

3.4 配置Worker节点

# 在Worker节点配置
# vi /data/trino/etc/config.properties

coordinator=false
http-server.http.port=8080
discovery.uri=http://192.168.1.51:8080
query.max-memory-per-node=16GB
query.max-memory=100GB
query.max-total-memory-per-node=20GB

# 配置jvm.config
# vi /data/trino/etc/jvm.config

-server
-Xmx32G
-XX:InitialRAMPercentage=80
-XX:MaxRAMPercentage=80
-XX:G1HeapRegionSize=32M
-XX:+ExplicitGCInvokesConcurrent
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:-OmitStackTraceInFastThrow
-XX:ReservedCodeCacheSize=512M
-XX:PerMethodRecompilationCutoff=10000
-XX:PerBytecodeRecompilationCutoff=10000
-Djdk.attach.allowAttachSelf=true
-Djdk.nio.maxCachedBufferSize=2000000

# 配置node.properties
# vi /data/trino/etc/node.properties

node.environment=fgedu_trino
node.id=trino-worker-01
node.data-dir=/data/trino/data

# 配置log.properties
# vi /data/trino/etc/log.properties

io.trino=INFO

3.5 创建systemd服务

# 创建服务文件
# vi /etc/systemd/system/trino.service

[Unit]
Description=Trino Server
After=network.target

[Service]
Type=simple
User=trino
Group=trino
ExecStart=/data/trino/bin/launcher run
ExecStop=/data/trino/bin/launcher stop
LimitNOFILE=65535
LimitNPROC=65535

[Install]
WantedBy=multi-user.target

# 重载systemd
# systemctl daemon-reload

# 启动Trino
# systemctl start trino

# 设置开机自启
# systemctl enable trino

# 检查状态
# systemctl status trino

# 输出示例:
● trino.service – Trino Server
Loaded: loaded (/etc/systemd/system/trino.service; enabled; vendor preset: disabled)
Active: active (running) since Sat 2026-04-04 10:00:00 CST; 10s ago
Main PID: 12345 (java)
Tasks: 50 (limit: 49134)
Memory: 2.0G
CGroup: /system.slice/trino.service
└─12345 /usr/lib/jvm/java-17-openjdk/bin/java …

# 检查端口
# netstat -tlnp | grep 8080

# 输出示例:
tcp6 0 0 :::8080 :::* LISTEN 12345/java

风哥提示:Trino要求Java 17或更高版本。Coordinator和Worker节点配置略有不同,注意区分。

4. Trino参数配置

Trino参数配置是性能优化的关键步骤,直接影响查询性能。更多学习教程公众号风哥教程itpux_com

4.1 Coordinator配置详解

# Coordinator配置
# vi /data/trino/etc/config.properties

# 基本配置
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
discovery.uri=http://192.168.1.51:8080

# 内存配置
query.max-memory-per-node=8GB
query.max-memory=50GB
query.max-total-memory-per-node=10GB

# 查询配置
query.max-length=1000000
query.max-history=100
query.client.timeout=10m
query.execution-timeout=24h

# 调度配置
scheduler.http-request-max-content-length=32MB
scheduler.min-candidates=10
scheduler.max-splits-per-node=256

# 并发配置
query.max-concurrent-queries=100
query.queue-config-file=/data/trino/etc/queue.json

# 输出配置
query.max-output-positions=1000000

# 重载配置
# systemctl restart trino

4.2 Worker配置详解

# Worker配置
# vi /data/trino/etc/config.properties

# 基本配置
coordinator=false
http-server.http.port=8080
discovery.uri=http://192.168.1.51:8080

# 内存配置
query.max-memory-per-node=16GB
query.max-memory=100GB
query.max-total-memory-per-node=20GB

# 任务配置
task.concurrency=16
task.http-response-threads=50
task.http-request-threads=100

# 输出配置
task.max-partial-aggregation-memory=16MB
task.writer-count=4

# 交换配置
exchange.client-threads=25
exchange.concurrent-request-multiplier=3

# 重载配置
# systemctl restart trino

4.3 内存配置优化

# JVM内存配置
# vi /data/trino/etc/jvm.config

-server
-Xmx32G
-XX:InitialRAMPercentage=80
-XX:MaxRAMPercentage=80
-XX:G1HeapRegionSize=32M
-XX:+ExplicitGCInvokesConcurrent
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/log/trino
-XX:-OmitStackTraceInFastThrow
-XX:ReservedCodeCacheSize=512M
-XX:PerMethodRecompilationCutoff=10000
-XX:PerBytecodeRecompilationCutoff=10000
-Djdk.attach.allowAttachSelf=true
-Djdk.nio.maxCachedBufferSize=2000000
-Dio.trino.memory.heap-headroom=2GB

# 内存参数说明:
# Xmx:最大堆内存,建议设置为物理内存的70-80%
# G1HeapRegionSize:G1区域大小,建议32M
# HeapDumpPath:OOM时堆转储路径

# 重启服务
# systemctl restart trino

生产环境建议:生产环境建议Coordinator内存设置为物理内存的50%,Worker设置为80%。query.max-memory-per-node不应超过堆内存的50%。

5. 数据源配置

Trino支持多种数据源连接器,本节介绍常用的数据源配置方法。from:www.itpux.com

5.1 Hive连接器配置

# 配置Hive连接器
# vi /data/trino/etc/catalog/hive.properties

connector.name=hive
hive.metastore.uri=thrift://192.168.1.51:9083
hive.allow-drop-table=true
hive.allow-rename-table=true
hive.allow-add-column=true
hive.allow-drop-column=true
hive.max-partitions-per-scan=1000000
hive.max-partitions-for-eager-load=1000

# 配置S3支持
hive.s3.endpoint=http://192.168.1.51:9000
hive.s3.aws-access-key=minioadmin
hive.s3.aws-secret-key=minioadmin
hive.s3.path-style-access=true

# 重启服务
# systemctl restart trino

# 测试连接
$ /data/trino/bin/trino –server http://192.168.1.51:8080

trino> SHOW CATALOGS;
Catalog
———
hive
jmx
system

trino> USE hive.default;
USE

trino> SHOW TABLES;
Table
——-
fgedu_users
fgedu_orders

5.2 MySQL连接器配置

# 配置MySQL连接器
# vi /data/trino/etc/catalog/mysql.properties

connector.name=mysql
connection-url=jdbc:mysql://192.168.1.51:3306
connection-user=root
connection-password=fgedu123
case-insensitive-name-matching=true
case-insensitive-name-matching.cache-ttl=1h

# 连接池配置
mysql.connection-pool.enabled=true
mysql.connection-pool.max-size=20
mysql.connection-pool.min-idle=5

# 重启服务
# systemctl restart trino

# 测试连接
$ /data/trino/bin/trino –server http://192.168.1.51:8080

trino> SHOW SCHEMAS IN mysql;
Schema
——————–
information_schema
fgedudb
mysql
performance_schema

trino> USE mysql.fgedudb;
USE

trino> SHOW TABLES;
Table
————-
fgedu_users
fgedu_orders

5.3 PostgreSQL连接器配置

# 配置PostgreSQL连接器
# vi /data/trino/etc/catalog/postgres.properties

connector.name=postgresql
connection-url=jdbc:postgresql://192.168.1.51:5432/fgedudb
connection-user=postgres
connection-password=fgedu123
case-insensitive-name-matching=true

# 连接池配置
postgresql.connection-pool.enabled=true
postgresql.connection-pool.max-size=20

# 重启服务
# systemctl restart trino

# 测试连接
$ /data/trino/bin/trino –server http://192.168.1.51:8080

trino> SHOW SCHEMAS IN postgres;
Schema
—————-
information_schema
public

5.4 Kafka连接器配置

# 配置Kafka连接器
# vi /data/trino/etc/catalog/kafka.properties

connector.name=kafka
kafka.nodes=192.168.1.51:9092,192.168.1.52:9092,192.168.1.53:9092
kafka.table-description-dir=/data/trino/etc/kafka
kafka.default-schema=default
kafka.hide-internal-columns=false

# 创建表描述文件
# mkdir -p /data/trino/etc/kafka
# vi /data/trino/etc/kafka/fgedu_events.json

{
“tableName”: “fgedu_events”,
“schemaName”: “default”,
“topicName”: “fgedu-events”,
“key”: {
“dataFormat”: “json”,
“fields”: [
{“name”: “event_id”, “type”: “VARCHAR”, “mapping”: “id”}
]
},
“message”: {
“dataFormat”: “json”,
“fields”: [
{“name”: “event_type”, “type”: “VARCHAR”, “mapping”: “type”},
{“name”: “event_time”, “type”: “TIMESTAMP”, “mapping”: “time”},
{“name”: “user_id”, “type”: “BIGINT”, “mapping”: “userId”}
]
}
}

# 重启服务
# systemctl restart trino

# 测试连接
$ /data/trino/bin/trino –server http://192.168.1.51:8080

trino> SELECT * FROM kafka.default.fgedu_events LIMIT 10;

风哥提示:Trino连接器配置灵活,可以同时连接多种数据源实现联邦查询。生产环境建议配置连接池优化性能。

6. 查询优化

Trino查询优化是提升性能的关键,本节介绍常用的优化方法。更多学习教程www.fgedu.net.cn

6.1 查询示例

# 连接Trino
$ /data/trino/bin/trino –server http://192.168.1.51:8080

# 基本查询
trino> SELECT * FROM hive.default.fgedu_users LIMIT 10;

# 输出示例:
id | name | email | created_at
—-+————+————-+————
1 | user001 | user@fgedu | 2026-01-01
2 | user002 | user@fgedu | 2026-01-02

# 联邦查询(跨数据源)
trino> SELECT
-> h.user_id, h.order_amount, m.user_name
-> FROM hive.default.fgedu_orders h
-> JOIN mysql.fgedudb.fgedu_users m ON h.user_id = m.id
-> WHERE h.order_date >= DATE ‘2026-01-01’
-> LIMIT 100;

# 输出示例:
user_id | order_amount | user_name
———+————–+———–
1 | 100.00 | user001
2 | 200.00 | user002

# 聚合查询
trino> SELECT
-> DATE_TRUNC(‘day’, order_date) as day,
-> COUNT(*) as order_count,
-> SUM(order_amount) as total_amount
-> FROM hive.default.fgedu_orders
-> WHERE order_date >= DATE ‘2026-01-01’
-> GROUP BY DATE_TRUNC(‘day’, order_date)
-> ORDER BY day DESC;

# 输出示例:
day | order_count | total_amount
———–+————-+————–
2026-04-03 | 1000 | 100000.00
2026-04-02 | 950 | 95000.00

6.2 查询计划分析

# 查看查询计划
trino> EXPLAIN SELECT * FROM hive.default.fgedu_users WHERE id > 100;

# 输出示例:
Query Plan
——————————————————————————————–
– Output[id] => [id, name, email, created_at]
– RemoteExchange[source] => [id, name, email, created_at]
– ScanFilter[table, hive:fgedu_users] => [id, name, email, created_at]
Filter[id > $operator$cast_bigint(100)]
TableScan[TableHandle {connectorId=’hive’, …}]

# 查看详细查询计划
trino> EXPLAIN ANALYZE SELECT * FROM hive.default.fgedu_users WHERE id > 100;

# 输出示例:
Query Plan
——————————————————————————————–
Trino version: 430
Query: SELECT * FROM hive.default.fgedu_users WHERE id > 100
CPU: 0.5s, Input: 1000 rows (50KB), Output: 500 rows (25KB)
Peak memory: 100MB

– Output[id] => [id, name, email, created_at]
CPU: 0.1s, Output: 500 rows
– RemoteExchange[source] => [id, name, email, created_at]
CPU: 0.2s, Output: 500 rows
– ScanFilter[table, hive:fgedu_users] => [id, name, email, created_at]
CPU: 0.2s, Input: 1000 rows, Output: 500 rows
Filter[id > $operator$cast_bigint(100)]
TableScan[TableHandle {connectorId=’hive’, …}]

6.3 性能优化建议

# 优化建议

1. 分区裁剪
trino> SELECT * FROM hive.default.fgedu_orders
-> WHERE order_date >= DATE ‘2026-01-01’ AND order_date < DATE '2026-02-01'; 2. 列裁剪 trino> SELECT user_id, order_amount FROM hive.default.fgedu_orders;

3. 使用分区表
CREATE TABLE hive.default.fgedu_orders_partitioned (
order_id BIGINT,
user_id BIGINT,
order_amount DECIMAL(10,2),
order_date DATE
) WITH (
partitioned_by = ARRAY[‘order_date’]
);

4. 合理使用JOIN
trino> SELECT /*+ BROADCAST(m) */
-> h.user_id, h.order_amount, m.user_name
-> FROM hive.default.fgedu_orders h
-> JOIN mysql.fgedudb.fgedu_users m ON h.user_id = m.id;

5. 使用物化视图
CREATE MATERIALIZED VIEW hive.default.fgedu_orders_summary AS
SELECT
DATE_TRUNC(‘day’, order_date) as day,
COUNT(*) as order_count,
SUM(order_amount) as total_amount
FROM hive.default.fgedu_orders
GROUP BY DATE_TRUNC(‘day’, order_date);

生产环境建议:生产环境建议使用分区表、列式存储格式(ORC/Parquet)。合理配置query.max-memory避免OOM。

7. 安全配置

Trino提供完善的安全功能,本节介绍常用的安全配置方法。学习交流加群风哥微信: itpux-com

7.1 密码认证配置

# 配置密码认证
# vi /data/trino/etc/config.properties

http-server.authentication.type=PASSWORD

# 创建密码文件
# vi /data/trino/etc/password-authenticator.properties

password-authenticator.name=file
file.password-file=/data/trino/etc/password.db

# 生成密码文件
# java -jar /data/trino/plugin/trino-password-authenticator-430.jar \
–generate-password fgedu fgedu123 >> /data/trino/etc/password.db

# 输出示例:
fgedu:$2a$10$N9qo8uLOickgx2ZMRZoMy…

# 重启服务
# systemctl restart trino

# 测试认证
$ /data/trino/bin/trino –server http://192.168.1.51:8080 –user fgedu –password
Password:

7.2 LDAP认证配置

# 配置LDAP认证
# vi /data/trino/etc/config.properties

http-server.authentication.type=PASSWORD

# 创建LDAP配置
# vi /data/trino/etc/password-authenticator.properties

password-authenticator.name=ldap
ldap.url=ldap://192.168.1.51:389
ldap.bind-dn=cn=admin,dc=fgedu,dc=net
ldap.bind-password=fgedu123
ldap.user-base-dn=ou=users,dc=fgedu,dc=net
ldap.user-search-filter=(uid={0})
ldap.group-base-dn=ou=groups,dc=fgedu,dc=net
ldap.group-search-filter=(member={0})

# 重启服务
# systemctl restart trino

7.3 权限控制配置

# 配置权限控制
# vi /data/trino/etc/config.properties

access-control.enabled=true

# 创建权限配置文件
# vi /data/trino/etc/access-control.properties

access-control.name=file
security.config-file=/data/trino/etc/rules.json
security.refresh-period=1m

# 创建规则文件
# vi /data/trino/etc/rules.json

{
“catalogs”: [
{
“user”: “admin”,
“catalog”: “hive”,
“allow”: “all”
},
{
“user”: “analyst”,
“catalog”: “hive”,
“allow”: “read-only”
},
{
“user”: “analyst”,
“catalog”: “mysql”,
“allow”: “read-only”
}
],
“schemas”: [
{
“user”: “analyst”,
“schema”: “hive.default”,
“owner”: false
}
]
}

# 重启服务
# systemctl restart trino

风哥提示:生产环境建议配置认证和权限控制。可以使用LDAP集成企业用户管理系统。

8. 监控与运维

Trino提供完善的监控和管理功能,本节介绍常用的运维方法。更多学习教程公众号风哥教程itpux_com

8.1 Web UI监控

# 访问Web UI
http://192.168.1.51:8080

# Web UI功能:
1. 查询列表:显示正在运行和最近完成的查询
2. 查询详情:显示查询计划、执行统计
3. 集群状态:显示Worker节点状态
4. 内存使用:显示各节点内存使用情况

# 查看查询详情
点击查询ID查看详细执行计划

# 输出示例:
Query ID: 20260404_100000_00001_xxxxx
State: FINISHED
Query: SELECT * FROM hive.default.fgedu_users LIMIT 10
CPU Time: 0.5s
Wall Time: 1.0s
Peak Memory: 100MB
Input: 1000 rows (50KB)
Output: 10 rows (500B)

8.2 JMX监控

# 查询JMX指标
$ /data/trino/bin/trino –server http://192.168.1.51:8080

trino> SELECT * FROM jmx.current.”io.trino.execution:name=querymanager”;

# 输出示例:
node | name | value
——————————————————-+———–+——-
trino-coordinator-01 | QueryManager | 100

# 查看内存使用
trino> SELECT * FROM jmx.current.”java.lang:type=Memory”;

# 输出示例:
node | heap_memory_used | heap_memory_max
——————————————————-+——————+—————-
trino-coordinator-01 | 8589934592 | 17179869184

# 查看线程数
trino> SELECT * FROM jmx.current.”java.lang:type=Threading”;

# 输出示例:
node | thread_count | peak_thread_count
——————————————————-+————–+——————-
trino-coordinator-01 | 200 | 250

8.3 日志管理

# 查看日志
$ tail -f /data/trino/data/var/log/server.log

# 输出示例:
2026-04-04T10:00:00.000Z INFO main io.trino.server.Server Trino 430
2026-04-04T10:00:00.000Z INFO main io.trino.server.Server JVM: OpenJDK 64-Bit Server VM 17.0.8
2026-04-04T10:00:00.000Z INFO main io.trino.server.Server Memory: 32GB
2026-04-04T10:00:00.000Z INFO main io.trino.server.Server Number of processors: 16

# 查看查询日志
$ grep “Query” /data/trino/data/var/log/server.log | tail -20

# 输出示例:
2026-04-04T10:00:00.000Z INFO query-12345 io.trino.execution.QueryManager Query 20260404_100000_00001_xxxxx finished
2026-04-04T10:00:00.000Z INFO query-12346 io.trino.execution.QueryManager Query 20260404_100001_00002_xxxxx finished

# 配置日志级别
# vi /data/trino/etc/log.properties

io.trino=INFO
io.trino.execution=DEBUG
io.trino.plugin.hive=DEBUG

# 重启服务
# systemctl restart trino

9. 升级与迁移

Trino升级和迁移是运维工作中的重要环节,需要仔细规划和执行。from:www.itpux.com

9.1 版本升级

# 查看当前版本
$ /data/trino/bin/trino –version
Trino 420

# 备份配置
# cp -r /data/trino/etc /backup/trino_etc_$(date +%Y%m%d)

# 下载新版本
# cd /data
# wget https://repo1.maven.org/maven2/io/trino/trino-server/430/trino-server-430.tar.gz

# 停止服务
# systemctl stop trino

# 备份旧版本
# mv /data/trino /data/trino-420

# 解压新版本
# tar -xzf trino-server-430.tar.gz
# mv trino-server-430 trino

# 恢复配置
# cp -r /data/trino-420/etc/* /data/trino/etc/

# 设置权限
# chown -R trino:trino /data/trino

# 启动服务
# systemctl start trino

# 验证版本
$ /data/trino/bin/trino –version
Trino 430

9.2 配置迁移

# 备份配置
# tar -czf trino_backup_$(date +%Y%m%d).tar.gz \
/data/trino/etc \
/data/trino/data

# 迁移到新服务器
# scp trino_backup_*.tar.gz root@newserver:/backup/

# 在新服务器解压
# tar -xzf trino_backup_*.tar.gz -C /

# 修改node.id
# vi /data/trino/etc/node.properties
node.id=trino-coordinator-02

# 启动服务
# systemctl start trino

生产环境建议:升级前必须进行完整备份。Trino配置向后兼容,但建议先在测试环境验证。

10. 生产环境实战案例

本节提供一个完整的生产环境配置案例,帮助读者更好地理解Trino的实际应用。更多学习教程www.fgedu.net.cn

10.1 生产环境完整配置

# Coordinator配置
# vi /data/trino/etc/config.properties

coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
discovery.uri=http://192.168.1.51:8080

query.max-memory-per-node=16GB
query.max-memory=200GB
query.max-total-memory-per-node=20GB
query.max-concurrent-queries=200
query.queue-config-file=/data/trino/etc/queue.json

http-server.authentication.type=PASSWORD
access-control.enabled=true

# Worker配置
# vi /data/trino/etc/config.properties

coordinator=false
http-server.http.port=8080
discovery.uri=http://192.168.1.51:8080

query.max-memory-per-node=32GB
query.max-memory=400GB
query.max-total-memory-per-node=40GB

task.concurrency=32
task.http-response-threads=100
task.http-request-threads=200
exchange.client-threads=50

# JVM配置
# vi /data/trino/etc/jvm.config

-server
-Xmx64G
-XX:InitialRAMPercentage=80
-XX:MaxRAMPercentage=80
-XX:G1HeapRegionSize=32M
-XX:+ExplicitGCInvokesConcurrent
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/log/trino
-XX:-OmitStackTraceInFastThrow
-XX:ReservedCodeCacheSize=512M
-Djdk.attach.allowAttachSelf=true
-Djdk.nio.maxCachedBufferSize=2000000
-Dio.trino.memory.heap-headroom=4GB

# 数据源配置
# vi /data/trino/etc/catalog/hive.properties

connector.name=hive
hive.metastore.uri=thrift://192.168.1.51:9083
hive.allow-drop-table=true
hive.allow-rename-table=true
hive.max-partitions-per-scan=1000000

10.2 高可用配置

# 多Coordinator配置
# 在多个节点部署Coordinator,使用负载均衡

# Nginx负载均衡配置
# vi /etc/nginx/conf.d/trino.conf

upstream trino_coordinators {
least_conn;
server 192.168.1.51:8080;
server 192.168.1.52:8080 backup;
}

server {
listen 80;
server_name trino.fgedu.net.cn;

location / {
proxy_pass http://trino_coordinators;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
}

# 重载Nginx
# nginx -s reload

10.3 性能调优实战

# 执行测试查询
trino> SELECT
-> COUNT(*) as total_rows,
-> SUM(order_amount) as total_amount
-> FROM hive.default.fgedu_orders
-> WHERE order_date >= DATE ‘2026-01-01’;

# 输出示例:
total_rows | total_amount
————+————–
1000000 | 10000000.00

(1 row)
Query 20260404_100000_00001_xxxxx, FINISHED, 3 nodes
Splits: 100 total, 100 done (100.00%)
CPU: 10.0s, Input: 1.0M rows (50MB), Output: 1 row (30B)
Peak memory: 500MB

# 查看查询统计
trino> SELECT
-> query_id,
-> state,
-> cpu_time_millis,
-> wall_time_millis,
-> peak_memory_bytes
-> FROM system.runtime.queries
-> ORDER BY created DESC
-> LIMIT 10;

# 输出示例:
query_id | state | cpu_time_millis | wall_time_millis | peak_memory_bytes
—————————+———-+—————–+——————+——————-
20260404_100000_00001_xxxxx | FINISHED | 10000 | 5000 | 524288000

风哥提示:Trino作为高性能分布式SQL查询引擎,适合大规模数据分析场景。生产环境建议配置多Worker节点,合理分配内存资源。

本文由风哥教程整理发布,仅用于学习测试使用,转载注明出处:http://www.fgedu.net.cn/10327.html

联系我们

在线咨询:点击这里给我发消息

微信号:itpux-com

工作日:9:30-18:30,节假日休息