内容简介:本文主要介绍MariaDB数据仓库与分析的方法与实践,包括数据仓库的基本概念、数据仓库的架构、数据分析的基本概念等内容。通过实际案例讲解数据仓库构建、ETL过程实现和数据分析,帮助读者掌握MariaDB数据仓库与分析的技能。风哥教程参考MariaDB官方文档和相关数据仓库最佳实践。
Part01-基础概念与理论知识
1.1 数据仓库的基本概念
数据仓库(Data Warehouse)是一个面向主题的、集成的、非易失的、随时间变化的数据集合,用于支持管理决策。
数据仓库的主要特点:
- 面向主题:数据仓库围绕业务主题组织数据,如销售、库存、客户等
- 集成性:数据仓库集成来自多个数据源的数据
- 非易失性:数据仓库中的数据一旦加载,很少修改
- 随时间变化:数据仓库中的数据会随着时间的推移而变化
1.2 数据仓库的架构
数据仓库的典型架构包括:
- 数据源:原始数据来源,如业务系统、日志文件等
- ETL过程:提取(Extract)、转换(Transform)、加载(Load)过程,将数据从数据源提取,转换为适合分析的格式,加载到数据仓库中
- 数据仓库:存储整合后的数据
- OLAP工具:联机分析处理工具,用于数据分析
- 前端工具:用于数据可视化和报表生成
1.3 数据分析的基本概念
数据分析(Data Analysis)是指使用统计方法和工具对数据进行分析,以提取有价值的信息和 insights。
数据分析的主要方法:
- 描述性分析:描述数据的基本特征,如均值、中位数、标准差等
- 诊断性分析:分析数据的原因和关系
- 预测性分析:预测未来的趋势和结果
- 规范性分析:提供最优决策建议
更多视频教程www.fgedu.net.cn
Part02-生产环境规划与建议
2.1 数据仓库规划
数据仓库规划建议:
- 业务需求分析:分析业务需求,确定数据仓库的主题和范围
- 数据源分析:分析数据源的类型、结构和质量
- 数据模型设计:设计数据仓库的数据模型,如星型模型、雪花模型等
- ETL流程设计:设计ETL流程,确保数据的正确提取、转换和加载
- 存储规划:规划数据仓库的存储结构和容量
2.2 数据模型设计
数据模型设计建议:
- 星型模型:以事实表为中心,周围环绕维度表,适合简单查询
- 雪花模型:星型模型的扩展,维度表可以进一步细分为子维度表,适合复杂查询
- 事实表设计:包含度量值和外键,指向维度表
- 维度表设计:包含描述性属性,用于过滤和分组
2.3 性能优化建议
性能优化建议:
- 分区表:使用分区表提高查询性能
- 索引优化:为常用查询创建合适的索引
- 物化视图:使用物化视图预计算聚合数据
- 并行处理:使用并行查询提高处理速度
- 缓存策略:使用查询缓存或应用层缓存
学习交流加群风哥微信: itpux-com
Part03-生产环境项目实施方案
3.1 数据仓库构建
更多学习教程公众号风哥教程itpux_com
# 数据仓库构建
MariaDB [(none)]> # 1. 创建数据仓库数据库
CREATE DATABASE fgedu_data_warehouse;
# 2. 创建维度表
# 时间维度表
CREATE TABLE fgedu_dim_time (
time_id INT PRIMARY KEY,
date DATE,
year INT,
month INT,
day INT,
quarter INT,
week INT
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 产品维度表
CREATE TABLE fgedu_dim_product (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50),
brand VARCHAR(50)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 客户维度表
CREATE TABLE fgedu_dim_customer (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
city VARCHAR(50),
country VARCHAR(50)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 3. 创建事实表
CREATE TABLE fgedu_fact_sales (
sales_id INT PRIMARY KEY,
time_id INT,
product_id INT,
customer_id INT,
quantity INT,
amount DECIMAL(10,2),
FOREIGN KEY (time_id) REFERENCES fgedu_dim_time(time_id),
FOREIGN KEY (product_id) REFERENCES fgedu_dim_product(product_id),
FOREIGN KEY (customer_id) REFERENCES fgedu_dim_customer(customer_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 4. 创建索引
CREATE INDEX idx_time_id ON fgedu_fact_sales(time_id);
CREATE INDEX idx_product_id ON fgedu_fact_sales(product_id);
CREATE INDEX idx_customer_id ON fgedu_fact_sales(customer_id);
MariaDB [(none)]> # 1. 创建数据仓库数据库
CREATE DATABASE fgedu_data_warehouse;
# 2. 创建维度表
# 时间维度表
CREATE TABLE fgedu_dim_time (
time_id INT PRIMARY KEY,
date DATE,
year INT,
month INT,
day INT,
quarter INT,
week INT
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 产品维度表
CREATE TABLE fgedu_dim_product (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50),
brand VARCHAR(50)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 客户维度表
CREATE TABLE fgedu_dim_customer (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
city VARCHAR(50),
country VARCHAR(50)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 3. 创建事实表
CREATE TABLE fgedu_fact_sales (
sales_id INT PRIMARY KEY,
time_id INT,
product_id INT,
customer_id INT,
quantity INT,
amount DECIMAL(10,2),
FOREIGN KEY (time_id) REFERENCES fgedu_dim_time(time_id),
FOREIGN KEY (product_id) REFERENCES fgedu_dim_product(product_id),
FOREIGN KEY (customer_id) REFERENCES fgedu_dim_customer(customer_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 4. 创建索引
CREATE INDEX idx_time_id ON fgedu_fact_sales(time_id);
CREATE INDEX idx_product_id ON fgedu_fact_sales(product_id);
CREATE INDEX idx_customer_id ON fgedu_fact_sales(customer_id);
3.2 ETL过程实现
# ETL过程实现
MariaDB [(none)]> # 1. 提取数据
# 从业务系统提取数据
SELECT * FROM fgedu_orders;
SELECT * FROM fgedu_order_items;
SELECT * FROM fgedu_products;
SELECT * FROM fgedu_customers;
# 2. 转换数据
# 转换时间数据
INSERT INTO fgedu_dim_time(time_id, date, year, month, day, quarter, week)
SELECT
DATE_FORMAT(order_date, ‘%Y%m%d’) AS time_id,
order_date AS date,
YEAR(order_date) AS year,
MONTH(order_date) AS month,
DAY(order_date) AS day,
QUARTER(order_date) AS quarter,
WEEK(order_date) AS week
FROM fgedu_orders
GROUP BY order_date;
# 转换产品数据
INSERT INTO fgedu_dim_product(product_id, product_name, category, brand)
SELECT id, name, category, brand FROM fgedu_products;
# 转换客户数据
INSERT INTO fgedu_dim_customer(customer_id, customer_name, city, country)
SELECT id, name, city, country FROM fgedu_customers;
# 3. 加载数据到事实表
INSERT INTO fgedu_fact_sales(sales_id, time_id, product_id, customer_id, quantity, amount)
SELECT
o.id AS sales_id,
DATE_FORMAT(o.order_date, ‘%Y%m%d’) AS time_id,
oi.product_id,
o.customer_id,
oi.quantity,
oi.quantity * p.price AS amount
FROM fgedu_orders o
JOIN fgedu_order_items oi ON o.id = oi.order_id
JOIN fgedu_products p ON oi.product_id = p.id;
MariaDB [(none)]> # 1. 提取数据
# 从业务系统提取数据
SELECT * FROM fgedu_orders;
SELECT * FROM fgedu_order_items;
SELECT * FROM fgedu_products;
SELECT * FROM fgedu_customers;
# 2. 转换数据
# 转换时间数据
INSERT INTO fgedu_dim_time(time_id, date, year, month, day, quarter, week)
SELECT
DATE_FORMAT(order_date, ‘%Y%m%d’) AS time_id,
order_date AS date,
YEAR(order_date) AS year,
MONTH(order_date) AS month,
DAY(order_date) AS day,
QUARTER(order_date) AS quarter,
WEEK(order_date) AS week
FROM fgedu_orders
GROUP BY order_date;
# 转换产品数据
INSERT INTO fgedu_dim_product(product_id, product_name, category, brand)
SELECT id, name, category, brand FROM fgedu_products;
# 转换客户数据
INSERT INTO fgedu_dim_customer(customer_id, customer_name, city, country)
SELECT id, name, city, country FROM fgedu_customers;
# 3. 加载数据到事实表
INSERT INTO fgedu_fact_sales(sales_id, time_id, product_id, customer_id, quantity, amount)
SELECT
o.id AS sales_id,
DATE_FORMAT(o.order_date, ‘%Y%m%d’) AS time_id,
oi.product_id,
o.customer_id,
oi.quantity,
oi.quantity * p.price AS amount
FROM fgedu_orders o
JOIN fgedu_order_items oi ON o.id = oi.order_id
JOIN fgedu_products p ON oi.product_id = p.id;
3.3 数据分析实现
# 数据分析实现
MariaDB [(none)]> # 1. 销售汇总分析
SELECT
d.year,
d.month,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_time d ON f.time_id = d.time_id
GROUP BY d.year, d.month
ORDER BY d.year, d.month;
# 2. 产品销售分析
SELECT
p.product_name,
SUM(f.quantity) AS total_quantity,
SUM(f.amount) AS total_amount
FROM fgedu_fact_sales f
JOIN fgedu_dim_product p ON f.product_id = p.product_id
GROUP BY p.product_name
ORDER BY total_amount DESC;
# 3. 客户销售分析
SELECT
c.customer_name,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_customer c ON f.customer_id = c.customer_id
GROUP BY c.customer_name
ORDER BY total_sales DESC;
# 4. 地区销售分析
SELECT
c.city,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_customer c ON f.customer_id = c.customer_id
GROUP BY c.city
ORDER BY total_sales DESC;
MariaDB [(none)]> # 1. 销售汇总分析
SELECT
d.year,
d.month,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_time d ON f.time_id = d.time_id
GROUP BY d.year, d.month
ORDER BY d.year, d.month;
# 2. 产品销售分析
SELECT
p.product_name,
SUM(f.quantity) AS total_quantity,
SUM(f.amount) AS total_amount
FROM fgedu_fact_sales f
JOIN fgedu_dim_product p ON f.product_id = p.product_id
GROUP BY p.product_name
ORDER BY total_amount DESC;
# 3. 客户销售分析
SELECT
c.customer_name,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_customer c ON f.customer_id = c.customer_id
GROUP BY c.customer_name
ORDER BY total_sales DESC;
# 4. 地区销售分析
SELECT
c.city,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_customer c ON f.customer_id = c.customer_id
GROUP BY c.city
ORDER BY total_sales DESC;
学习交流加群风哥QQ113257174
Part04-生产案例与实战讲解
4.1 数据仓库构建案例
场景描述:构建一个销售数据仓库,用于分析销售情况。
# 数据仓库构建
MariaDB [(none)]> # 1. 创建数据仓库数据库
CREATE DATABASE fgedu_data_warehouse;
USE fgedu_data_warehouse;
# 2. 创建维度表
# 时间维度表
CREATE TABLE fgedu_dim_time (
time_id INT PRIMARY KEY,
date DATE,
year INT,
month INT,
day INT,
quarter INT,
week INT
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 产品维度表
CREATE TABLE fgedu_dim_product (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50),
brand VARCHAR(50)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 客户维度表
CREATE TABLE fgedu_dim_customer (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
city VARCHAR(50),
country VARCHAR(50)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 3. 创建事实表
CREATE TABLE fgedu_fact_sales (
sales_id INT PRIMARY KEY,
time_id INT,
product_id INT,
customer_id INT,
quantity INT,
amount DECIMAL(10,2),
FOREIGN KEY (time_id) REFERENCES fgedu_dim_time(time_id),
FOREIGN KEY (product_id) REFERENCES fgedu_dim_product(product_id),
FOREIGN KEY (customer_id) REFERENCES fgedu_dim_customer(customer_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 4. 创建索引
CREATE INDEX idx_time_id ON fgedu_fact_sales(time_id);
CREATE INDEX idx_product_id ON fgedu_fact_sales(product_id);
CREATE INDEX idx_customer_id ON fgedu_fact_sales(customer_id);
MariaDB [(none)]> # 1. 创建数据仓库数据库
CREATE DATABASE fgedu_data_warehouse;
USE fgedu_data_warehouse;
# 2. 创建维度表
# 时间维度表
CREATE TABLE fgedu_dim_time (
time_id INT PRIMARY KEY,
date DATE,
year INT,
month INT,
day INT,
quarter INT,
week INT
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 产品维度表
CREATE TABLE fgedu_dim_product (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50),
brand VARCHAR(50)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 客户维度表
CREATE TABLE fgedu_dim_customer (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
city VARCHAR(50),
country VARCHAR(50)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 3. 创建事实表
CREATE TABLE fgedu_fact_sales (
sales_id INT PRIMARY KEY,
time_id INT,
product_id INT,
customer_id INT,
quantity INT,
amount DECIMAL(10,2),
FOREIGN KEY (time_id) REFERENCES fgedu_dim_time(time_id),
FOREIGN KEY (product_id) REFERENCES fgedu_dim_product(product_id),
FOREIGN KEY (customer_id) REFERENCES fgedu_dim_customer(customer_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
# 4. 创建索引
CREATE INDEX idx_time_id ON fgedu_fact_sales(time_id);
CREATE INDEX idx_product_id ON fgedu_fact_sales(product_id);
CREATE INDEX idx_customer_id ON fgedu_fact_sales(customer_id);
执行结果:
# 创建数据库
Query OK, 1 row affected (0.00 sec)
# 创建维度表
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.01 sec)
# 创建事实表
Query OK, 0 rows affected (0.01 sec)
# 创建索引
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 1 row affected (0.00 sec)
# 创建维度表
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.01 sec)
# 创建事实表
Query OK, 0 rows affected (0.01 sec)
# 创建索引
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.01 sec)
4.2 ETL过程实现案例
场景描述:实现ETL过程,将业务数据加载到数据仓库中。
# ETL过程实现
MariaDB [(none)]> # 1. 提取数据
# 从业务系统提取数据
USE fgedudb;
SELECT * FROM fgedu_orders LIMIT 10;
SELECT * FROM fgedu_order_items LIMIT 10;
SELECT * FROM fgedu_products LIMIT 10;
SELECT * FROM fgedu_customers LIMIT 10;
# 2. 转换数据
USE fgedu_data_warehouse;
# 转换时间数据
INSERT INTO fgedu_dim_time(time_id, date, year, month, day, quarter, week)
SELECT
DATE_FORMAT(order_date, ‘%Y%m%d’) AS time_id,
order_date AS date,
YEAR(order_date) AS year,
MONTH(order_date) AS month,
DAY(order_date) AS day,
QUARTER(order_date) AS quarter,
WEEK(order_date) AS week
FROM fgedudb.fgedu_orders
GROUP BY order_date;
# 转换产品数据
INSERT INTO fgedu_dim_product(product_id, product_name, category, brand)
SELECT id, name, category, brand FROM fgedudb.fgedu_products;
# 转换客户数据
INSERT INTO fgedu_dim_customer(customer_id, customer_name, city, country)
SELECT id, name, city, country FROM fgedudb.fgedu_customers;
# 3. 加载数据到事实表
INSERT INTO fgedu_fact_sales(sales_id, time_id, product_id, customer_id, quantity, amount)
SELECT
o.id AS sales_id,
DATE_FORMAT(o.order_date, ‘%Y%m%d’) AS time_id,
oi.product_id,
o.customer_id,
oi.quantity,
oi.quantity * p.price AS amount
FROM fgedudb.fgedu_orders o
JOIN fgedudb.fgedu_order_items oi ON o.id = oi.order_id
JOIN fgedudb.fgedu_products p ON oi.product_id = p.id;
MariaDB [(none)]> # 1. 提取数据
# 从业务系统提取数据
USE fgedudb;
SELECT * FROM fgedu_orders LIMIT 10;
SELECT * FROM fgedu_order_items LIMIT 10;
SELECT * FROM fgedu_products LIMIT 10;
SELECT * FROM fgedu_customers LIMIT 10;
# 2. 转换数据
USE fgedu_data_warehouse;
# 转换时间数据
INSERT INTO fgedu_dim_time(time_id, date, year, month, day, quarter, week)
SELECT
DATE_FORMAT(order_date, ‘%Y%m%d’) AS time_id,
order_date AS date,
YEAR(order_date) AS year,
MONTH(order_date) AS month,
DAY(order_date) AS day,
QUARTER(order_date) AS quarter,
WEEK(order_date) AS week
FROM fgedudb.fgedu_orders
GROUP BY order_date;
# 转换产品数据
INSERT INTO fgedu_dim_product(product_id, product_name, category, brand)
SELECT id, name, category, brand FROM fgedudb.fgedu_products;
# 转换客户数据
INSERT INTO fgedu_dim_customer(customer_id, customer_name, city, country)
SELECT id, name, city, country FROM fgedudb.fgedu_customers;
# 3. 加载数据到事实表
INSERT INTO fgedu_fact_sales(sales_id, time_id, product_id, customer_id, quantity, amount)
SELECT
o.id AS sales_id,
DATE_FORMAT(o.order_date, ‘%Y%m%d’) AS time_id,
oi.product_id,
o.customer_id,
oi.quantity,
oi.quantity * p.price AS amount
FROM fgedudb.fgedu_orders o
JOIN fgedudb.fgedu_order_items oi ON o.id = oi.order_id
JOIN fgedudb.fgedu_products p ON oi.product_id = p.id;
执行结果:
# 转换时间数据
Query OK, 100 rows affected (0.01 sec)
Records: 100 Duplicates: 0 Warnings: 0
# 转换产品数据
Query OK, 50 rows affected (0.00 sec)
Records: 50 Duplicates: 0 Warnings: 0
# 转换客户数据
Query OK, 20 rows affected (0.00 sec)
Records: 20 Duplicates: 0 Warnings: 0
# 加载数据到事实表
Query OK, 200 rows affected (0.01 sec)
Records: 200 Duplicates: 0 Warnings: 0
Query OK, 100 rows affected (0.01 sec)
Records: 100 Duplicates: 0 Warnings: 0
# 转换产品数据
Query OK, 50 rows affected (0.00 sec)
Records: 50 Duplicates: 0 Warnings: 0
# 转换客户数据
Query OK, 20 rows affected (0.00 sec)
Records: 20 Duplicates: 0 Warnings: 0
# 加载数据到事实表
Query OK, 200 rows affected (0.01 sec)
Records: 200 Duplicates: 0 Warnings: 0
4.3 数据分析案例
场景描述:使用数据仓库进行销售数据分析。
# 数据分析
MariaDB [(none)]> USE fgedu_data_warehouse;
# 1. 销售汇总分析
SELECT
d.year,
d.month,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_time d ON f.time_id = d.time_id
GROUP BY d.year, d.month
ORDER BY d.year, d.month;
# 2. 产品销售分析
SELECT
p.product_name,
SUM(f.quantity) AS total_quantity,
SUM(f.amount) AS total_amount
FROM fgedu_fact_sales f
JOIN fgedu_dim_product p ON f.product_id = p.product_id
GROUP BY p.product_name
ORDER BY total_amount DESC LIMIT 10;
# 3. 客户销售分析
SELECT
c.customer_name,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_customer c ON f.customer_id = c.customer_id
GROUP BY c.customer_name
ORDER BY total_sales DESC LIMIT 10;
# 4. 地区销售分析
SELECT
c.city,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_customer c ON f.customer_id = c.customer_id
GROUP BY c.city
ORDER BY total_sales DESC LIMIT 10;
MariaDB [(none)]> USE fgedu_data_warehouse;
# 1. 销售汇总分析
SELECT
d.year,
d.month,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_time d ON f.time_id = d.time_id
GROUP BY d.year, d.month
ORDER BY d.year, d.month;
# 2. 产品销售分析
SELECT
p.product_name,
SUM(f.quantity) AS total_quantity,
SUM(f.amount) AS total_amount
FROM fgedu_fact_sales f
JOIN fgedu_dim_product p ON f.product_id = p.product_id
GROUP BY p.product_name
ORDER BY total_amount DESC LIMIT 10;
# 3. 客户销售分析
SELECT
c.customer_name,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_customer c ON f.customer_id = c.customer_id
GROUP BY c.customer_name
ORDER BY total_sales DESC LIMIT 10;
# 4. 地区销售分析
SELECT
c.city,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_customer c ON f.customer_id = c.customer_id
GROUP BY c.city
ORDER BY total_sales DESC LIMIT 10;
执行结果:
# 销售汇总分析
+——+——-+————-+
| year | month | total_sales |
+——+——-+————-+
| 2023 | 1 | 15000.00 |
| 2023 | 2 | 18000.00 |
| 2023 | 3 | 20000.00 |
| 2023 | 4 | 16000.00 |
+——+——-+————-+
# 产品销售分析
+————-+—————+————-+
| product_name | total_quantity | total_amount |
+————-+—————+————-+
| Smartphone | 50 | 5000.00 |
| Laptop | 30 | 3000.00 |
| Tablet | 20 | 2000.00 |
+————-+—————+————-+
# 客户销售分析
+—————+————-+
| customer_name | total_sales |
+—————+————-+
| Customer A | 8000.00 |
| Customer B | 6000.00 |
| Customer C | 4000.00 |
+—————+————-+
# 地区销售分析
+———-+————-+
| city | total_sales |
+———-+————-+
| Beijing | 12000.00 |
| Shanghai | 10000.00 |
| Guangzhou | 8000.00 |
+———-+————-+
+——+——-+————-+
| year | month | total_sales |
+——+——-+————-+
| 2023 | 1 | 15000.00 |
| 2023 | 2 | 18000.00 |
| 2023 | 3 | 20000.00 |
| 2023 | 4 | 16000.00 |
+——+——-+————-+
# 产品销售分析
+————-+—————+————-+
| product_name | total_quantity | total_amount |
+————-+—————+————-+
| Smartphone | 50 | 5000.00 |
| Laptop | 30 | 3000.00 |
| Tablet | 20 | 2000.00 |
+————-+—————+————-+
# 客户销售分析
+—————+————-+
| customer_name | total_sales |
+—————+————-+
| Customer A | 8000.00 |
| Customer B | 6000.00 |
| Customer C | 4000.00 |
+—————+————-+
# 地区销售分析
+———-+————-+
| city | total_sales |
+———-+————-+
| Beijing | 12000.00 |
| Shanghai | 10000.00 |
| Guangzhou | 8000.00 |
+———-+————-+
风哥提示:安全开发是防止SQL注入的第一道防线
Part05-风哥经验总结与分享
5.1 数据仓库最佳实践
风哥提示:在构建数据仓库时,应遵循最佳实践,确保数据仓库的质量和性能。
- 明确业务需求:根据业务需求确定数据仓库的主题和范围
- 设计合理的数据模型:选择合适的数据模型,如星型模型或雪花模型
- 确保数据质量:在ETL过程中进行数据清洗和验证
- 优化ETL流程:确保ETL流程的效率和可靠性
- 定期维护数据仓库:定期更新数据,优化存储结构
5.2 数据分析技巧
- 使用合适的分析方法:根据业务需求选择合适的分析方法
- 利用聚合函数:使用SUM、AVG、COUNT等聚合函数进行数据汇总
- 使用分组和排序:使用GROUP BY和ORDER BY进行数据分组和排序
- 使用窗口函数:使用窗口函数进行高级分析,如排名、移动平均等
- 可视化数据:使用图表和报表可视化分析结果
5.3 常见问题与解决方案
- 数据质量问题:在ETL过程中进行数据清洗和验证,确保数据质量
- 性能问题:使用分区表、索引优化、物化视图等方法提高性能
- 数据更新问题:设计合理的增量更新策略,减少数据加载时间
- 存储问题:使用分区表、压缩等方法减少存储空间
- 分析复杂度问题:使用OLAP工具和前端工具简化分析过程
# 数据仓库与分析示例
— 创建数据仓库数据库
CREATE DATABASE fgedu_data_warehouse;
— 创建维度表
CREATE TABLE fgedu_dim_time (
time_id INT PRIMARY KEY,
date DATE,
year INT,
month INT,
day INT,
quarter INT,
week INT
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
— 创建事实表
CREATE TABLE fgedu_fact_sales (
sales_id INT PRIMARY KEY,
time_id INT,
product_id INT,
customer_id INT,
quantity INT,
amount DECIMAL(10,2),
FOREIGN KEY (time_id) REFERENCES fgedu_dim_time(time_id),
FOREIGN KEY (product_id) REFERENCES fgedu_dim_product(product_id),
FOREIGN KEY (customer_id) REFERENCES fgedu_dim_customer(customer_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
— 数据分析
SELECT
d.year,
d.month,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_time d ON f.time_id = d.time_id
GROUP BY d.year, d.month
ORDER BY d.year, d.month;
— 创建数据仓库数据库
CREATE DATABASE fgedu_data_warehouse;
— 创建维度表
CREATE TABLE fgedu_dim_time (
time_id INT PRIMARY KEY,
date DATE,
year INT,
month INT,
day INT,
quarter INT,
week INT
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
— 创建事实表
CREATE TABLE fgedu_fact_sales (
sales_id INT PRIMARY KEY,
time_id INT,
product_id INT,
customer_id INT,
quantity INT,
amount DECIMAL(10,2),
FOREIGN KEY (time_id) REFERENCES fgedu_dim_time(time_id),
FOREIGN KEY (product_id) REFERENCES fgedu_dim_product(product_id),
FOREIGN KEY (customer_id) REFERENCES fgedu_dim_customer(customer_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
— 数据分析
SELECT
d.year,
d.month,
SUM(f.amount) AS total_sales
FROM fgedu_fact_sales f
JOIN fgedu_dim_time d ON f.time_id = d.time_id
GROUP BY d.year, d.month
ORDER BY d.year, d.month;
通过以上措施,可以有效构建MariaDB数据仓库并进行数据分析,为业务决策提供支持。
from MariaDB视频:www.itpux.com
本文由风哥教程整理发布,仅用于学习测试使用,转载注明出处:http://www.fgedu.net.cn/10327.html
