zl程序教程

您现在的位置是:首页 >  其他

当前栏目

clickhouse--玩转行列转换

转换 -- 玩转 ClickHouse 行列
2023-09-11 14:16:24 时间

在进行数据分析时,时常会遇到行转列、列转行的查询需求。
Clickhouse是一款功能强大的分析型数据库,提供了多种方法进行行列的转换,能完美支撑你的查询需求。

行转列

将表中的行转换为列,如学生的成绩表,每行表示表示一科的成绩,
如果想将学生所有科的成绩在一行展示,就需要用到行转列,示意如下:
在这里插入图片描述

普通写法(case when)

SELECT student_name,
	SUM(CASE WHEN subject='语文' THEN score ELSE 0 END) as Chinese,
	SUM(CASE WHEN subject='数学' THEN score ELSE 0 END) as Math,
	SUM(CASE WHEN subject='英语' THEN score ELSE 0 END) as English,
	SUM(CASE WHEN subject='特长加分' THEN score ELSE 0 END) as Special
FROM t_gaokao_score 
GROUP BY student_name;

基于multiIf / if的写法

因为每个学生每科成绩只有一行记录,所以聚合函数使用max、sum的效果是一样的。
因为只有一个判断条件,所以使用multiIf、if效果是一样的。
可以使用rollup子句进行每科成绩的聚合

SELECT student_name,
	max(if(equals(subject,'语文'),score,0)) as Chinese,
	max(if(equals(subject,'数学'),score,0)) as Math,
	sum(multiIf(equals(subject,'英语'),score,0)) as English,
	sum(multiIf(equals(subject,'特长加分'),score,0)) as Special
FROM t_gaokao_score 
GROUP BY student_name;


SELECT student_name,
	sum(if(equals(subject,'语文'),score,0)) as Chinese,
	sum(if(equals(subject,'数学'),score,0)) as Math,
	sum(multiIf(equals(subject,'英语'),score,0)) as English,
	sum(multiIf(equals(subject,'特长加分'),score,0)) as Special,
	sum(score) as Total
FROM t_gaokao_score 
GROUP BY student_name with ROLLUP ;

基于sumIf的写法

-If 是一种组合的聚合函数,其前缀可以是任意一个普通的聚合函数,如argMaxIf、countIf

SELECT student_name,
	sumIf(score,equals(subject,'语文')) as Chinese,
	sumIf(score,equals(subject,'数学')) as Math,
	sumIf(score,equals(subject,'英语')) as English,
	sumIf(score,equals(subject,'特长加分')) as Special,
	sum(score) as Total
FROM t_gaokao_score 
GROUP BY student_name;

测试数据

CREATE TABLE t_gaokao_score  (
  id UInt32 NOT NULL,
  student_name String,
  subject String,
  score Float64
) ENGINE = MergeTree()
ORDER BY id;

INSERT INTO t_gaokao_score VALUES 
(1, '林磊儿', '语文', 148),
(2, '林磊儿', '数学', 150),
(3, '林磊儿', '英语', 147),
(4, '乔英子', '语文', 121),
(5, '乔英子', '数学', 106),
(6, '乔英子', '英语', 146),
(7, '方一凡', '语文', 70),
(8, '方一凡', '数学', 90),
(9, '方一凡', '英语', 59),
(10, '方一凡', '特长加分', 200),
(11, '陈哈哈', '语文', 109),
(12, '陈哈哈', '数学', 92),
(13, '陈哈哈', '英语', 80);

列转行

将表中的列转换为行,示意如下:
在这里插入图片描述

普通写法(uion all)

SELECT id,a AS newCol FROM test UNION ALL
SELECT id,b AS newCol FROM test UNION ALL
SELECT id,c AS newCol FROM test

基于arrayJoin的写法

SELECT 
    id, 
    arrayJoin(arrayConcat(groupArray(a), groupArray(b), groupArray(c))) AS newCol
FROM test
GROUP BY id
order by newCol

测试数据

CREATE TABLE test(
  id UInt64,
  a String,
  b String,
  c String
)ENGINE = MergeTree()
ORDER BY id

INSERT INTO TABLE test 
WITH(
  SELECT ['A','B','C']
)AS dict
SELECT number,concat(dict[1],toString(number)),concat(dict[2],toString(number)),concat(dict[3],toString(number)) FROM numbers(100000)