hive学习(二) hive操作
学习 操作 hive
2023-09-14 09:07:47 时间
1.创建库
create database test;
2.删除库
drop database test;
3.建表
完整ddl建表语法规则
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name -- (Note: TEMPORARY available in Hive 0.14.0 and later) [(col_name data_type [COMMENT col_comment], ... [constraint_specification])] [COMMENT table_comment] [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)] [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS] [SKEWED BY (col_name, col_name, ...) -- (Note: Available in Hive 0.10.0 and later)] ON ((col_value, col_value, ...), (col_value, col_value, ...), ...) [STORED AS DIRECTORIES] [ [ROW FORMAT row_format] [STORED AS file_format] | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later) ] [LOCATION hdfs_path] [TBLPROPERTIES (property_name=property_value, ...)] -- (Note: Available in Hive 0.6.0 and later) [AS select_statement]; -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)
4.hive的数据类型
: primitive_type | array_type | map_type | struct_type :primitive_type |TINYINT | SMALLINT | INT | BIGINT | BOOLEAN | FLOAT | DOUBLE | STRING
5.建表例子:
如下 文件data
对应字段: id,姓名,爱好,住址
1,小明1,lol-book-movie,beijing:changping-shanghai:pudong 2,小明2,lol-book-movie,beijing:changping-shanghai:pudong 3,小明3,lol-book-movie,beijing:changping-shanghai:pudong 4,小明4,lol-book-movie,beijing:changping-shanghai:pudong 5,小明5,lol-movie,beijing:changping-shanghai:pudong 6,小明6,lol-book-movie,beijing:changping-shanghai:pudong 7,小明7,lol-book,beijing:changping-shanghai:pudong 8,小明8,lol-book,beijing:changping-shanghai:pudong 9,小明9,lol-book-movie,beijing:changping-shanghai:pudong id int , 姓名 string,爱好 数组类型,住址 map类型
5.1创建语句分析
create table psn ( id int, name string, hobby array<string>, address map<string,string> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '-' MAP KEYS TERMINATED BY ':' LINES TERMINATED BY '\n';
通过hive建表时不但定义了字段的类型,还定义了字段与字段之间的分隔符,数据类型之间的分隔符,每行数据之间的分隔符。
5.2执行创建
hive> create table psn ( > id int, > name string, > hobby array<string>, > address map<string,string> > ) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY ',' > COLLECTION ITEMS TERMINATED BY '-' > MAP KEYS TERMINATED BY ':' > LINES TERMINATED BY '\n'; OK Time taken: 0.758 seconds
5.3查看结构
hive> desc psn; OK id int name string hobby array<string> address map<string,string> Time taken: 0.267 seconds, Fetched: 4 row(s) #详细查看 hive> desc formatted psn;
5.4插入数据
说插入有点不合适,应该是加载文件(data.txt)中的数据
5.5语法:
LOAD DATA [LOCAL] INPATH 'path' INTO TABLE psn;
5.6执行加载
hive> load data local inpath '/root/data' into table psn; Loading data to table default.psn OK Time taken: 3.591 seconds
发现使用这种方法插入数据简直快的飞起啊。
5.7执行查询
hive> select * from psn; OK 1 小明1 ["lol","book","movie"] {"beijing":"changping","shanghai":"pudong"} 2 小明2 ["lol","book","movie"] {"beijing":"changping","shanghai":"pudong"} 3 小明3 ["lol","book","movie"] {"beijing":"changping","shanghai":"pudong"} 4 小明4 ["lol","book","movie"] {"beijing":"changping","shanghai":"pudong"} 5 小明5 ["lol","movie"] {"beijing":"changping","shanghai":"pudong"} 6 小明6 ["lol","book","movie"] {"beijing":"changping","shanghai":"pudong"} 7 小明7 ["lol","book"] {"beijing":"changping","shanghai":"pudong"} 8 小明8 ["lol","book"] {"beijing":"changping","shanghai":"pudong"} 9 小明9 ["lol","book","movie"] {"beijing":"changping","shanghai":"pudong"} Time taken: 0.235 seconds, Fetched: 9 row(s)
5.8注意,
分隔符的格式如果不匹配它不会报错,而是插入null的空数据。
6.hive内部表和外部表
6.1hive内部表语法
CREATE TABLE [IF NOT EXISTS] table_name
删除表时,元数据与表结构都会被删除
6.2hive外部表语法
CREATE EXTERNAL TABLE [IF NOT EXISTS] table_name LOCATION hdfs_path
删除表时,只删除表结构,不删除元数据。
6.3创建外部表
create external table psn ( id int, name string, hobby array<string>, address map<string,string> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '-' MAP KEYS TERMINATED BY ':' LINES TERMINATED BY '\n' location '/usr/';
创建前 hdfs下的目录 没有 usr目录
6.4执行创建外部表
hive> create external table psn ( > id int, > name string, > hobby array<string>, > address map<string,string> > ) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY ',' > COLLECTION ITEMS TERMINATED BY '-' > MAP KEYS TERMINATED BY ':' > LINES TERMINATED BY '\n' > location '/usr/'; OK Time taken: 0.188 seconds
再次查看网页hdfs下的目录
外部表执行删除只会删除表结构,不删除元数据。
刷新网页usr目录还在
6.5内部表和外部表的区别:
1.创建时需要制定目录
2.外部表执行删除只会删除表结构,不删除元数据。内部表回删除表结构和删除元数据。
相关文章
- 数据库学习之多表操作(三)
- 【mongodb系统学习之八】mongodb shell常用操作
- 学习索引结构的一些案例——Jeff Dean在SystemML会议上发布的论文(中)
- 机器学习(三)混淆矩阵
- BGP网络学习总结
- Scala - 快速学习09 - 函数式编程:一些操作
- Android学习---通过内容提供者(ContentProvider)操作另外一个应用私有数据库的内容
- python操作docx学习资料
- 【学习总结】SQL的学习-2-sql操作
- 机器学习笔记 - 使用CNN和LSTM为图像生成文字描述
- 数学建模学习(24): 排队论模型完整详细讲解,数学与案例结合,lingo软件搭配,数学不好也能学会!
- 解惑Python模块学习,该如何着手操作...
- 不要逼自己学习 Excel 了,用 Python 实现 excel 的14个常用操作!
- Vue学习之--------插槽【默认插槽、具名插槽、作用域插槽】(2022/8/30)
- git的使用学习(六)git的标签管理
- 一脸懵逼学习oracle(图形化界面操作---》PLSQL图形化界面)
- 机器学习——最优化问题:拉格朗日乘子法、KKT条件以及对偶问题
- 二叉树学习——简单入门题
- 信号完整性(SI)电源完整性(PI)学习笔记(七)电感的物理基础(二)
- Python学习笔记之小派读诗
- 【黑马程序员新版Linux学习笔记】Linux系统实用操作命令——操作演示