Reading Files from HDFS

Date: 2023-09-27 14:25:39

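Lab environment
Linux Ubuntu 16.04
Prerequisites:
1) A Java runtime environment is installed
2) A single-node Hadoop deployment is installed

Lab content
With the prerequisites above in place, practice the operations for reading files from HDFS.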

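Lab steps

1. Click "Command Line Terminal" on the desktop to open a new terminal window

2. Start HDFS

To start HDFS, enter the following command in the terminal window: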

/apps/hadoop/sbin/start-dfs.sh

The output is shown below. According to the log, the NameNode, DataNode, and Secondary NameNode daemons are started in turn:

dolphin@tools:~$ /apps/hadoop/sbin/start-dfs.sh 
Starting namenodes on [localhost]
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Starting datanodes
Starting secondary namenodes [tools.hadoop.fs.init]
tools.hadoop.fs.init: Warning: Permanently added 'tools.hadoop.fs.init,172.22.0.2' (ECDSA) to the list of known hosts.

3. Check the HDFS processes

Enter the following command in the terminal window:

jps

The output is shown below; it indicates that the NameNode, DataNode, and SecondaryNameNode processes have started successfully:

dolphin@tools:~$ jps
484 DataNode
663 SecondaryNameNode
375 NameNode
861 Jps
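
The same check can be scripted rather than read by eye. The sketch below is a hypothetical helper (the name check_hdfs_daemons is not part of Hadoop or of this lab) that reads jps-style output on standard input and reports any of the three daemons that are missing. It assumes each line has the form "<pid> <name>", as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: report which expected HDFS daemons are absent from
# jps-style output ("<pid> <name>" per line) read on standard input.
# Returns 0 when all three are present, 1 otherwise.
check_hdfs_daemons() {
    jps_output=$(cat)
    missing=0
    for daemon in NameNode DataNode SecondaryNameNode; do
        # Compare the whole second field, so "NameNode" does not also
        # match the "SecondaryNameNode" line.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "missing: $daemon"
            missing=1
        fi
    done
    return "$missing"
}

# On the lab machine (assumes jps is on the PATH):
#   jps | check_hdfs_daemons && echo "all HDFS daemons are running"
```

With the jps output shown above, the helper prints nothing and returns 0.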

4. Prepare the file to upload

Enter the following command in the terminal window:

ls Desktop/logs/

The output is shown below; the /home/dolphin/Desktop/logs directory contains one log file:

dolphin@tools:~$ ls Desktop/logs/
access.1556972279.log

5. View the file contents

Enter the following command in the terminal window:

cat Desktop/logs/access.1556972279.log

The output is as follows:

dolphin@tools:~$ cat Desktop/logs/access.1556972279.log 
2400,22,vm-914,7,5,49.2,8.9,0.8,10.0,4.9,0.4,4.7,0.3,4.6,4.3,4.8,2.1,0.3,6.9,2019-09-20 23:50:00
7361,22,vm-914,17,0,58.3,1.4,9.4,1.4,10.6,1.7,5.0,9.4,8.2,10.2,5.4,0.2,8.7,3.8,2019-09-20 23:51:00
7363,22,vm-914,9,1,49.7,10.5,6.3,9.5,0.1,10.4,2.8,8.5,4.7,0.2,7.5,5.4,10.7,6.6,2019-09-20 23:52:00
7361,22,vm-914,1,2,41.0,5.4,8.5,8.2,3.2,3.5,0.8,2.5,9.0,4.5,9.6,6.6,2.9,7.2,2019-09-20 23:53:00
7361,22,vm-914,21,4,55.2,3.8,9.9,4.0,6.4,5.6,8.2,6.9,6.5,1.9,5.6,6.3,3.2,2.6,2019-09-20 23:54:00
7361,22,vm-914,4,0,57.5,7.6,1.9,2.3,10.7,9.6,1.2,6.7,10.4,5.9,7.2,2.8,10.7,4.7,2019-09-20 23:55:00
7363,22,vm-914,6,3,41.4,5.4,7.7,3.0,0.5,0.5,4.3,1.5,9.8,3.8,5.9,8.1,6.4,0.1,2019-09-20 23:56:00
7361,22,vm-914,20,0,55.9,6.8,6.7,4.6,10.0,10.3,9.2,0.6,3.3,0.9,9.5,4.1,7.8,7.4,2019-09-20 23:57:00
7361,22,vm-914,9,4,54.4,6.6,4.5,7.4,8.9,4.7,2.2,5.3,7.2,2.7,5.8,3.4,7.8,10.6,2019-09-20 23:58:00
7363,22,vm-914,10,2,44.3,3.6,3.7,0.6,9.6,4.7,4.6,7.0,5.9,6.4,5.0,7.4,7.7,3.7,2019-09-20 23:59:00

6. Create HDFS directories

Enter the following command in the terminal window:

hadoop fs -mkdir /log1 /log2

The command prints nothing on success, as shown below; two directories, log1 and log2, now exist under the HDFS root:

dolphin@tools:~$ hadoop fs -mkdir /log1 /log2
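
Run a second time, the command above reports an error because /log1 and /log2 already exist. A minimal sketch of a rerunnable variant follows. The name ensure_hdfs_dirs is a hypothetical helper, while -test -d and -mkdir -p are standard hadoop fs options.

```shell
#!/bin/sh
# Hypothetical helper: create each HDFS directory only if it is not there yet,
# so the step can be repeated without a "File exists" error.
ensure_hdfs_dirs() {
    for dir in "$@"; do
        if hadoop fs -test -d "$dir"; then
            echo "exists: $dir"
        else
            # -p also creates missing parent directories.
            hadoop fs -mkdir -p "$dir" && echo "created: $dir"
        fi
    done
}

# On the lab machine (assumes hadoop is on the PATH and HDFS is running):
#   ensure_hdfs_dirs /log1 /log2
```

The -test -d call is there only so the helper can report what it did; -mkdir -p on its own already tolerates an existing directory.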

7. Upload the file with the put command

Enter the following command in the terminal window:

hadoop fs -put Desktop/logs/* /log1

The command prints nothing on success, as shown below; the log file in Desktop/logs/ has now been uploaded to the /log1 directory in HDFS:

dolphin@tools:~$ hadoop fs -put Desktop/logs/* /log1

8. List all files in an HDFS directory

Enter the following command in the terminal window:

hadoop fs -ls /log1

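The output is shown below; it lists every file in the /log1 directory on HDFS together with its permissions, replication factor, owner, group, size in bytes, modification time, and path: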

dolphin@tools:~$ hadoop fs -ls /log1
Found 1 items
-rw-r--r--   1 dolphin supergroup        976 2019-11-17 14:47 /log1/access.1556972279.log
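
One way to confirm that the upload is complete is to compare the size reported by -ls with the size of the local file. The sketch below assumes the eight-column listing format shown above and paths without spaces. The name hdfs_ls_sizes is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: read `hadoop fs -ls` output on standard input and print
# "<size in bytes> <path>" for each entry. The "Found N items" header has
# fewer than eight fields and is skipped.
hdfs_ls_sizes() {
    awk 'NF >= 8 { print $5, $NF }'
}

# On the lab machine (assumes hadoop is on the PATH and HDFS is running):
#   hadoop fs -ls /log1 | hdfs_ls_sizes
#   wc -c < Desktop/logs/access.1556972279.log   # local size, for comparison
```

For the listing above, the helper prints 976 /log1/access.1556972279.log, which should match the byte count of the local file.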

9. View HDFS file contents with the cat command

Enter the following command in the terminal window:

hadoop fs -cat /log1/*

The output is shown below; the contents of the HDFS file are printed:

dolphin@tools:~$ hadoop fs -cat /log1/*
2400,22,vm-914,7,5,49.2,8.9,0.8,10.0,4.9,0.4,4.7,0.3,4.6,4.3,4.8,2.1,0.3,6.9,2019-09-20 23:50:00
7361,22,vm-914,17,0,58.3,1.4,9.4,1.4,10.6,1.7,5.0,9.4,8.2,10.2,5.4,0.2,8.7,3.8,2019-09-20 23:51:00
7363,22,vm-914,9,1,49.7,10.5,6.3,9.5,0.1,10.4,2.8,8.5,4.7,0.2,7.5,5.4,10.7,6.6,2019-09-20 23:52:00
7361,22,vm-914,1,2,41.0,5.4,8.5,8.2,3.2,3.5,0.8,2.5,9.0,4.5,9.6,6.6,2.9,7.2,2019-09-20 23:53:00
7361,22,vm-914,21,4,55.2,3.8,9.9,4.0,6.4,5.6,8.2,6.9,6.5,1.9,5.6,6.3,3.2,2.6,2019-09-20 23:54:00
7361,22,vm-914,4,0,57.5,7.6,1.9,2.3,10.7,9.6,1.2,6.7,10.4,5.9,7.2,2.8,10.7,4.7,2019-09-20 23:55:00
7363,22,vm-914,6,3,41.4,5.4,7.7,3.0,0.5,0.5,4.3,1.5,9.8,3.8,5.9,8.1,6.4,0.1,2019-09-20 23:56:00
7361,22,vm-914,20,0,55.9,6.8,6.7,4.6,10.0,10.3,9.2,0.6,3.3,0.9,9.5,4.1,7.8,7.4,2019-09-20 23:57:00
7361,22,vm-914,9,4,54.4,6.6,4.5,7.4,8.9,4.7,2.2,5.3,7.2,2.7,5.8,3.4,7.8,10.6,2019-09-20 23:58:00
7363,22,vm-914,10,2,44.3,3.6,3.7,0.6,9.6,4.7,4.6,7.0,5.9,6.4,5.0,7.4,7.7,3.7,2019-09-20 23:59:00
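
Note that /log1/* is seen by the local shell first. In sh and bash with default settings, a pattern that matches no local files is passed through unchanged, so hadoop receives the literal /log1/* and expands it against HDFS. If a local /log1 directory containing files existed, the shell would substitute local paths instead. Quoting the pattern avoids the ambiguity. The sketch below demonstrates the shell behaviour with a temporary local directory; show_glob_behaviour and the directory name demo are illustrative only.

```shell
#!/bin/sh
# Illustrative only: show how an unquoted glob changes once a matching local
# directory exists, and how quoting keeps it literal.
show_glob_behaviour() (
    tmp=$(mktemp -d) && cd "$tmp" || exit 1
    printf '%s\n' demo/*      # no local match: the pattern is passed through
    mkdir demo && : > demo/a.log
    printf '%s\n' demo/*      # local match: the shell substitutes demo/a.log
    printf '%s\n' 'demo/*'    # quoted: always passed through literally
    cd / && rm -rf "$tmp"
)

show_glob_behaviour

# Safer form of the command in this step:
#   hadoop fs -cat '/log1/*'
```

Shells configured differently (for example zsh, or bash with nullglob or failglob set) treat an unmatched pattern in other ways, which is a further reason to quote it.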

10. View HDFS file contents with the text command

Enter the following command in the terminal window:

hadoop fs -text /log1/*

The output is shown below; for this plain-text file it is identical to the output of cat:

dolphin@tools:~$ hadoop fs -text /log1/*
2400,22,vm-914,7,5,49.2,8.9,0.8,10.0,4.9,0.4,4.7,0.3,4.6,4.3,4.8,2.1,0.3,6.9,2019-09-20 23:50:00
7361,22,vm-914,17,0,58.3,1.4,9.4,1.4,10.6,1.7,5.0,9.4,8.2,10.2,5.4,0.2,8.7,3.8,2019-09-20 23:51:00
7363,22,vm-914,9,1,49.7,10.5,6.3,9.5,0.1,10.4,2.8,8.5,4.7,0.2,7.5,5.4,10.7,6.6,2019-09-20 23:52:00
7361,22,vm-914,1,2,41.0,5.4,8.5,8.2,3.2,3.5,0.8,2.5,9.0,4.5,9.6,6.6,2.9,7.2,2019-09-20 23:53:00
7361,22,vm-914,21,4,55.2,3.8,9.9,4.0,6.4,5.6,8.2,6.9,6.5,1.9,5.6,6.3,3.2,2.6,2019-09-20 23:54:00
7361,22,vm-914,4,0,57.5,7.6,1.9,2.3,10.7,9.6,1.2,6.7,10.4,5.9,7.2,2.8,10.7,4.7,2019-09-20 23:55:00
7363,22,vm-914,6,3,41.4,5.4,7.7,3.0,0.5,0.5,4.3,1.5,9.8,3.8,5.9,8.1,6.4,0.1,2019-09-20 23:56:00
7361,22,vm-914,20,0,55.9,6.8,6.7,4.6,10.0,10.3,9.2,0.6,3.3,0.9,9.5,4.1,7.8,7.4,2019-09-20 23:57:00
7361,22,vm-914,9,4,54.4,6.6,4.5,7.4,8.9,4.7,2.2,5.3,7.2,2.7,5.8,3.4,7.8,10.6,2019-09-20 23:58:00
7363,22,vm-914,10,2,44.3,3.6,3.7,0.6,9.6,4.7,4.6,7.0,5.9,6.4,5.0,7.4,7.7,3.7,2019-09-20 23:59:00
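
The two commands differ for encoded files: -text decodes formats it recognises, such as gzip-compressed files, before printing, whereas -cat prints the stored bytes unchanged. The sketch below reproduces that distinction locally with gzip, as an analogy rather than a call into Hadoop. The file names and the sample record are illustrative.

```shell
#!/bin/sh
# Local analogy (does not call Hadoop): cat returns the stored, compressed
# bytes, while decompressing first returns the original text, which is what
# `hadoop fs -text` does for a gzip file.
tmp=$(mktemp -d)
printf '2400,22,vm-914\n' > "$tmp/sample.log"
gzip -c "$tmp/sample.log" > "$tmp/sample.log.gz"

raw_differs=no
cmp -s "$tmp/sample.log" "$tmp/sample.log.gz" || raw_differs=yes
decoded=$(gzip -dc "$tmp/sample.log.gz")

echo "raw bytes differ from the text: $raw_differs"
echo "decoded: $decoded"
rm -rf "$tmp"

# On the lab machine, with a hypothetical compressed file in HDFS:
#   hadoop fs -text /log1/example.log.gz
```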

11. View the end of an HDFS file with the tail command

Enter the following command in the terminal window:

hadoop fs -tail /log1/access.1556972279.log

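The output is shown below. The tail command prints the last kilobyte of a file; because this file is only 976 bytes long, the whole file is printed: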

dolphin@tools:~$ hadoop fs -tail /log1/access.1556972279.log
2400,22,vm-914,7,5,49.2,8.9,0.8,10.0,4.9,0.4,4.7,0.3,4.6,4.3,4.8,2.1,0.3,6.9,2019-09-20 23:50:00
7361,22,vm-914,17,0,58.3,1.4,9.4,1.4,10.6,1.7,5.0,9.4,8.2,10.2,5.4,0.2,8.7,3.8,2019-09-20 23:51:00
7363,22,vm-914,9,1,49.7,10.5,6.3,9.5,0.1,10.4,2.8,8.5,4.7,0.2,7.5,5.4,10.7,6.6,2019-09-20 23:52:00
7361,22,vm-914,1,2,41.0,5.4,8.5,8.2,3.2,3.5,0.8,2.5,9.0,4.5,9.6,6.6,2.9,7.2,2019-09-20 23:53:00
7361,22,vm-914,21,4,55.2,3.8,9.9,4.0,6.4,5.6,8.2,6.9,6.5,1.9,5.6,6.3,3.2,2.6,2019-09-20 23:54:00
7361,22,vm-914,4,0,57.5,7.6,1.9,2.3,10.7,9.6,1.2,6.7,10.4,5.9,7.2,2.8,10.7,4.7,2019-09-20 23:55:00
7363,22,vm-914,6,3,41.4,5.4,7.7,3.0,0.5,0.5,4.3,1.5,9.8,3.8,5.9,8.1,6.4,0.1,2019-09-20 23:56:00
7361,22,vm-914,20,0,55.9,6.8,6.7,4.6,10.0,10.3,9.2,0.6,3.3,0.9,9.5,4.1,7.8,7.4,2019-09-20 23:57:00
7361,22,vm-914,9,4,54.4,6.6,4.5,7.4,8.9,4.7,2.2,5.3,7.2,2.7,5.8,3.4,7.8,10.6,2019-09-20 23:58:00
7363,22,vm-914,10,2,44.3,3.6,3.7,0.6,9.6,4.7,4.6,7.0,5.9,6.4,5.0,7.4,7.7,3.7,2019-09-20 23:59:00
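
The one-kilobyte limit can be reproduced locally with tail -c 1024, a rough local equivalent of hadoop fs -tail. In this sketch last_kb is a hypothetical helper, and the two test files are generated on the spot rather than taken from the lab.

```shell
#!/bin/sh
# Hypothetical helper: print the last 1024 bytes of a local file, roughly
# what `hadoop fs -tail` does for a file in HDFS.
last_kb() {
    tail -c 1024 "$1"
}

tmp=$(mktemp -d)
# A 976-byte file (the size of the log in this lab) and a 2000-byte file.
head -c 976 /dev/zero | tr '\0' 'x' > "$tmp/short"
head -c 2000 /dev/zero | tr '\0' 'x' > "$tmp/long"

short_bytes=$(last_kb "$tmp/short" | wc -c)
long_bytes=$(last_kb "$tmp/long" | wc -c)
echo "short file: $((short_bytes)) bytes printed"   # the whole file
echo "long file: $((long_bytes)) bytes printed"     # only the last kilobyte
rm -rf "$tmp"
```

A file shorter than 1024 bytes comes back whole, while a longer one is cut to its final 1024 bytes, which may begin in the middle of a line.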

This concludes the lab. Move on to the next one.