
Hadoop Installation in Practice

2023-09-14 09:01:04
1. Prepare the machines

Prepare one master and several slaves, and configure /etc/hosts on every machine so that all machines can reach each other by hostname, for example:

      172.16.200.4  node1    # master
      172.16.200.5  node2    # slave1
      172.16.200.6  node3    # slave2
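Before going further, it is worth confirming that each host resolves the others by name; a minimal check, assuming the hosts entries above (repeat in each direction between all three nodes):

      ping -c 1 node2    # from node1; likewise for node3, and from each slave back to node1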


2. Create a hadoop user (all three machines)

All of the steps below run as a dedicated hadoop user. To give it sudo rights, edit the /etc/sudoers file, find the line for root, and add a matching line for hadoop below it, as shown:

      root    ALL=(ALL)     ALL
      hadoop  ALL=(ALL)     ALL

3. Configure passwordless SSH login (all three machines)

Once Hadoop is started, the NameNode uses SSH (Secure Shell) to start and stop the daemons on each DataNode. This requires running commands between nodes without typing a password, so we configure SSH to use passwordless public-key authentication.
Taking the three machines in this article as an example: node1 is the master and needs to connect to node2 and node3. Make sure SSH is installed on every machine and that the sshd service is running on the DataNode machines.

(Note: [hadoop@node1 ~]$ ssh-keygen -t rsa
This command generates a key pair for the hadoop user. Press Enter to accept the default path when asked where to save the key, and press Enter again when prompted for a passphrase, leaving it empty. The resulting key pair, id_rsa and id_rsa.pub, is stored under /home/hadoop/.ssh by default. Then append the contents of id_rsa.pub to the /home/hadoop/.ssh/authorized_keys file on every machine, including the local one. If authorized_keys already exists on a machine, append id_rsa.pub to the end of it; if not, simply copy the file over.)
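Where available, ssh-copy-id automates the append step described above. A minimal sketch, run from node1 as the hadoop user while password logins still work (the loop covers the local machine too):

      for h in node1 node2 node3; do
          ssh-copy-id hadoop@$h    # appends ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys on $h
      done
      ssh node2 hostname           # should print "node2" with no password prompt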

4. Install the JDK (all three machines)

Add the following to /etc/profile (JDK 1.7.0_67 installed under /usr/java is assumed), then reload it:

export JAVA_HOME=/usr/java/jdk1.7.0_67
export JRE_HOME=/usr/java/jdk1.7.0_67/jre
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

source /etc/profile
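A quick sanity check that the JDK is visible after reloading the profile (output assumes the 1.7.0_67 paths above):

      java -version      # should report java version "1.7.0_67"
      echo $JAVA_HOME    # /usr/java/jdk1.7.0_67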


5. Install Hadoop
Start with the downloaded hadoop-2.6.4.tar.gz tarball.

1. Extract the tarball:

tar -xzvf hadoop-2.6.4.tar.gz
[hadoop@node1 hadoop-2.6.4]$ ls
bin  data  etc  include  lib  libexec  LICENSE.txt  logs  name  NOTICE.txt  README.txt  sbin  share  var
[hadoop@node1 hadoop-2.6.4]$ pwd
/home/hadoop/hadoop-2.6.4

2. Before configuring, create the following directories on the local filesystem (they are referenced by the XML files below, and appear in the listing above):

/home/hadoop/hadoop-2.6.4/name
/home/hadoop/hadoop-2.6.4/data
/home/hadoop/hadoop-2.6.4/var
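A one-line sketch of creating all three, run from the install directory:

      [hadoop@node1 hadoop-2.6.4]$ mkdir -p name data var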
3. Edit the configuration files. Seven files are involved, all under the /home/hadoop/hadoop-2.6.4/etc/hadoop folder: hadoop-env.sh, yarn-env.sh, slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml.
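4.1/4.2. In hadoop-env.sh and yarn-env.sh, JAVA_HOME should be set explicitly, because daemons launched over SSH do not necessarily inherit it from /etc/profile. A minimal sketch, assuming the JDK path from step 4:

      # add to both etc/hadoop/hadoop-env.sh and etc/hadoop/yarn-env.sh
      export JAVA_HOME=/usr/java/jdk1.7.0_67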


4.3. Configure the slaves file with the hosts that run DataNodes/NodeManagers (all three nodes here, matching the three live DataNodes reported later):

node1
node2
node3
4.4. Configure core-site.xml: add the Hadoop core settings (the HDFS port is 9000; the temporary directory is file:/home/hadoop/hadoop-2.6.4/var):

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node1:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/hadoop-2.6.4/var</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.spark.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.spark.groups</name>
    <value>*</value>
  </property>
</configuration>

4.5. Configure hdfs-site.xml: add the HDFS settings (NameNode/DataNode ports and directory locations):
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node1:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/hadoop-2.6.4/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/hadoop-2.6.4/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>

4.6. Configure mapred-site.xml: add the MapReduce settings (use the YARN framework; JobHistory server address and web UI address):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>node1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>node1:19888</value>
  </property>
</configuration>

4.7. Configure yarn-site.xml: add the YARN settings (the shuffle service and the ResourceManager addresses):

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>node1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>node1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>node1:8035</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>node1:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>node1:8088</value>
  </property>
</configuration>

5. Copy the configured Hadoop directory to the other two slave machines:

scp -r hadoop-2.6.4 hadoop@node2:/home/hadoop/
scp -r hadoop-2.6.4 hadoop@node3:/home/hadoop/
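A quick check that the copies landed, assuming the passwordless SSH set up in step 3:

      ssh node2 ls /home/hadoop/hadoop-2.6.4/etc/hadoop/core-site.xml
      ssh node3 ls /home/hadoop/hadoop-2.6.4/etc/hadoop/core-site.xml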
6. Verification

1. Format the NameNode (run once, on node1):

./bin/hdfs namenode -format


2. Start the HDFS daemons with sbin/start-dfs.sh (the console output below is from a capture on hosts S1PA11/S1PA222 running hadoop-2.6.0):

15/01/05 16:41:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [S1PA11]
S1PA11: starting namenode, logging to /home/spark/opt/hadoop-2.6.0/logs/hadoop-spark-namenode-S1PA11.out
S1PA222: starting datanode, logging to /home/spark/opt/hadoop-2.6.0/logs/hadoop-spark-datanode-S1PA222.out
Starting secondary namenodes [S1PA11]
S1PA11: starting secondarynamenode, logging to /home/spark/opt/hadoop-2.6.0/logs/hadoop-spark-secondarynamenode-S1PA11.out
15/01/05 16:41:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
3. Stop the HDFS daemons with sbin/stop-dfs.sh (same capture):

15/01/05 16:40:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [S1PA11]
S1PA11: stopping namenode
S1PA222: stopping datanode
Stopping secondary namenodes [S1PA11]
S1PA11: stopping secondarynamenode
4. Start YARN with sbin/start-yarn.sh (same capture):

15/01/05 16:40:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting resourcemanager, logging to /home/spark/opt/hadoop-2.6.0/logs/yarn-spark-resourcemanager-S1PA11.out
S1PA222: starting nodemanager, logging to /home/spark/opt/hadoop-2.6.0/logs/yarn-spark-nodemanager-S1PA222.out
5. Check the cluster status:

[hadoop@node1 hadoop-2.6.4]$ ./bin/hdfs dfsadmin -report
16/05/26 10:51:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 56338194432 (52.47 GB)
Present Capacity: 42922237952 (39.97 GB)
DFS Remaining: 42922164224 (39.97 GB)
DFS Used: 73728 (72 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Live datanodes (3):

Name: 172.16.200.4:50010 (node1)
Hostname: node1
Decommission Status : Normal
Configured Capacity: 18779398144 (17.49 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 4559396864 (4.25 GB)
DFS Remaining: 14219976704 (13.24 GB)
DFS Used%: 0.00%
DFS Remaining%: 75.72%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu May 26 10:51:35 CST 2016

Name: 172.16.200.5:50010 (node2)
Hostname: node2
Decommission Status : Normal
Configured Capacity: 18779398144 (17.49 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 4369121280 (4.07 GB)
DFS Remaining: 14410252288 (13.42 GB)
DFS Used%: 0.00%
DFS Remaining%: 76.73%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu May 26 10:51:35 CST 2016

Name: 172.16.200.6:50010 (node3)
Hostname: node3
Decommission Status : Normal
Configured Capacity: 18779398144 (17.49 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 4487438336 (4.18 GB)
DFS Remaining: 14291935232 (13.31 GB)
DFS Used%: 0.00%
DFS Remaining%: 76.10%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu May 26 10:51:35 CST 2016

6. View the HDFS web UI: http://172.16.200.4:50070/


7. View the ResourceManager web UI: http://172.16.200.4:8088/
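As a final end-to-end check, a small file can be round-tripped through HDFS; a minimal sketch, assuming the daemons above are running (the /user/hadoop path is just an example):

      [hadoop@node1 hadoop-2.6.4]$ echo "hello hadoop" > /tmp/hello.txt
      [hadoop@node1 hadoop-2.6.4]$ ./bin/hdfs dfs -mkdir -p /user/hadoop
      [hadoop@node1 hadoop-2.6.4]$ ./bin/hdfs dfs -put /tmp/hello.txt /user/hadoop/
      [hadoop@node1 hadoop-2.6.4]$ ./bin/hdfs dfs -cat /user/hadoop/hello.txt
      hello hadoop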
