Quick, blunt, and easy to follow.
System environment: Hadoop 2.6.0
zookeeper 3.4.5
CentOS 6.5
| Hostname | IP | Roles |
| --- | --- | --- |
| canbot130 | 192.168.186.130 | active NameNode, DataNode, Zookeeper, ResourceManager, NodeManager |
| canbot131 | 192.168.186.131 | Standby NameNode, DataNode, Zookeeper, NodeManager |
| canbot132 | 192.168.186.132 | Standby NameNode, DataNode, Zookeeper, NodeManager |
Disable the firewall
[code]chkconfig iptables off
Check on each node that the firewall is stopped:
[code][hadoop@canbot130 ~]$ sudo service iptables status
"iptables: Firewall is not running."
Users and SSH configuration
Create the hadoop user on each node and configure passwordless SSH between them; a rough sketch follows.
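The full user and SSH setup is not reproduced here; as a sketch, assuming the hadoop user and the three hosts used throughout this post:

[code]# on every node, as root: create the hadoop user
useradd hadoop && passwd hadoop

# on canbot130, as hadoop: generate a key and push it to all nodes
# (repeat from canbot131 as well, since both NameNodes fence each other over SSH)
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
for host in canbot130 canbot131 canbot132; do
  ssh-copy-id hadoop@$host
done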
Zookeeper
Download Zookeeper.
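No download link was given; as an assumed example, the CDH build referenced by the paths below can typically be fetched from Cloudera's archive (the URL is an assumption, substitute your own mirror):

[code]# assumed download location for the zookeeper-3.4.5-cdh5.6.0 tarball
wget http://archive.cloudera.com/cdh5/cdh/5/zookeeper-3.4.5-cdh5.6.0.tar.gz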
Extract it:
[code]tar -zxvf zookeeper-3.4.5-cdh5.6.0.tar.gz -C /home/hadoop/development/src/
Edit the Zookeeper configuration. Copy conf/zoo_sample.cfg to a new file named "zoo.cfg":
[code]cp ./conf/zoo_sample.cfg ./conf/zoo.cfg
The main settings to change:
[code]tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/hadoop/development/src/zookeeper-3.4.5-cdh5.6.0/data
dataLogDir=/home/hadoop/development/src/zookeeper-3.4.5-cdh5.6.0/logs
clientPort=2181 # client connection port
server.1=192.168.186.130:2888:3888
server.2=192.168.186.131:2888:3888
server.3=192.168.186.132:2888:3888
# IP addresses are used here; hostnames work as well
Once configured, copy it to every node and create the logs directory and the myid file.
[code]# create the data and logs directories, then the myid file
mkdir -p /home/hadoop/development/src/zookeeper-3.4.5-cdh5.6.0/data
mkdir -p /home/hadoop/development/src/zookeeper-3.4.5-cdh5.6.0/logs
echo 1 > /home/hadoop/development/src/zookeeper-3.4.5-cdh5.6.0/data/myid
# on 192.168.186.131 the myid is "2", matching server.2=192.168.186.131:2888:3888 in zoo.cfg
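To push the configured Zookeeper directory to the other two nodes and write their myid values in one pass, something like the following works (a sketch, assuming identical paths on every host):

[code]ZK_HOME=/home/hadoop/development/src/zookeeper-3.4.5-cdh5.6.0
scp -r $ZK_HOME hadoop@canbot131:/home/hadoop/development/src/
scp -r $ZK_HOME hadoop@canbot132:/home/hadoop/development/src/
# each myid must match that host's server.N line in zoo.cfg
ssh hadoop@canbot131 "echo 2 > $ZK_HOME/data/myid"
ssh hadoop@canbot132 "echo 3 > $ZK_HOME/data/myid"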
Configure Hadoop HA
The main files to modify are core-site.xml and hdfs-site.xml.
core-site.xml
[code]<configuration>
<property>
<!-- NameNode address. With HA there are two NameNodes, so this must point to the logical nameservice rather than a single host -->
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<!-- Zookeeper quorum used for HA coordination -->
<name>ha.zookeeper.quorum</name>
<value>canbot130:2181,canbot131:2181,canbot132:2181</value>
</property>
<property>
<!-- Hadoop temporary/cache directory -->
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/development/src/hadoop-2.6.0-cdh5.6.0/tmp</value>
</property>
</configuration>
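Once the file is in place, a quick sanity check that the nameservice is picked up (hdfs getconf is a standard Hadoop command; the expected output simply reflects the value configured above):

[code][hadoop@canbot130 ~]$ hdfs getconf -confKey fs.defaultFS
hdfs://mycluster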
hdfs-site.xml
[code]<configuration>
<!-- nameservice ID: mycluster -->
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<!-- NameNode IDs nn1 and nn2, defined under the mycluster nameservice -->
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<!-- RPC addresses for nn1 and nn2 -->
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>canbot130:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>canbot131:8020</value>
</property>
<!-- Web UI (HTTP) addresses for nn1 and nn2 -->
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>canbot130:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>canbot131:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://canbot130:8485;canbot131:8485;canbot132:8485/mycluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hadoop/development/src/hadoop-2.6.0-cdh5.6.0/tmp/dfs/journalnode</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/development/src/hadoop-2.6.0-cdh5.6.0/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/development/src/hadoop-2.6.0-cdh5.6.0/tmp/dfs/data</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<!-- enable automatic failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.replication.max</name>
<value>32767</value>
</property>
</configuration>
mapred-site.xml
[code]<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
[code]<configuration>
<!-- Site specific YARN configuration properties-->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>canbot130</value>
</property>
</configuration>
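After editing these files, copy the Hadoop configuration directory to the other nodes so all three machines share the same settings; a sketch assuming the install path used above:

[code]HADOOP_HOME=/home/hadoop/development/src/hadoop-2.6.0-cdh5.6.0
for host in canbot131 canbot132; do
  scp -r $HADOOP_HOME/etc/hadoop hadoop@$host:$HADOOP_HOME/etc/
done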
Start Zookeeper
Start it on canbot130, canbot131, and canbot132 in turn:
[code]zkServer.sh start
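You can confirm the ensemble is up with zkServer.sh status on each node; one node should report leader and the other two follower:

[code]zkServer.sh status
# expect "Mode: leader" on one node and "Mode: follower" on the other two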
Format the HA state in Zookeeper on canbot130:
[code]hdfs zkfc -formatZK
Verify that the format succeeded: if a new hadoop-ha znode has appeared, it worked.
[code]zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[hadoop-ha, zookeeper]
[zk: localhost:2181(CONNECTED) 1]
Start the JournalNode cluster
Run the following on canbot130, canbot131, and canbot132 in turn:
[code]hadoop-daemon.sh start journalnode
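Running jps on each node should now show a JournalNode process:

[code]jps
# expect JournalNode (and QuorumPeerMain from Zookeeper) on all three nodes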
Format one NameNode of the cluster
[code]hdfs namenode -format
Start the NameNode that was just formatted:
[code]hadoop-daemon.sh start namenode
After running the command, browse to http://canbot130:50070/dfshealth.jsp to see its status.
On canbot131, copy the metadata from canbot130 over to canbot131 by running:
[code]hdfs namenode -bootstrapStandby
Start the NameNode on canbot131:
[code]hadoop-daemon.sh start namenode
Start YARN on canbot130
[code]start-yarn.sh
Then browse to http://canbot130:8088/cluster to see the result.
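The registered NodeManagers can also be listed from the command line:

[code]yarn node -list
# should report the NodeManagers on canbot130, canbot131, and canbot132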
Start zkfc
Start the ZKFailoverController (zkfc) by running the following command on canbot130 and then canbot131. Afterwards, refresh the web UI on port 50070: canbot130 should now show as active, while canbot131 remains standby.
[code]hadoop-daemon.sh start zkfc
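The HA state can also be queried from the command line with hdfs haadmin, using the NameNode IDs nn1 and nn2 defined in hdfs-site.xml:

[code][hadoop@canbot130 ~]$ hdfs haadmin -getServiceState nn1
active
[hadoop@canbot130 ~]$ hdfs haadmin -getServiceState nn2
standby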