Required environment: four hosts (the author uses four VMware virtual machines instead), CentOS 6.5, the hadoop-2.7.1 package, and jdk1.8.0_91.
Preparation: create the four virtual machines and give them network access in NAT mode.
1) Install Hadoop and the JDK on all four virtual machines;
2) Change each host's hostname:
On [master]: gedit /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master
NTPSERVERARGS=iburst
On [slave1]: gedit /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave1
NTPSERVERARGS=iburst
On [slave2]: gedit /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave2
NTPSERVERARGS=iburst
On [slave3]: gedit /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave3
NTPSERVERARGS=iburst
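Note that the change to /etc/sysconfig/network only takes effect after a reboot. To apply it immediately (a companion step not in the original), the hostname command can be run on each host with its own name, for example on master:
[zq@master ~]$ sudo hostname master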
3) Edit each host's hosts file:
[zq@master ~]$ sudo gedit /etc/hosts
[sudo] password for zq:
Add the following entries to the hosts file:
192.168.44.142 master
192.168.44.140 slave1
192.168.44.143 slave2
192.168.44.141 slave3
Do the same on slave1, slave2, and slave3, adding the same IP entries to their hosts files.
4) Turn off the firewall on every host: sudo service iptables stop
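Two optional checks, not part of the original steps: verify that name resolution now works, and keep the firewall off across reboots:
[zq@master ~]$ ping -c 1 slave1
[zq@master ~]$ sudo chkconfig iptables off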
Beginning the configuration and installation:
1. Set up passwordless SSH login. As shown in the structure diagram above, it is enough for master to be able to reach slave1, slave2, and slave3 without a password. The details are not covered in depth here; a minimal sketch follows.
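One common way to do this, assuming the same user zq exists on every host: generate a key pair on master and push the public key to each slave, then confirm that a remote command runs without a password prompt.
[zq@master ~]$ ssh-keygen -t rsa          # accept the defaults, empty passphrase
[zq@master ~]$ ssh-copy-id zq@slave1
[zq@master ~]$ ssh-copy-id zq@slave2
[zq@master ~]$ ssh-copy-id zq@slave3
[zq@master ~]$ ssh slave1 hostname        # should print slave1 with no password prompt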
2. Create a new file fairscheduler.xml in Hadoop's configuration directory:
[zq@master ~]$ cd /home/zq/soft/hadoop-2.7.1/etc/hadoop/
[zq@master hadoop]$ touch fairscheduler.xml
[zq@master hadoop]$ gedit fairscheduler.xml
Fill in fairscheduler.xml as follows:
<?xml version="1.0"?>
<allocations>
  <queue name="infrastructure">
    <minResources>102400 mb, 50 vcores</minResources>
    <maxResources>153600 mb, 100 vcores</maxResources>
    <maxRunningApps>200</maxRunningApps>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
    <weight>1.0</weight>
    <aclSubmitApps>root,yarn,search,hdfs,zq</aclSubmitApps>
  </queue>
  <queue name="tools">
    <minResources>102400 mb, 30 vcores</minResources>
    <maxResources>153600 mb, 50 vcores</maxResources>
  </queue>
  <queue name="sentiment">
    <minResources>102400 mb, 30 vcores</minResources>
    <maxResources>153600 mb, 50 vcores</maxResources>
  </queue>
</allocations>
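Once the whole cluster is up, a job can be directed at one of these queues. As an illustrative example (the infrastructure queue is used here because its aclSubmitApps list includes zq):
[zq@master hadoop-2.7.1]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi -Dmapreduce.job.queuename=infrastructure 2 10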
3. Configure the Hadoop file core-site.xml:
[zq@master hadoop-2.7.1]$ sudo gedit core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:8020</value>
    <description>master=8020</description>
  </property>
</configuration>
fs.default.name defines the NameNode URL and port; replace master with your own hostname or address.
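Note that fs.default.name is the deprecated predecessor of fs.defaultFS in Hadoop 2.x; it still works but logs a deprecation warning. The effective value can be checked later, for example:
[zq@master hadoop-2.7.1]$ bin/hdfs getconf -confKey fs.default.name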
4. Configure the Hadoop file hdfs-site.xml:
[zq@master hadoop-2.7.1]$ sudo gedit hdfs-site.xml
<configuration>
  <!-- Define the two nameservices, cluster1 and cluster2 -->
  <property>
    <name>dfs.nameservices</name>
    <value>cluster1,cluster2</value>
    <description>nameservices</description>
  </property>
  <!-- config cluster1: assign NameNodes nn1 and nn2 -->
  <property>
    <name>dfs.ha.namenodes.cluster1</name>
    <value>nn1,nn2</value>
    <description>namenodes.cluster1</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster1.nn1</name>
    <value>master:8020</value>
    <description>rpc-address.cluster1.nn1</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster1.nn2</name>
    <value>slave1:8020</value>
    <description>rpc-address.cluster1.nn2</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.nn1</name>
    <value>master:50070</value>
    <description>http-address.cluster1.nn1</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.nn2</name>
    <value>slave1:50070</value>
    <description>http-address.cluster1.nn2</description>
  </property>
  <!-- config cluster2: assign NameNodes nn3 and nn4 -->
  <property>
    <name>dfs.ha.namenodes.cluster2</name>
    <value>nn3,nn4</value>
    <description>namenodes.cluster2</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster2.nn3</name>
    <value>slave2:8020</value>
    <description>rpc-address.cluster2.nn3</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster2.nn4</name>
    <value>slave3:8020</value>
    <description>rpc-address.cluster2.nn4</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster2.nn3</name>
    <value>slave2:50070</value>
    <description>http-address.cluster2.nn3</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster2.nn4</name>
    <value>slave3:50070</value>
    <description>http-address.cluster2.nn4</description>
  </property>
  <!-- NameNode storage path; change it to your own -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/zq/soft/hadoop-2.7.1/hdfs/name</value>
    <description>dfs.namenode.name.dir</description>
  </property>
  <!-- Important: per the structure diagram, master and slave1 share
       cluster1, while slave2 and slave3 share cluster2, so on slave2
       and slave3 the suffix below must be /cluster2 instead -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://slave1:8485;slave2:8485;slave3:8485/cluster1</value>
    <description>shared.edits.dir</description>
  </property>
  <!-- DataNode storage path; change it to your own -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/zq/soft/hadoop-2.7.1/hdfs/data</value>
    <description>data.dir</description>
  </property>
</configuration>
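Once this file is distributed and HDFS is up, the NameNode assignments can be sanity-checked from master (a verification step not in the original):
[zq@master hadoop-2.7.1]$ bin/hdfs getconf -namenodes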
5. Configure the Hadoop file yarn-site.xml:
[zq@master hadoop-2.7.1]$ sudo gedit yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
    <description>hostname=master</description>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8032</value>
    <description>address=master:8032</description>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>${yarn.resourcemanager.hostname}:8030</value>
    <description>scheduler.address=master:8030</description>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>${yarn.resourcemanager.hostname}:8088</value>
    <description>webapp.address=master:8088</description>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address</name>
    <value>${yarn.resourcemanager.hostname}:8090</value>
    <description>webapp.https.address=master:8090</description>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>${yarn.resourcemanager.hostname}:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>${yarn.resourcemanager.hostname}:8033</value>
    <description>admin.address=master:8033</description>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    <description>use the fair scheduler</description>
  </property>
  <property>
    <name>yarn.scheduler.fair.allocation.file</name>
    <value>${yarn.home.dir}/etc/hadoop/fairscheduler.xml</value>
    <description>fair scheduler allocation file</description>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/home/zq/soft/hadoop-2.7.1/yarn/local</value>
    <description>nodemanager.local-dirs</description>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
    <description>yarn.log-aggregation-enable</description>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/logs</value>
    <description>remote-app-log-dir=/tmp/logs</description>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>30720</value>
    <description>resource.memory-mb=30720</description>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>12</value>
    <description>resource.cpu-vcores=12</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>aux-services</description>
  </property>
</configuration>
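After YARN is started, the fair-scheduler queues defined earlier should be visible in the ResourceManager web UI at http://master:8088/cluster/scheduler, or queried from the command line via the RM REST API, for example:
[zq@master ~]$ curl http://master:8088/ws/v1/cluster/scheduler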
6. Configure the Hadoop file mapred-site.xml:
[zq@master hadoop-2.7.1]$ sudo gedit mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>run MapReduce on yarn</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>slave1:10020</value>
    <description>slave1:10020</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>slave1:19888</value>
    <description>slave1:19888</description>
  </property>
</configuration>
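Since the job history server is placed on slave1, it has to be started there. One way to do that (assuming the same install path on slave1) is:
[zq@slave1 hadoop-2.7.1]$ sbin/mr-jobhistory-daemon.sh start historyserver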
7. Configure the Hadoop file hadoop-env.sh:
[zq@master hadoop-2.7.1]$ sudo gedit hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/home/zq/soft/jdk1.8.0_91
8. Configure the Hadoop file slaves:
[zq@master hadoop-2.7.1]$ gedit slaves
slave1
slave2
slave3
9. Use scp to copy all of the configuration files above to slave1, slave2, and slave3:
[zq@master etc]$ scp hadoop/* zq@slave1:/home/zq/soft/hadoop-2.7.1/etc/hadoop
[zq@master etc]$ scp hadoop/* zq@slave2:/home/zq/soft/hadoop-2.7.1/etc/hadoop
[zq@master etc]$ scp hadoop/* zq@slave3:/home/zq/soft/hadoop-2.7.1/etc/hadoop
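A quick way to confirm the copies landed intact (an assumed check, not part of the original steps) is to compare checksums of a file on both ends:
[zq@master etc]$ md5sum hadoop/hdfs-site.xml
[zq@master etc]$ ssh slave1 md5sum /home/zq/soft/hadoop-2.7.1/etc/hadoop/hdfs-site.xml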
10. At this point, all of the Hadoop configuration is complete. Next, start the Hadoop services:
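A minimal sketch of one possible startup order for this JournalNode-based layout, shown for cluster1 only; cluster2 is formatted and started the same way on slave2 and slave3, and with federation both nameservices should be formatted with the same cluster id (myclusterid below is a placeholder):
On slave1, slave2, and slave3, start a JournalNode each:
[zq@slave1 hadoop-2.7.1]$ sbin/hadoop-daemon.sh start journalnode
On master, format and start the first NameNode of cluster1:
[zq@master hadoop-2.7.1]$ bin/hdfs namenode -format -clusterId myclusterid
[zq@master hadoop-2.7.1]$ sbin/hadoop-daemon.sh start namenode
On slave1, sync the second NameNode from the first and start it:
[zq@slave1 hadoop-2.7.1]$ bin/hdfs namenode -bootstrapStandby
[zq@slave1 hadoop-2.7.1]$ sbin/hadoop-daemon.sh start namenode
Finally, from master, start the DataNodes (on the hosts listed in slaves) and YARN:
[zq@master hadoop-2.7.1]$ sbin/hadoop-daemons.sh start datanode
[zq@master hadoop-2.7.1]$ sbin/start-yarn.sh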
11. A simple pi test:
[zq@master hadoop-2.7.1]$ bin/hadoop jar \
share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 2 1000
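If the cluster is healthy, the job should run through YARN and finish by printing a line of the form "Estimated value of Pi is ...", and the completed job should then appear in the job history web UI at slave1:19888.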