Official installation guide: http://www.cloudera.com/documentation/enterprise/5-6-x/topics/installation.html
Download locations for the required packages:
Cloudera Manager: http://archive.cloudera.com/cm5/cm/5/
CDH parcels: http://archive.cloudera.com/cdh5/parcels/5.6.0/
Our operating system is Ubuntu 14.04 (trusty), so we need to download the following files:
[code]CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel
CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel.sha1
manifest.json
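The whole installation is done as root.
Machine configuration
1. The three machines have the following IPs and hostnames:
192.168.10.236 hadoop-1 (16 GB RAM)
192.168.10.237 hadoop-2 (8 GB RAM)
192.168.10.238 hadoop-3 (8 GB RAM)
We use hadoop-1 as the master node.
2. Configure /etc/hosts on every node so that the nodes can reach each other as hadoop-X.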
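A minimal sketch of the hosts-file step, assuming you want to add the entries only when they are missing. The helper name add_host_entry is hypothetical and is not part of any Cloudera tooling.

```shell
#!/bin/sh
# Sketch: append "IP hostname" to a hosts file only if the hostname is not there yet.
# add_host_entry is a hypothetical helper name.
add_host_entry() {
    hosts_file=$1; ip=$2; name=$3
    # -w matches whole words (hadoop-1 does not match hadoop-10), -F disables regex.
    if ! grep -qwF -- "$name" "$hosts_file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Usage (as root, on every node):
#   add_host_entry /etc/hosts 192.168.10.236 hadoop-1
#   add_host_entry /etc/hosts 192.168.10.237 hadoop-2
#   add_host_entry /etc/hosts 192.168.10.238 hadoop-3
```

Because the helper checks before appending, it can be run repeatedly without duplicating lines.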
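3. Set up passwordless root SSH login from the master node to the other nodes (the reverse direction, slave to master, is not needed).
3.1 On hadoop-1, run ssh-keygen -t rsa -P '' to generate a key pair without a passphrase.
3.2 Append the public key to the authorized keys file: cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
3.3 Copy the authorized_keys file into /root/.ssh/ on hadoop-2 and hadoop-3 so that the master node can log in to the slave nodes without a password.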
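As an alternative to copying authorized_keys by hand, the key can be distributed with ssh-copy-id. This is a sketch rather than the method used above. The run and push_key helpers are hypothetical, and DRY_RUN defaults to 1 so that the commands are only printed.

```shell
#!/bin/sh
# Sketch: push the master's public key to each slave with ssh-copy-id.
# run and push_key are hypothetical helpers; DRY_RUN=1 (default) only prints commands.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

push_key() {
    for host in "$@"; do
        run ssh-copy-id -i /root/.ssh/id_rsa.pub "root@$host"
    done
}

# Usage on hadoop-1:
#   push_key hadoop-2 hadoop-3               # print the commands only
#   DRY_RUN=0; push_key hadoop-2 hadoop-3    # actually run them
```

ssh-copy-id asks for each slave's root password once. Afterwards ssh root@hadoop-2 should log in without a prompt.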
4. Configure the JDK
4.1 Install oracle-j2sdk1.7 (on the master and on the slaves; pick the JDK that matches your CDH version)
[code]$ apt-get install oracle-j2sdk1.7
$ update-alternatives --install /usr/bin/java java /usr/lib/jvm/java-7-oracle-cloudera/bin/java 300
$ update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/java-7-oracle-cloudera/bin/javac 300
4.2 Edit /etc/profile ($ vim /etc/profile)
and append the following at the end:
[code]export JAVA_HOME=/usr/lib/jvm/java-7-oracle-cloudera
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin:$PATH
4.3 Edit /root/.bashrc ($ vim /root/.bashrc)
and append the following at the end:
[code]source /etc/profile
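After logging in again, it can be worth checking that the variables took effect. The sketch below assumes only that JAVA_HOME should be set and that its bin directory should appear on PATH. The helper name check_java_env is hypothetical.

```shell
#!/bin/sh
# Sketch: check that JAVA_HOME is set and that $JAVA_HOME/bin is on PATH.
# check_java_env is a hypothetical helper name.
check_java_env() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    case ":$PATH:" in
        *":$JAVA_HOME/bin:"*) echo "ok: $JAVA_HOME/bin is on PATH" ;;
        *) echo "$JAVA_HOME/bin is missing from PATH"; return 1 ;;
    esac
}

# Usage in a fresh root shell:
#   check_java_env && java -version
```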
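5. Install MariaDB 5.5 (check the official documentation for compatibility; for details, see the article "Installing MariaDB 10.0 on Ubuntu 14.04 and allowing remote access")
5.1 Run $ apt-get install mariadb-server-5.5
5.2 Database settings (recommended by the official documentation)
[code]$ vim /etc/mysql/my.cnf
The officially recommended configuration is: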
[code][mysqld]
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
# symbolic-links = 0
key_buffer = 16M
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space. Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your system
#and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
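Restart the service:
service mysql restart
If the server refuses to start after innodb_log_file_size was changed, stop it, move the old /var/lib/mysql/ib_logfile0 and /var/lib/mysql/ib_logfile1 to a backup location, and start it again.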
5.3 Create the databases
[code]Open the mysql shell: $ mysql -u root -p
Once in the mysql shell, copy and paste the whole block below:
create database amon DEFAULT CHARACTER SET utf8;
grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'amon_password';
grant all on amon.* TO 'amon'@'CDH' IDENTIFIED BY 'amon_password';
create database smon DEFAULT CHARACTER SET utf8;
grant all on smon.* TO 'smon'@'%' IDENTIFIED BY 'smon_password';
grant all on smon.* TO 'smon'@'CDH' IDENTIFIED BY 'smon_password';
create database rman DEFAULT CHARACTER SET utf8;
grant all on rman.* TO 'rman'@'%' IDENTIFIED BY 'rman_password';
grant all on rman.* TO 'rman'@'CDH' IDENTIFIED BY 'rman_password';
create database hmon DEFAULT CHARACTER SET utf8;
grant all on hmon.* TO 'hmon'@'%' IDENTIFIED BY 'hmon_password';
grant all on hmon.* TO 'hmon'@'CDH' IDENTIFIED BY 'hmon_password';
create database hive DEFAULT CHARACTER SET utf8;
grant all on hive.* TO 'hive'@'%' IDENTIFIED BY 'hive_password';
grant all on hive.* TO 'hive'@'CDH' IDENTIFIED BY 'hive_password';
create database oozie DEFAULT CHARACTER SET utf8;
grant all on oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie_password';
grant all on oozie.* TO 'oozie'@'CDH' IDENTIFIED BY 'oozie_password';
create database metastore DEFAULT CHARACTER SET utf8;
grant all on metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive_password';
grant all on metastore.* TO 'hive'@'CDH' IDENTIFIED BY 'hive_password';
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'gaoying' WITH GRANT OPTION;
flush privileges;
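The block above repeats the same two statements for each database, so it can also be generated. The sketch below prints only the create and the '%' grant for each db:user pair. It leaves out the 'CDH' host grants and the root grant. The helper name gen_sql is hypothetical, and the passwords follow the same <user>_password placeholder pattern as above.

```shell
#!/bin/sh
# Sketch: print CREATE DATABASE / GRANT statements for a list of db:user pairs.
# gen_sql is a hypothetical helper; "<user>_password" mirrors the placeholders above.
gen_sql() {
    for pair in "$@"; do
        db=${pair%%:*}      # text before the first colon
        user=${pair#*:}     # text after the first colon
        printf "create database %s DEFAULT CHARACTER SET utf8;\n" "$db"
        printf "grant all on %s.* TO '%s'@'%%' IDENTIFIED BY '%s_password';\n" "$db" "$user" "$user"
    done
    printf "flush privileges;\n"
}

# Usage:
#   gen_sql amon:amon smon:smon rman:rman hmon:hmon hive:hive oozie:oozie metastore:hive \
#       | mysql -u root -p
```

Replace the placeholder passwords with real ones before piping the output into mysql.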
5.4 Install the MariaDB JDBC driver
[code]$ apt-get install libmysql-java
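5.5 Use the Cloudera script to prepare the Cloudera Manager database in MySQL (the script ships inside the Cloudera Manager tarball, so extract the tarball first as described in step 6.1):
[code]$ /opt/cloudera-manager/cm-5.6.0/share/cmf/schema/scm_prepare_database.sh mysql -uroot -p --scm-host localhost scm scm scm_password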
6. Install Cloudera Manager Server and Agents
6.1 Extract the package on the master node
$ mkdir /opt/cloudera-manager
Extract the downloaded
cloudera-manager-trusty-cm5.6.0_amd64.tar.gz
into
/opt/cloudera-manager
6.2 Create the user
[code]$ sudo useradd --system --home=/opt/cloudera-manager/cm-5.6.0/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
// --home points to the path of your cloudera-scm-server
6.3 Create the local data storage directory for the Cloudera Manager Server (master node)
[code]$ sudo mkdir /var/log/cloudera-scm-server
$ sudo chown cloudera-scm:cloudera-scm /var/log/cloudera-scm-server
6.4 Configure server_host on every Cloudera Manager Agent node (master and slaves)
[code]$ vim /opt/cloudera-manager/cm-5.6.0/etc/cloudera-scm-agent/config.ini
// Change server_host to the hostname of the master node
server_host=hadoop-1
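The same edit can be made without opening an editor. This is a sketch that assumes config.ini already contains a line starting with server_host=. The helper name set_server_host is hypothetical.

```shell
#!/bin/sh
# Sketch: rewrite the server_host line of an agent config.ini in place.
# set_server_host is a hypothetical helper; it assumes a "server_host=" line exists.
set_server_host() {
    config=$1; master=$2
    tmp="$config.tmp.$$"
    # Write to a temporary file first, then replace the original only on success.
    sed "s/^server_host=.*/server_host=$master/" "$config" > "$tmp" && mv "$tmp" "$config"
}

# Usage:
#   set_server_host /opt/cloudera-manager/cm-5.6.0/etc/cloudera-scm-agent/config.ini hadoop-1
```

If you edit the file on the master node before copying /opt/cloudera-manager to the slaves, the slaves receive the updated setting with the copy.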
6.5 Copy cloudera-manager to the same location (/opt) on each slave node
[code]$ scp -r /opt/cloudera-manager root@hadoop-2:/opt
$ scp -r /opt/cloudera-manager root@hadoop-3:/opt
6.6 Create the parcel directories
6.6.1 On the master node:
Create the parcel repository directory:
mkdir -p /opt/cloudera/parcel-repo
Put the CDH5 parcel files into /opt/cloudera/parcel-repo/ on the master node:
[code]CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel
CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel.sha1
manifest.json
Finally, rename CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel.sha1 to CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel.sha. Do not skip this step; otherwise the system will download the CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel.sha1 file again.
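Before renaming, it is cheap to confirm that the parcel downloaded intact. The sketch below assumes that the first whitespace-separated field of the .sha1 file is the hex digest. The helper name install_parcel_sha is hypothetical.

```shell
#!/bin/sh
# Sketch: verify a parcel against its .sha1 file, then rename .sha1 to .sha.
# install_parcel_sha is a hypothetical helper; it assumes the first field of the
# .sha1 file is the hex digest.
install_parcel_sha() {
    parcel=$1
    expected=$(awk '{print $1; exit}' "$parcel.sha1")
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    if [ "$expected" != "$actual" ]; then
        echo "checksum mismatch for $parcel" >&2
        return 1
    fi
    mv "$parcel.sha1" "$parcel.sha"
}

# Usage:
#   install_parcel_sha /opt/cloudera/parcel-repo/CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel
```

On a mismatch the helper leaves both files untouched and returns a non-zero status, so a corrupted download is caught before Cloudera Manager sees it.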
6.6.2 On the slave nodes: mkdir -p /opt/cloudera/parcels
6.7 Install the dependencies on every node
Use
apt-get install
to install the following packages:
[code]lsb-base
psmisc
bash
libsasl2-modules
libsasl2-modules-gssapi-mit
zlib1g
libxslt1.1
libsqlite3-0
libfuse2
fuse-utils or fuse
rpcbind
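The list can be turned into a single command that is easy to run on every node. In this sketch, fuse is chosen for the "fuse-utils or fuse" entry, which is an assumption about the package name on your release. The names CM_DEPS and install_deps_cmd are hypothetical.

```shell
#!/bin/sh
# Sketch: build one apt-get command from the dependency list.
# CM_DEPS and install_deps_cmd are hypothetical names; "fuse" is an assumed choice
# for the "fuse-utils or fuse" entry.
CM_DEPS="lsb-base psmisc bash libsasl2-modules libsasl2-modules-gssapi-mit zlib1g libxslt1.1 libsqlite3-0 libfuse2 fuse rpcbind"

install_deps_cmd() {
    # Word splitting of $CM_DEPS is intentional: one argument per package.
    echo apt-get install -y $CM_DEPS
}

# Usage from the master node:
#   eval "$(install_deps_cmd)"                      # local install
#   for h in hadoop-2 hadoop-3; do ssh "root@$h" "$(install_deps_cmd)"; done
```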
6.8 Start the server and the agents
On the master node:
[code]/opt/cloudera-manager/cm-5.6.0/etc/init.d/cloudera-scm-server start
/opt/cloudera-manager/cm-5.6.0/etc/init.d/cloudera-scm-agent start
On the slave nodes:
[code]/opt/cloudera-manager/cm-5.6.0/etc/init.d/cloudera-scm-agent start
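If startup fails, check the logs under /opt/cloudera-manager/cm-5.6.0/log.
If there are no errors, wait a few minutes and then open the Cloudera Manager Admin Console in a browser. My master node's IP is 192.168.10.236, so the URL is http://192.168.10.236:7180. The default username and password are both admin.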
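Instead of refreshing the browser by hand, the wait can be scripted. The sketch below is a generic retry loop. The helper name wait_until is hypothetical, and the curl probe in the usage comment is one possible check, not the only one.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or the attempts run out.
# wait_until is a hypothetical helper: wait_until ATTEMPTS DELAY_SECONDS COMMAND...
wait_until() {
    attempts=$1; delay=$2; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage: poll the Admin Console every 10 seconds, for up to 30 attempts:
#   wait_until 30 10 curl -fsS http://192.168.10.236:7180/ && echo "console is up"
```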
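7 Installing and configuring CDH5
On the New Hosts tab, use hadoop-[1-3]
to search for the hostnames that make up the cluster
Select the nodes to install on and click Continue
If the following parcel names appear, the local parcel setup is correct; just click Continue
Install all services
Accept the defaults
Database configuration
Use the defaults here as well
Wait for the installation
Installation succeeded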
8 A simple test
8.1 Run a MapReduce job on Hadoop
Run the following in a terminal on the master node:
[code]sudo -u hdfs hadoop jar \
/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
pi 10 100
The terminal prints the progress of the job:
[code]root@hadoop-1:~# sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 100
Number of Maps = 10
Samples per Map = 100
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
16/05/18 21:26:58 INFO client.RMProxy: Connecting to ResourceManager at hadoop-1/192.168.10.236:8032
16/05/18 21:26:58 INFO input.FileInputFormat: Total input paths to process : 10
16/05/18 21:26:58 INFO mapreduce.JobSubmitter: number of splits:10
16/05/18 21:26:58 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1463558073107_0001
16/05/18 21:26:58 INFO impl.YarnClientImpl: Submitted application application_1463558073107_0001
16/05/18 21:26:59 INFO mapreduce.Job: The url to track the job: http://hadoop-1:8088/proxy/application_1463558073107_0001/
16/05/18 21:26:59 INFO mapreduce.Job: Running job: job_1463558073107_0001
16/05/18 21:27:05 INFO mapreduce.Job: Job job_1463558073107_0001 running in uber mode : false
16/05/18 21:27:05 INFO mapreduce.Job: map 0% reduce 0%
16/05/18 21:27:10 INFO mapreduce.Job: map 10% reduce 0%
16/05/18 21:27:14 INFO mapreduce.Job: map 20% reduce 0%
16/05/18 21:27:15 INFO mapreduce.Job: map 40% reduce 0%
16/05/18 21:27:18 INFO mapreduce.Job: map 50% reduce 0%
16/05/18 21:27:20 INFO mapreduce.Job: map 70% reduce 0%
16/05/18 21:27:22 INFO mapreduce.Job: map 80% reduce 0%
16/05/18 21:27:24 INFO mapreduce.Job: map 100% reduce 0%
16/05/18 21:27:27 INFO mapreduce.Job: map 100% reduce 100%
16/05/18 21:27:27 INFO mapreduce.Job: Job job_1463558073107_0001 completed successfully
16/05/18 21:27:27 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=96
FILE: Number of bytes written=1272025
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2630
HDFS: Number of bytes written=215
HDFS: Number of read operations=43
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Launched map tasks=10
Launched reduce tasks=1
Data-local map tasks=10
Total time spent by all maps in occupied slots (ms)=37617
Total time spent by all reduces in occupied slots (ms)=2866
Total time spent by all map tasks (ms)=37617
Total time spent by all reduce tasks (ms)=2866
Total vcore-seconds taken by all map tasks=37617
Total vcore-seconds taken by all reduce tasks=2866
Total megabyte-seconds taken by all map tasks=38519808
Total megabyte-seconds taken by all reduce tasks=2934784
Map-Reduce Framework
Map input records=10
Map output records=20
Map output bytes=180
Map output materialized bytes=340
Input split bytes=1450
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=340
Reduce input records=20
Reduce output records=0
Spilled Records=40
Shuffled Maps =10
Failed Shuffles=0
Merged Map outputs=10
GC time elapsed (ms)=602
CPU time spent (ms)=12210
Physical memory (bytes) snapshot=4803805184
Virtual memory (bytes) snapshot=15372648448
Total committed heap usage (bytes)=4912578560
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1180
File Output Format Counters
Bytes Written=97
Job Finished in 29.482 seconds
Estimated value of Pi is 3.14800000000000000000
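For a scripted smoke test, the estimate can be pulled out of the captured output. The sketch below assumes the final line keeps the "Estimated value of Pi is ..." wording shown above. The helper name pi_estimate is hypothetical.

```shell
#!/bin/sh
# Sketch: extract the estimate from the job output read on stdin.
# pi_estimate is a hypothetical helper; it assumes the output wording shown above.
pi_estimate() {
    sed -n 's/^Estimated value of Pi is \([0-9.]*\).*$/\1/p'
}

# Usage:
#   sudo -u hdfs hadoop jar \
#       /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
#       pi 10 100 2>&1 | pi_estimate
```

If the job did not finish, the helper prints nothing, which a wrapper script can treat as a failure.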
In the web UI, the job is listed under
Clusters > Cluster 1 > Activities > YARN Applications
References:
/content/7762557.html
http://itindex.net/detail/51928-cloudera-manager-cdh5