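I. Configuring the hardware and the cluster

1. Minimum hardware requirements

    At least 400 MB of space in /tmp
    At least 512 MB of physical memory
    Swap space of 3 times physical memory (2 times is enough when physical memory exceeds 1 GB)

There is no need to be frugal with local disk space. The data files live on the storage array, so make the local disk partitions as large as you reasonably can.

You also need fibre-optic modules, a fibre-optic switch and fibre-optic cables. Fibre is recommended between the hosts and the array: with gigabit Category 6 cable the maximum speed only reaches a little over 30 MB/s, and since the array delivers a read speed of nearly 100 MB/s, gigabit cabling would make network transfer the bottleneck.

2. Required software

I am using Red Hat 3.0 here. 2.1 also works, but 3.0 is recommended because its kernel is newer. I do not know whether 9.2.0.4 RAC can be installed on a 2.6 kernel; I will try that later.

Also check whether the rsh server package is installed:

    rpm -q rsh-server
    rsh-server-0.17-17

If it is not, install rsh. It is required for creating RAC.

3. Patches

Bring the operating system patches as close to the latest level as you can, especially on 2.1.

4. Install the storage array

I am using a NetApp array here: a volume is created on the array and mounted over NFS on the Linux client. BTW, NetApp management is very convenient. The array is configured with the IP address 10.0.29.152. You can of course use an EMC or other array, but then you are building RAC on raw devices, which is outside the scope of this article.

5. Set the IP addresses of the two nodes and edit /etc/hosts

    10.0.29.150    wanghai1
    192.168.0.150  wanghai1-eth1
    10.0.29.152    FAS250
    10.0.29.151    wanghai2
    192.168.0.151  wanghai2-eth1

6. Tune the kernel network parameters

Because of the RAC Cache Fusion mechanism, the kernel network parameters must be adjusted.

    Parameter                        Meaning                                                    Value
    /proc/sys/net/core/rmem_default  The default setting in bytes of the socket receive buffer  262144
    /proc/sys/net/core/rmem_max      The maximum socket receive buffer size in bytes            262144
    /proc/sys/net/core/wmem_default  The default setting in bytes of the socket send buffer     262144
    /proc/sys/net/core/wmem_max      The maximum socket send buffer size in bytes               262144

Set each parameter like this (as root, once per parameter):

    $ echo 262144 > /proc/sys/net/core/rmem_default

7. Configure /etc/fstab to mount the NFS filesystem

These are the NetApp NFS mount parameters:

    10.0.29.152:/vol/vol1/fas250 /netapp nfs rw,hard,nointr,tcp,noac,vers=3,timeo=600,rsize=32768,wsize=32768

8. Configure the rsh, rlogin and rcp services

Use /usr/sbin/ntsysv to select rsh, rlogin and rcp, then use /sbin/chkconfig --list | grep on to see whether rsh and the other services are enabled. If they are not running, run /sbin/service xinetd start.

Edit /home/oracle/.rhosts:

    wanghai1 oracle
    wanghai2 oracle
    wanghai1-eth1 oracle
    wanghai2-eth1 oracle

Then test rsh:

    [oracle@wanghai2 oracle]$ rsh wanghai1 pwd
    /home/oracle
    [oracle@wanghai1 oracle]$ rsh wanghai2 pwd
    /home/oracle

9. Check that the nfs and nfslock services are enabled

If nfslock is not running, starting the instance fails with an error saying the control file cannot be locked. Also, if the iptables service is running, turn it off: the firewall causes trouble for rsh. If you can configure iptables to let rsh through, that is fine too.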
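Create the NFS mount point:

    mkdir /netapp

10. Create the shared quorum files on NFS

These files record which of the two nodes are active.

    touch /netapp/SharedConfigFile
    touch /netapp/CmDiskFile

11. Check that the hangcheck_timer module is loaded

Kernels from 2.4.20 onward should include hangcheck. For a 2.4.9 kernel you can download the patch from MetaLink. Use lsmod to see whether hangcheck is loaded; if it is not listed, load it with insmod.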
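The checks from steps 6, 10 and 11 are easy to forget on one of the two nodes, so a small pre-flight script can help. The following is only a sketch under stated assumptions: the function names and the PROC_ROOT and NFS_MOUNT override variables are hypothetical and exist only for illustration and testing. It reads the same /proc files that the echo command and lsmod use, and it accepts the module name with either a hyphen or an underscore, since that may differ between kernels.

```shell
#!/bin/sh
# Pre-flight check for the node preparation steps above.
# A sketch, not an Oracle tool. PROC_ROOT and NFS_MOUNT are hypothetical
# overrides introduced here so the checks can be pointed at test directories.
PROC_ROOT="${PROC_ROOT:-/proc}"
NFS_MOUNT="${NFS_MOUNT:-/netapp}"
WANT=262144   # socket buffer size from step 6

# Step 6: report whether each socket buffer parameter holds the expected value.
check_buffers() {
    for p in rmem_default rmem_max wmem_default wmem_max; do
        f="$PROC_ROOT/sys/net/core/$p"
        if [ -r "$f" ] && [ "$(cat "$f")" = "$WANT" ]; then
            echo "OK $p=$WANT"
        else
            echo "FAIL $p (expected $WANT)"
        fi
    done
}

# Step 11: report whether the hangcheck timer module appears in the module list.
check_hangcheck() {
    if grep -q '^hangcheck[-_]timer ' "$PROC_ROOT/modules" 2>/dev/null; then
        echo "OK hangcheck-timer module loaded"
    else
        echo "FAIL hangcheck-timer module not loaded"
    fi
}

# Step 10: report whether the shared quorum files exist on the NFS mount.
check_quorum_files() {
    for q in SharedConfigFile CmDiskFile; do
        if [ -f "$NFS_MOUNT/$q" ]; then
            echo "OK $NFS_MOUNT/$q"
        else
            echo "FAIL $NFS_MOUNT/$q missing"
        fi
    done
}

check_buffers
check_hangcheck
check_quorum_files
```

Run it on both nodes before starting the OCM installation. Any FAIL line points back to the step that still needs attention; the script only reports and changes nothing.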
II. Installing OCM

1. Create the oinstall group and the oracle user, create the Oracle home directory, and create the profile file

Creating Oracle User Accounts

    su - root
    groupadd oinstall    # group owner of Oracle files
    useradd -c "Oracle software owner" -g oinstall oracle
    passwd oracle

Creating Oracle Directories

In this example, make sure that the /opt filesystem is large enough, see Oracle Disk Space for more information. If /opt is not on a separate filesystem, then make sure the root filesystem "/" has enough space.

    su - root
    mkdir /opt/oracle
    mkdir /opt/oracle/product
    mkdir /opt/oracle/product/9.2
    chown -R oracle.oinstall /opt/oracle
    mkdir /var/opt/oracle
    chown oracle.oinstall /var/opt/oracle
    chmod 755 /var/opt/oracle

Setting Oracle Environments

Set the following Oracle environment variables before you start runInstaller. As the oracle user execute the following commands:

    # Set the LD_ASSUME_KERNEL environment variable only for Red Hat 9 and
    # for Red Hat Enterprise Linux Advanced Server 3 (RHEL AS 3) !!
    # Use the "Linuxthreads with floating stacks" implementation instead of NPTL:
    export LD_ASSUME_KERNEL=2.4.1

    # Oracle Environment
    export ORACLE_BASE=/opt/oracle
    export ORACLE_HOME=/opt/oracle/product/9.2
    export ORACLE_SID=test1
    export ORACLE_TERM=xterm
    # export TNS_ADMIN=    # Set if sqlnet.ora, tnsnames.ora, etc. are not in $ORACLE_HOME/network/admin
    export NLS_LANG=AMERICAN;
    export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
    LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib
    LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
    export LD_LIBRARY_PATH

    # Set shell search paths
    export PATH=$PATH:$ORACLE_HOME/bin

I successfully installed Oracle9iR2 without setting the following CLASSPATH environment variable:

    # CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib
    # CLASSPATH=$CLASSPATH:$ORACLE_HOME/network/jlib
    # export CLASSPATH

2. Run runInstaller and choose to install 9.2.0.1. Deselect all components and install only the Java environment and the Oracle Universal Installer.
Exit, run runInstaller again and choose to install OCM. Exit again, run runInstaller once more, choose the patch set, and upgrade OCM to 9.2.0.4. (Exiting and restarting runInstaller each time is a precaution against Oracle Universal Installer errors.)

3. Edit $ORACLE_HOME/oracm/admin/cmcfg.ora

Comment out every line that contains watchdog, because 9.2.0.4 RAC already uses hangcheck to monitor the nodes. Add the line KernelModuleName=hangcheck-timer and set MissCount=210.

cmcfg.ora on node 1:

    HeartBeat=15000
    ClusterName=Oracle Cluster Manager, version 9i
    PollInterval=1000
    MissCount=210
    PrivateNodeNames=wanghai1-eth1 wanghai2-eth1
    PublicNodeNames=wanghai1 wanghai2
    ServicePort=9998
    #WatchdogSafetyMargin=5000
    #WatchdogTimerMargin=60000
    CmDiskFile=/netapp/CmDiskFile
    HostName=wanghai1-eth1
    KernelModuleName=hangcheck-timer

cmcfg.ora on node 2:

    HeartBeat=15000
    ClusterName=Oracle Cluster Manager, version 9i
    PollInterval=1000
    MissCount=210
    PrivateNodeNames=wanghai1-eth1 wanghai2-eth1
    PublicNodeNames=wanghai1 wanghai2
    ServicePort=9998
    #WatchdogSafetyMargin=5000
    #WatchdogTimerMargin=60000
    CmDiskFile=/netapp/CmDiskFile
    HostName=wanghai2-eth1
    KernelModuleName=hangcheck-timer

Comment out the line containing watchdogd in $ORACLE_HOME/oracm/admin/ocmargs.ora:

    more $ORACLE_HOME/oracm/admin/ocmargs.ora
    # Sample configuration file $ORACLE_HOME/oracm/admin/ocmargs.ora
    #watchdogd
    oracm
    norestart 1800

Comment out the following lines in $ORACLE_HOME/oracm/bin/ocmstart.sh:

    # watchdogd's default log file
    # WATCHDOGD_LOG_FILE=$ORACLE_HOME/oracm/log/wdd.log
    # watchdogd's default backup file
    # WATCHDOGD_BAK_FILE=$ORACLE_HOME/oracm/log/wdd.log.bak
    # Get arguments
    # watchdogd_args=`grep '^watchdogd' $OCMARGS_FILE |
    # sed -e 's+^watchdogd *++'`
    # Check watchdogd's existance
    # if watchdogd status | grep 'Watchdog daemon active' >/dev/null
    # then
    # echo 'ocmstart.sh: Error: watchdogd is already running'
    # exit 1
    # fi
    # Backup the old watchdogd log
    # if test -r