
Monitoring Your Hard Disk: SMART, the Early-Warning System

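  Summary: What people fear most is not installing a system. It is having the hard disk die one day after all that work, with no backup (tape drives such as DAT or DLT are frighteningly expensive). How can you tell whether your disk will make it through the New Year, that is, what state it is really in? Better still would be advance warning that the disk is about to give out.

  This article was tested on FreeBSD and Debian.

  The solution: SMART

  SMART (SFF-8035i) is an industry standard established by hard disk manufacturers. Under this standard the disk stores a table of attributes covering its operating conditions, reliability, read and seek error rates, and so on. Every attribute has a one-byte normalized value (range 1 to 253) and a one-byte threshold. If the normalized value of any attribute drops to or below its threshold, the disk is about to say goodbye, or has at least exceeded its design limits. It is time to back up and prepare for the worst.

  The SFF-8035i standard evolved through ATA-3 and ATA-4 into ATA-5, which added an error log and a set of SMART self-test commands. SMART is available on both IDE and SCSI disks.

  I tried this on FreeBSD 5.2 and Debian, and both worked well. On OpenBSD you can use atactl directly; see man atactl. Other Linux systems should work as well.

  Installing smartmontools

  FreeBSD:

    # cd /usr/ports/sysutils/smartmontools
    # make install clean
    # cp /usr/local/etc/rc.d/smartd.sh.sample /usr/local/etc/rc.d/smartd.sh
    # cp /usr/local/etc/smartd.conf.sample /usr/local/etc/smartd.conf
    # chmod 555 /usr/local/etc/rc.d/smartd.sh

  Debian:

    # apt-get install smartmontools

  Configuration file on FreeBSD: /usr/local/etc/smartd.conf
  Configuration file on Debian: /etc/smartd.conf

  Note:

  Do not forget to edit the configuration file!
  On FreeBSD the first IDE disk is ad0 and the first SCSI disk is da0.
  On Debian the first IDE disk is /dev/hda and the first SCSI disk is /dev/sda.

  The examples below use FreeBSD, and my disk is IDE. If yours is SCSI, consult the official website.

  Start the monitoring daemon:

    # /usr/local/etc/rc.d/smartd.sh start

  First, check whether your disk supports SMART:

    bash-2.05b# smartctl -i /dev/ad0
    smartctl version 5.26 Copyright (C) 2002-3 Bruce Allen
    Home page is http://smartmontools.sourceforge.net/

    === START OF INFORMATION SECTION ===
    Device Model: IBM-DJSA-220
    Serial Number: 44K443Z0103
    Firmware Version: JS4OAC3A
    Device is: Not in smartctl database [for details use: -P showall]
    ATA Version is: 5
    ATA Standard is: ATA/ATAPI-5 T13 1321D revision 1
    Local Time is: Mon Dec 22 21:04:38 2003 CET
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

    The SMART RETURN STATUS return value (smartmontools -H option/Directive)
    can not be retrieved with this version of ATAng, please do not rely on this value

  Now look at the health test for my disk. If your self-assessment test result reports a failure rather than PASSED, the disk is dying: back up immediately!
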
    bash-2.05b# smartctl -Hc /dev/ad0
    smartctl version 5.26 Copyright (C) 2002-3 Bruce Allen
    Home page is http://smartmontools.sourceforge.net/

    The SMART RETURN STATUS return value (smartmontools -H option/Directive)
    can not be retrieved with this version of ATAng, please do not rely on
    this value
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    General SMART Values:
    Offline data collection status:  (0x00) Offline data collection activity
                                            was never started.
                                            Auto Offline Data Collection: Disabled.
    Self-test execution status:      (   0) The previous self-test routine completed
                                            without error or no self-test has ever
                                            been run.
    Total time to complete Offline
    data collection:                 ( 650) seconds.
    Offline data collection
    capabilities:                    (0x1b) SMART execute Offline immediate.
                                            Auto Offline data collection on/off support.
                                            Suspend Offline collection upon new
                                            command.
                                            Offline surface scan supported.
                                            Self-test supported.
                                            No Conveyance Self-test supported.
                                            No Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                                            power-saving mode.
                                            Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                                            No General Purpose Logging support.
    Short self-test routine
    recommended polling time:        (   2) minutes.
    Extended self-test routine
    recommended polling time:        (  29) minutes.

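  The "test result" line in the health output can also be checked from a script instead of by eye. The following is only a minimal sketch: the function names health_status and check_health are hypothetical helpers, not part of smartmontools, and it assumes the output contains the "SMART overall-health self-assessment test result:" line exactly as printed by the smartctl version used in this article.

```shell
#!/bin/sh
# Hypothetical helpers (not part of smartmontools).
# They read `smartctl -H` output on standard input.

# health_status: print whatever follows "test result:" on the
# overall-health line, for example PASSED.
health_status() {
    sed -n 's/^SMART overall-health self-assessment test result: *//p'
}

# check_health: succeed (exit status 0) only when the verdict is PASSED.
check_health() {
    [ "$(health_status)" = "PASSED" ]
}

# Example use, with the device name from this article (run as root):
#   smartctl -H /dev/ad0 | check_health || echo "WARNING: back up /dev/ad0 now"
```

  Parsing the text rather than relying on the exit status of smartctl is deliberate here, because the output above warns that the SMART RETURN STATUS value cannot be trusted on this FreeBSD version.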
  The attributes listed in the table below vary from one disk vendor to another. What matters most is understanding each column: when a normalized value (VALUE) is less than or equal to its threshold (THRESH), the WHEN_FAILED column reports it. My WHEN_FAILED column is empty (shown as -), so nothing is wrong. If WHEN_FAILED reports a failure, the disk has a problem. WORST is the lowest VALUE recorded so far.

    bash-2.05b# smartctl -A /dev/ad0
    smartctl version 5.26 Copyright (C) 2002-3 Bruce Allen
    Home page is http://smartmontools.sourceforge.net/

    The SMART RETURN STATUS return value (smartmontools -H option/Directive)
    can not be retrieved with this version of ATAng, please do not rely on
    this value
    === START OF READ SMART DATA SECTION ===
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always   -           0
      2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline  -           0
      3 Spin_Up_Time            0x0007   113   113   033    Pre-fail  Always   -           1
      4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always   -           985
      5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always   -           0
      7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always   -           0
      8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline  -           0
      9 Power_On_Hours          0x0012   097   097   000    Old_age   Always   -           1642
     10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always   -           0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always   -           914
    191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always   -           0
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always   -           8
    193 Load_Cycle_Count        0x0012   096   096   050    Old_age   Always   -           45262
    196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always   -           17
    197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always   -           1
    198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline  -           0
    199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always   -           0

  The following command prints the disk's error log. I have left the output out to save space.

    smartctl -l error /dev/ad0

  The following command prints the disk's self-test log, that is, the results of earlier self-tests:

    smartctl -l selftest /dev/ad0

  Abort a self-test that is currently running:

    smartctl -X /dev/ad0

  Recommendation: edit the configuration file smartd.conf. Its "-m" option is very useful, because it sends the warnings to you by mail.

  Editor's afterword:

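SMART can warn you, but it cannot make your backups for you. To make SMART really work for you, read the smartd documentation and the comments in smartd.conf carefully, and have the daemon notify you by mail when a critical value enters the danger zone. The commands shown above are manual checks to run once you start to suspect that something is wrong.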
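
For the manual checks, the rule from the attribute table (VALUE less than or equal to THRESH, or a non-empty WHEN_FAILED column) can be applied mechanically. The sketch below is only an illustration: failing_attributes is a hypothetical helper name, and it assumes the ten-column layout printed by smartctl -A in this article, which other versions or vendors may not follow exactly.

```shell
#!/bin/sh
# Hypothetical helper (not part of smartmontools).
# Reads `smartctl -A` output on standard input and prints one line for
# every attribute whose VALUE is at or below THRESH, or whose
# WHEN_FAILED column is not "-".
# Assumed columns:
#   ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
failing_attributes() {
    awk '
        # Attribute rows start with a numeric ID and have a 0x... flag.
        $1 ~ /^[0-9]+$/ && $3 ~ /^0x/ {
            if ($4 + 0 <= $6 + 0 || $9 != "-")
                printf "%s %s value=%d thresh=%d\n", $1, $2, $4, $6
        }
    '
}

# Example use, with the device name from this article (run as root):
#   smartctl -A /dev/ad0 | failing_attributes
```

For the healthy disk shown in this article the helper would print nothing, since every VALUE is above its THRESH and every WHEN_FAILED entry is "-".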

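As a starting point for the mail notification, a smartd.conf entry might look like the fragment below. It is a sketch, not the author's configuration: the address admin@example.com is a placeholder, and the device name is the FreeBSD IDE disk used in this article (use /dev/hda for the Debian example). The -a, -m and -M test directives are described in the smartd.conf documentation.

```conf
# /usr/local/etc/smartd.conf (FreeBSD) or /etc/smartd.conf (Debian)
#
# Monitor the first IDE disk with the default set of checks (-a)
# and mail any warning to the given address (placeholder).
/dev/ad0 -a -m admin@example.com

# To verify that mail delivery works, -M test can be added so that
# smartd sends one test message when it starts:
# /dev/ad0 -a -m admin@example.com -M test

# Note: if the sample file contains an active DEVICESCAN line above
# these entries, smartd ignores everything after it, so comment it out.
```

After changing the file, restart the daemon with the start script shown earlier so that smartd rereads its configuration.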



Copyright © Linux教程網 All Rights Reserved