Welcome to Linux教程網
Linux教程網

A SUSE kernel bug: update_group_power: cpu_power

Introduction:
Five of our production servers recently crashed one after another within three days. The cause turned out to be a kernel bug in SUSE 11 SP1.

Error messages:
The system messages log contained the following errors:

Jun 10 14:00:07 sharedbpro kernel: [  282.962529] update_group_power: cpu_power = 3925366004
Jun 10 14:00:07 sharedbpro kernel: [  282.962559] update_group_power: cpu_power = 3925397578
Jun 10 14:00:07 sharedbpro kernel: [  282.965803] update_group_power: cpu_power = 3928638515
Jun 10 14:00:07 sharedbpro kernel: [  282.966201] update_group_power: cpu_power = 3929034454
Jun 10 14:00:07 sharedbpro kernel: [  282.966369] update_group_power: cpu_power = 3929206061
Jun 10 14:00:07 sharedbpro kernel: [  282.966397] update_group_power: cpu_power = 3929235611
Jun 10 14:00:07 sharedbpro kernel: [  282.966507] update_group_power: cpu_power = 3929344069
Jun 10 14:00:07 sharedbpro kernel: [  282.966535] update_group_power: cpu_power = 3929373135
Jun 10 14:00:07 sharedbpro kernel: [  282.969804] update_group_power: cpu_power = 3932639635
Jun 10 14:00:07 sharedbpro kernel: [  282.970188] update_group_power: cpu_power = 3933021527
Jun 10 14:00:07 sharedbpro kernel: [  282.970353] update_group_power: cpu_power = 3933189985
Jun 10 14:00:07 sharedbpro kernel: [  282.970381] update_group_power: cpu_power = 3933218987
Jun 10 14:00:07 sharedbpro kernel: [  282.970490] update_group_power: cpu_power = 3933327365
Jun 10 14:00:07 sharedbpro kernel: [  282.970518] update_group_power: cpu_power = 3933356585
Jun 10 14:00:07 sharedbpro kernel: [  282.973789] update_group_power: cpu_power = 3936624686
Jun 10 14:00:07 sharedbpro kernel: [  282.974194] update_group_power: cpu_power = 3937026810
Jun 10 14:00:07 sharedbpro kernel: [  282.974360] update_group_power: cpu_power = 3937196506
Jun 10 14:00:07 sharedbpro kernel: [  282.974388] update_group_power: cpu_power = 3937226236
Jun 10 14:00:07 sharedbpro kernel: [  282.974496] update_group_power: cpu_power = 3937333589
Jun 10 14:00:07 sharedbpro kernel: [  282.974525] update_group_power: cpu_power = 3937363466
Jun 10 14:00:07 sharedbpro kernel: [  282.977789] update_group_power: cpu_power = 3940624812
Jun 10 14:00:07 sharedbpro kernel: [  282.978185] update_group_power: cpu_power = 3941017715
Jun 10 14:00:07 sharedbpro kernel: [  282.978351] update_group_power: cpu_power = 3941187161

Symptoms:
The system log fills with errors of the form "update_group_power: cpu_power = xxxxxxxx". They are logged continuously, usually for more than 10 minutes, and cover page after page of the log. After a certain amount of time the system goes down. I logged in through iLO right away and found the console showing a black screen, completely hung. I rebooted the system immediately, started the database, and confirmed that everything was back to normal.

Solution:
The vendor confirmed that this behavior is a bug. The fix is to update the kernel to a stable SP2 version or to the latest SP1 version, or to update the entire system to SP2.

Tip:
鹵肉 would like to stress one point here: as DBAs doing operations work, we should put the business first. Restore the application first, then look for the root cause. That said, any necessary information gathering that takes only a minute or two can still be done before the restart.
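The symptom described above, a continuous stream of update_group_power lines in the messages log, can be checked for with a short script. The following is a minimal sketch and not part of the original troubleshooting: the function name summarize and the regular expression are illustrative assumptions, and the pattern is modeled only on the log lines quoted in this article, so other syslog formats may need adjustments.

```python
import re

# Assumed line layout, taken from the log excerpt in this article:
# Jun 10 14:00:07 sharedbpro kernel: [  282.962529] update_group_power: cpu_power = 3925366004
PATTERN = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"\[\s*(?P<uptime>\d+\.\d+)\] update_group_power: cpu_power = (?P<value>\d+)"
)


def summarize(lines):
    """Summarize update_group_power messages found in an iterable of log lines.

    Returns (count, first_stamp, last_stamp, uptime_span_seconds), where the
    span is the difference between the kernel uptime values in brackets on
    the first and last matching lines.
    """
    hits = [m for m in (PATTERN.match(line) for line in lines) if m]
    if not hits:
        return 0, None, None, 0.0
    first, last = hits[0], hits[-1]
    span = float(last["uptime"]) - float(first["uptime"])
    return len(hits), first["stamp"], last["stamp"], span


# Example with two lines copied from the excerpt above plus one unrelated
# (made-up) line that should be ignored.
sample = [
    "Jun 10 14:00:07 sharedbpro kernel: [  282.962529] update_group_power: cpu_power = 3925366004",
    "Jun 10 14:00:07 sharedbpro kernel: [  282.978351] update_group_power: cpu_power = 3941187161",
    "Jun 10 14:00:08 sharedbpro sshd[123]: unrelated line",
]
count, first_stamp, last_stamp, span = summarize(sample)
print(f"{count} matching lines between {first_stamp} and {last_stamp}, "
      f"spanning {span:.6f} s of kernel uptime")
```

To scan a real log, pass an open file object instead of the sample list, for example summarize(open("/var/log/messages", errors="replace")); the path is an assumption and depends on the syslog configuration. A non-zero count on a SUSE 11 SP1 host suggests that the kernel update described in the Solution section should be scheduled.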
Copyright © Linux教程網 All Rights Reserved