
Linux Interrupt Handling: The Timer Interrupt (Part 1)

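I: Introduction

The clock is the heartbeat of the whole operating system: it provides the basis for time-slice scheduling of processes and for timed events. Many user-space operations also depend on the clock, for example select, poll, and make. The operating system manages two kinds of time. The first is the current time, the wall-clock time of everyday life. It is normally kept in CMOS, and a dedicated chip on the motherboard provides its timing source. The second is relative time, such as the system uptime. For the computer, relative time is clearly more important than the current time.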

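II: Clock-related hardware

1): Real-time clock (RTC)

This clock is independent of the CPU and of all other chips, and it keeps running even when the PC is powered off. A separate chip does the timekeeping and stores the clock value in CMOS. The RTC can issue periodic interrupts on IRQ8 at frequencies between 2 Hz and 8192 Hz. Linux, however, uses the RTC only to obtain the current time.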

2): Time stamp counter (TSC)

The CPU has a 64-bit time stamp register whose content is incremented by 1 on every clock signal.

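3): Programmable interval timer (PIT)

This device periodically issues a timer interrupt, and the interval between interrupts is programmable. In Linux the interrupt frequency is given by HZ, so one interrupt arrives every 1/HZ seconds; this interval is called a tick.

4): CPU local timer

The processor's local APIC provides one more timing device. The CPU local timer can also generate interrupts either once or periodically. It differs from the PIT described above in the following ways:

The local APIC timer counter is 32 bits wide, while the PIT counter is 16 bits wide. The local APIC timer can therefore generate interrupts at much lower frequencies.

The local APIC delivers its interrupt only to its own CPU, while an interrupt from the PIT can be handled by any CPU.

The APIC timer is driven by the bus clock signal, while the PIT has its own internal clock oscillator.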
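The effect of counter width can be made concrete with a little arithmetic. The sketch below assumes the classic PC PIT input clock of 1,193,182 Hz; that constant and the helper names pit_divisor() and lowest_freq_microhz() are illustrative and do not come from the kernel code discussed in this article.

```c
#include <stdint.h>

/* Assumption: the classic PC PIT input clock, in Hz. */
#define PIT_INPUT_HZ 1193182UL

/* Divisor to load into the PIT so that it fires about `hz` times per
 * second, rounded to the nearest integer. */
static unsigned long pit_divisor(unsigned long hz)
{
    return (PIT_INPUT_HZ + hz / 2) / hz;
}

/* Lowest interrupt frequency, in microhertz, that a down-counter of
 * `bits` width can produce from an input clock of `input_hz`:
 * the input clock divided by the largest possible count, 2^bits. */
static uint64_t lowest_freq_microhz(uint64_t input_hz, unsigned bits)
{
    return (input_hz * 1000000ULL) >> bits;
}
```

With HZ = 100 the divisor is 11932, which fits in 16 bits. A 16-bit counter cannot go below roughly 18.2 Hz at this input clock, whereas a 32-bit counter fed the same clock could go about 65536 times lower.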

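5): High Precision Event Timer (HPET)

Linux 2.6 added support for the HPET, a newer timer chip developed jointly by Microsoft and Intel. The device contains a set of counters. Each counter has its own clock signal and is incremented by 1 whenever that signal arrives.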

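In practice, Intel multiprocessor systems and uniprocessor systems behave differently:

On a uniprocessor system, all timekeeping activity is triggered by the timer interrupt generated by the programmable interval timer (PIT).

On a multiprocessor system, general activities are triggered by the PIT interrupt, while all CPU-specific activities are triggered by the local APIC timer.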

III: Timer interrupt code analysis

time_init() is the clock initialization function. It is called from asmlinkage void __init start_kernel(). The code is as follows:

// timer interrupt initialization
void __init time_init(void)
{
    // if HPET support is configured
#ifdef CONFIG_HPET_TIMER
    if (is_hpet_capable()) {
        /*
         * HPET initialization needs to do memory-mapped io. So, let
         * us do a late initialization after mem_init().
         */
        late_time_init = hpet_time_init;
        return;
    }
#endif
    // read the real time from CMOS
    xtime.tv_sec = get_cmos_time();
    // initialize wall_to_monotonic
    wall_to_monotonic.tv_sec = -xtime.tv_sec;
    xtime.tv_nsec = (INITIAL_JIFFIES % HZ) * (NSEC_PER_SEC / HZ);
    wall_to_monotonic.tv_nsec = -xtime.tv_nsec;
    // select a suitable timer
    cur_timer = select_timer();
    printk(KERN_INFO "Using %s for high-res timesource\n",cur_timer->name);
    // register the timer interrupt handler
    time_init_hook();
}

This function reads the current time from CMOS and selects a suitable timer for refining the time precision.
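The relationship between xtime and wall_to_monotonic set up in time_init() can be illustrated in user space. The following is a minimal sketch, not kernel code: struct toy_timespec, toy_boot_offset() and toy_monotonic() are hypothetical names, and it assumes that monotonic time is obtained by adding the offset to the wall time and normalizing the nanosecond field.

```c
#define NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for the kernel's struct timespec. */
struct toy_timespec {
    long tv_sec;
    long tv_nsec;
};

/* Mirrors the initialization in time_init(): the offset is the
 * negated wall time at boot. */
static struct toy_timespec toy_boot_offset(struct toy_timespec wall_at_boot)
{
    struct toy_timespec off = { -wall_at_boot.tv_sec, -wall_at_boot.tv_nsec };
    return off;
}

/* monotonic = wall + offset, normalized so 0 <= tv_nsec < NSEC_PER_SEC. */
static struct toy_timespec toy_monotonic(struct toy_timespec wall,
                                         struct toy_timespec off)
{
    struct toy_timespec m = { wall.tv_sec + off.tv_sec,
                              wall.tv_nsec + off.tv_nsec };

    while (m.tv_nsec >= NSEC_PER_SEC) {
        m.tv_nsec -= NSEC_PER_SEC;
        m.tv_sec++;
    }
    while (m.tv_nsec < 0) {
        m.tv_nsec += NSEC_PER_SEC;
        m.tv_sec--;
    }
    return m;
}
```

At boot the two values cancel, so monotonic time starts at zero. Afterwards it grows with xtime, which is why relative time can be derived from the wall clock plus a fixed offset.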

Moving on to time_init_hook():

void __init time_init_hook(void)
{
    // register the interrupt handler
    setup_irq(0, &irq0);
}

irq0 is defined as follows:

static struct irqaction irq0 = { timer_interrupt, SA_INTERRUPT, CPU_MASK_NONE, "timer", NULL, NULL};

The corresponding interrupt handler is timer_interrupt():

irqreturn_t timer_interrupt(int irq, void *dev_id, struct pt_regs *regs)
{
    // this function modifies xtime, so take the lock first to avoid races between processors
    write_seqlock(&xtime_lock);
    // record the precise time of this timer interrupt; used for correcting the clock
    cur_timer->mark_offset();
    do_timer_interrupt(irq, NULL, regs);
    // unlock
    write_sequnlock(&xtime_lock);
    return IRQ_HANDLED;
}

The core handler is do_timer_interrupt():

static inline void do_timer_interrupt(int irq, void *dev_id,
                                      struct pt_regs *regs)
{
#ifdef CONFIG_X86_IO_APIC
    if (timer_ack) {
        spin_lock(&i8259A_lock);
        outb(0x0c, PIC_MASTER_OCW3);
        /* Ack the IRQ; AEOI will end it automatically. */
        inb(PIC_MASTER_POLL);
        spin_unlock(&i8259A_lock);
    }
#endif
    do_timer_interrupt_hook(regs);
    // if the clock is synchronized, write the current time back to CMOS
    // at intervals (more than 660 seconds after the last update)
    if ((time_status & STA_UNSYNC) == 0 &&
        xtime.tv_sec > last_rtc_update + 660 &&
        (xtime.tv_nsec / 1000)
            >= USEC_AFTER - ((unsigned) TICK_SIZE) / 2 &&
        (xtime.tv_nsec / 1000)
            <= USEC_BEFORE + ((unsigned) TICK_SIZE) / 2) {
        /* horrible...FIXME */
        if (efi_enabled) {
            if (efi_set_rtc_mmss(xtime.tv_sec) == 0)
                last_rtc_update = xtime.tv_sec;
            else
                last_rtc_update = xtime.tv_sec - 600;
        } else if (set_rtc_mmss(xtime.tv_sec) == 0)
            last_rtc_update = xtime.tv_sec;
        else
            last_rtc_update = xtime.tv_sec - 600; /* do it again in 60 s */
    }
#ifdef CONFIG_MCA
    if( MCA_bus ) {
        /* The PS/2 uses level-triggered interrupts.  You can't
           turn them off, nor would you want to (any attempt to
           enable edge-triggered interrupts usually gets intercepted by a
           special hardware circuit).  Hence we have to acknowledge
           the timer interrupt.  Through some incredibly stupid
           design idea, the reset for IRQ 0 is done by setting the
           high bit of the PPI port B (0x61).  Note that some PS/2s,
           notably the 55SX, work fine if this is removed.  */
        irq = inb_p( 0x61 );    /* read the current state */
        outb_p( irq|0x80, 0x61 );   /* reset the IRQ */
    }
#endif
}
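timer_interrupt() protects xtime with a sequence lock. The idea behind such a lock can be sketched in user space: the writer bumps a counter before and after each update, and a reader retries if the counter was odd or changed while it was copying the data. This is a simplified sketch under stated assumptions, not the kernel implementation. All toy_* names are hypothetical, the kernel's seqlock additionally serializes writers with a spinlock, and a production version needs stricter memory-ordering and data-race handling than shown here.

```c
#include <stdatomic.h>

/* Hypothetical miniature seqlock: just the sequence counter. */
struct toy_seqlock {
    atomic_uint sequence;
};

struct toy_xtime {
    long tv_sec;
    long tv_nsec;
};

static struct toy_seqlock toy_lock;   /* static storage: counter starts at 0 (even) */
static struct toy_xtime toy_time;

/* Writer side: an odd counter means an update is in progress. */
static void toy_write_begin(struct toy_seqlock *sl)
{
    atomic_fetch_add(&sl->sequence, 1);
}

static void toy_write_end(struct toy_seqlock *sl)
{
    atomic_fetch_add(&sl->sequence, 1);
}

/* Reader side: remember the counter, copy, then check it again. */
static unsigned toy_read_begin(struct toy_seqlock *sl)
{
    return atomic_load(&sl->sequence);
}

static int toy_read_retry(struct toy_seqlock *sl, unsigned start)
{
    return (start & 1u) || atomic_load(&sl->sequence) != start;
}

static void toy_set_time(long sec, long nsec)
{
    toy_write_begin(&toy_lock);
    toy_time.tv_sec = sec;
    toy_time.tv_nsec = nsec;
    toy_write_end(&toy_lock);
}

static struct toy_xtime toy_get_time(void)
{
    struct toy_xtime snap;
    unsigned start;

    do {
        start = toy_read_begin(&toy_lock);
        snap = toy_time;
    } while (toy_read_retry(&toy_lock, start));
    return snap;
}
```

The design favors the writer: the timer interrupt never waits for readers, and readers simply repeat their copy if an update overlapped it.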

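Ignoring the conditionally compiled parts, we move on to do_timer_interrupt_hook():

static inline void do_timer_interrupt_hook(struct pt_regs *regs)
{
    do_timer(regs);
/*
 * In the SMP case we use the local APIC timer interrupt to do the
 * profiling, except when we simulate SMP mode on a uniprocessor
 * system, in that case we have to call the local interrupt handler.
 */
#ifndef CONFIG_X86_LOCAL_APIC
    // update the kernel code profiler. On every timer interrupt, take the
    // instruction pointer (EIP) saved before the interrupt, which gives the
    // address of the function that was running. This makes it possible to
    // find the kernel code regions that run the longest, so that kernel
    // developers can optimize them
    profile_tick(CPU_PROFILING, regs);
#else
    if (!using_apic_timer)
        smp_local_timer_interrupt(regs);
#endif
}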
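The profiling step mentioned in the comment can be pictured as a histogram over code addresses: each tick, the interrupted program counter is mapped to a bucket and that bucket's hit count is incremented. The sketch below is a user-space illustration under that assumption. struct toy_profile, toy_profile_tick() and toy_hottest_bucket() are hypothetical names, and the bucket size and count are arbitrary.

```c
#include <stddef.h>

#define TOY_PROF_SHIFT   4     /* assumption: one bucket per 16 bytes of code */
#define TOY_PROF_BUCKETS 64

struct toy_profile {
    unsigned long text_start;               /* address where the code region begins */
    unsigned int hits[TOY_PROF_BUCKETS];    /* one counter per bucket */
};

/* Called once per tick with the program counter that was interrupted. */
static void toy_profile_tick(struct toy_profile *p, unsigned long pc)
{
    size_t bucket;

    if (pc < p->text_start)
        return;                              /* outside the profiled region */
    bucket = (pc - p->text_start) >> TOY_PROF_SHIFT;
    if (bucket >= TOY_PROF_BUCKETS)
        bucket = TOY_PROF_BUCKETS - 1;       /* clamp overflow into the last bucket */
    p->hits[bucket]++;
}

/* Index of the bucket that collected the most ticks. */
static size_t toy_hottest_bucket(const struct toy_profile *p)
{
    size_t best = 0, i;

    for (i = 1; i < TOY_PROF_BUCKETS; i++)
        if (p->hits[i] > p->hits[best])
            best = i;
    return best;
}
```

Because a sample is taken on every tick, code that runs longer collects proportionally more hits, which is what makes the histogram useful for finding hot spots.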

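Several important operations happen here. First, do_timer():

void do_timer(struct pt_regs *regs)
{
    // update the jiffies count. At link time jiffies_64 and jiffies actually refer to the same memory
    jiffies_64++;
#ifndef CONFIG_SMP
    /* SMP process accounting uses the local APIC timer */
    // update the clock-related information of the currently running process
    update_process_times(user_mode(regs));
#endif
    // update the current time, i.e. xtime
    update_times();
}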

The code of update_process_times() is as follows:

void update_process_times(int user_tick)
{
    struct task_struct *p = current;
    int cpu = smp_processor_id(), system = user_tick ^ 1;

    update_one_process(p, user_tick, system, cpu);
    // raise the timer softirq
    run_local_timers();
    // decrease the time slice. This function touches too much to cover here;
    // it is left for the analysis of process scheduling
    scheduler_tick(user_tick, system);
}

First, update_one_process():

static void update_one_process(struct task_struct *p, unsigned long user,
                               unsigned long system, int cpu)
{
    do_process_times(p, user, system);
    // check the process's timers
    do_it_virt(p, user);
    do_it_prof(p);
}

A brief introduction to do_it_virt() and do_it_prof():

These two functions mainly check whether the process's user-space interval timers have expired. The process descriptor (task_struct) has the relevant fields:

struct task_struct {
    ......
    unsigned long it_real_value, it_prof_value, it_virt_value;
    unsigned long it_real_incr, it_prof_incr, it_virt_incr;
    struct timer_list real_timer;
    ......
};

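(1) Real interval timer (ITIMER_REAL): once started, this timer decrements its interval counter by 1 on every clock tick, whether or not the process is running. When the counter reaches 0, the kernel sends SIGALRM to the process. The task_struct member it_real_incr holds the initial value of the counter, and it_real_value holds its current value. Because this interval timer is essentially the same thing as a kernel timer, Linux implements ITIMER_REAL through real_timer, a kernel dynamic timer embedded in task_struct.

(2) Virtual interval timer (ITIMER_VIRTUAL): also called the process's user-mode interval timer. The task_struct members it_virt_incr and it_virt_value hold the initial and current values of its counter, both measured in clock ticks. Once the timer is started, a tick decrements it_virt_value by 1 only if the process was running in user mode. When the counter reaches 0, the kernel sends SIGVTALRM (the virtual alarm signal) to the process and resets it_virt_value to the initial value it_virt_incr. See the implementation of do_it_virt() for details.

(3) PROF interval timer (ITIMER_PROF): the task_struct members it_prof_value and it_prof_incr hold the current and initial values of its counter (both in clock ticks). Once the timer is started, every tick during which the process is running decrements it_prof_value by 1, regardless of whether it is executing in user mode or kernel mode. When the counter reaches 0, the kernel sends SIGPROF to the process and resets it_prof_value to the initial value it_prof_incr.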
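The difference between the virtual and PROF timers can be simulated tick by tick. This is a minimal user-space sketch, assuming one call per tick for the running process. struct toy_itimer, struct toy_task and the two functions are hypothetical names, and the signal counters stand in for actually sending SIGVTALRM and SIGPROF.

```c
/* Hypothetical model of one interval timer: current value and reload value. */
struct toy_itimer {
    unsigned long value;   /* ticks left; 0 means the timer is disarmed */
    unsigned long incr;    /* value reloaded after the timer fires */
};

struct toy_task {
    struct toy_itimer virt;    /* models it_virt_value / it_virt_incr */
    struct toy_itimer prof;    /* models it_prof_value / it_prof_incr */
    unsigned sigvtalrm;        /* number of SIGVTALRM "sent" */
    unsigned sigprof;          /* number of SIGPROF "sent" */
};

/* Advance a timer by `ticks` (0 or 1 here); returns 1 if it fired. */
static int toy_itimer_tick(struct toy_itimer *t, unsigned long ticks)
{
    if (!t->value)
        return 0;                  /* disarmed */
    if (ticks >= t->value) {
        t->value = t->incr;        /* reload; an incr of 0 makes it one-shot */
        return 1;
    }
    t->value -= ticks;
    return 0;
}

/* Account one tick to the running task. */
static void toy_account_tick(struct toy_task *p, int user_tick)
{
    /* The virtual timer only advances for ticks spent in user mode. */
    if (toy_itimer_tick(&p->virt, user_tick ? 1 : 0))
        p->sigvtalrm++;
    /* The PROF timer advances on every tick the task is running. */
    if (toy_itimer_tick(&p->prof, 1))
        p->sigprof++;
}
```

If both timers are armed with 3 ticks and the task alternates between user and kernel mode, the PROF timer fires on the third tick while the virtual timer needs five ticks to accumulate three in user mode.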

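do_process_times():

static inline void do_process_times(struct task_struct *p,
        unsigned long user, unsigned long system)
{
    unsigned long psecs;

    // p->utime: time spent in user space
    psecs = (p->utime += user);
    // p->stime: time spent in kernel space
    psecs += (p->stime += system);
    // if the CPU time used has reached the soft limit (RLIMIT_CPU)
    if (psecs / HZ >= p->rlim[RLIMIT_CPU].rlim_cur) {
        /* Send SIGXCPU every second.. */
        // send one SIGXCPU per second
        if (!(psecs % HZ))
            send_sig(SIGXCPU, p, 1);
        /* and SIGKILL when we go over max.. */
        // send SIGKILL
        if (psecs / HZ >= p->rlim[RLIMIT_CPU].rlim_max)
            send_sig(SIGKILL, p, 1);
    }
}

This function checks whether the CPU time consumed by the current process has reached its RLIMIT_CPU limit. Once the soft limit is reached, the process receives SIGXCPU once per second, and once the hard limit is reached it receives SIGKILL.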
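The decision logic in do_process_times() is easy to check in isolation. The sketch below restates it as a pure function in user space. The TOY_SIG_* flags and toy_cpu_limit_signals() are hypothetical names used only for illustration, and the limits are passed in seconds as in the kernel code above.

```c
/* Hypothetical result flags standing in for the real signals. */
enum {
    TOY_SIG_NONE = 0,
    TOY_SIG_XCPU = 1,
    TOY_SIG_KILL = 2
};

/* Which signals would be sent for a process that has consumed
 * `ticks_used` ticks of CPU time, with `hz` ticks per second and
 * soft/hard limits given in seconds. */
static int toy_cpu_limit_signals(unsigned long ticks_used, unsigned long hz,
                                 unsigned long soft_secs,
                                 unsigned long hard_secs)
{
    int sigs = TOY_SIG_NONE;

    if (ticks_used / hz >= soft_secs) {
        /* Past the soft limit: SIGXCPU on each whole-second boundary. */
        if (ticks_used % hz == 0)
            sigs |= TOY_SIG_XCPU;
        /* Past the hard limit: SIGKILL. */
        if (ticks_used / hz >= hard_secs)
            sigs |= TOY_SIG_KILL;
    }
    return sigs;
}
```

With HZ = 100, a soft limit of 2 seconds and a hard limit of 4 seconds, nothing happens at tick 199, SIGXCPU is sent at ticks 200 and 300, and both signals are due at tick 400.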

do_it_virt() and do_it_prof() check whether the process's timers have expired and, if so, send the corresponding signal to the process:

static inline void do_it_virt(struct task_struct * p, unsigned long ticks)
{
    unsigned long it_virt = p->it_virt_value;

    if (it_virt) {
        it_virt -= ticks;
        if (!it_virt) {
            it_virt = p->it_virt_incr;
            // send SIGVTALRM
            send_sig(SIGVTALRM, p, 1);
        }
        p->it_virt_value = it_virt;
    }
}

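Back to the other functions called by update_process_times():

run_local_timers():

void run_local_timers(void)
{
    raise_softirq(TIMER_SOFTIRQ);
}

This raises the timer softirq. raise_softirq() was already analyzed in the article on IRQ handling, so it is not repeated here.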

There is one function in do_timer() that we have not looked at yet, update_times():

static inline void update_times(void)
{
    unsigned long ticks;

    // wall_jiffies: the jiffies value at the last update
    ticks = jiffies - wall_jiffies;
    if (ticks) {
        wall_jiffies += ticks;
        // update xtime
        update_wall_time(ticks);
    }
    // account for the number of TASK_RUNNING and TASK_UNINTERRUPTIBLE processes
    calc_load(ticks);
}
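The article does not show calc_load() itself. As background, the process count it gathers feeds the load average, which kernels of this era compute as a fixed-point exponential moving average. The sketch below illustrates that calculation in user space. The constants FSHIFT, FIXED_1 and EXP_1 are quoted from memory of the 2.6-era headers and should be treated as assumptions, and toy_calc_load() is a hypothetical name.

```c
#define FSHIFT  11                  /* assumption: 11 bits of fractional precision */
#define FIXED_1 (1UL << FSHIFT)     /* 1.0 in fixed point */
#define EXP_1   1884UL              /* assumption: decay factor for the 1-minute average */

/* One update step of the moving average.
 * `load` and `active` are fixed-point values (count * FIXED_1). */
static unsigned long toy_calc_load(unsigned long load, unsigned long exp,
                                   unsigned long active)
{
    load *= exp;                        /* decay the old average */
    load += active * (FIXED_1 - exp);   /* blend in the current count */
    return load >> FSHIFT;
}
```

Starting from a load of 0 with one active process, the first step yields 164 in fixed point, about 0.08. Repeated steps move the average toward the number of active processes.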

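IV: Timers

When writing modules, we often use a timer to wait for some time before performing an operation. To make the analysis easier, here is a small test program:

#include <linux/config.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/interrupt.h>
#include <linux/delay.h>
#include <linux/timer.h>

MODULE_LICENSE("GPL");

void test_timerfuc(unsigned long x)
{
    printk("Eric xiao test ......\n");
}

// declare a timer
struct timer_list test_timer = TIMER_INITIALIZER(test_timerfuc, 0, 0);

int kernel_test_init(void)
{
    printk("test_init\n");
    // set the expiry to 3*HZ ticks from now, i.e. 3 seconds
    // (HZ timer interrupts occur per second)
    mod_timer(&test_timer, jiffies + 3*HZ);
    // add the timer to the list handled by the timer softirq.
    // mod_timer() has already queued it; in this kernel version
    // add_timer() simply re-queues it with the same expiry
    add_timer(&test_timer);
    return 0;
}

void kernel_test_exit(void)
{
    // remove the timer if it is still pending, so that it cannot
    // fire after the module has been unloaded
    del_timer_sync(&test_timer);
    printk("test_exit\n");
}

module_init(kernel_test_init);
module_exit(kernel_test_exit);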

The example above is fairly simple. We use it as the starting point for studying how timers are implemented in Linux.

1): TIMER_INITIALIZER() declares a timer. It is defined as follows:

#define TIMER_INITIALIZER(_function, _expires, _data) {             \
                   .function = (_function),                            \
                   .expires = (_expires),                                \
                   .data = (_data),                                \
                   .base = NULL,                                         \
                   .magic = TIMER_MAGIC,                              \
                   .lock = SPIN_LOCK_UNLOCKED,                         \
         }

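struct timer_list is defined as follows:

struct timer_list {
    // used to form the linked list
    struct list_head entry;
    // time at which the timer expires
    unsigned long expires;
    spinlock_t lock;
    unsigned long magic;
    // function to run when the timer expires
    void (*function)(unsigned long);
    // argument passed to the timer function
    unsigned long data;
    // the tvec_t_base_s this timer is attached to; we will see this structure shortly
    struct tvec_t_base_s *base;
};

As the definitions show, TIMER_INITIALIZER() merely initializes a struct timer_list from the arguments passed in and sets the magic member to TIMER_MAGIC.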

2): mod_timer() modifies the expiry time of a timer

int mod_timer(struct timer_list *timer, unsigned long expires)
{
    // the timer must have a function defined
    BUG_ON(!timer->function);
    // check whether the timer's magic is TIMER_MAGIC; if not, repair it to TIMER_MAGIC
    check_timer(timer);
    // if the requested time equals the timer's current expiry and the timer
    // is already pending, return immediately
    if (timer->expires == expires && timer_pending(timer))
        return 1;
    // call __mod_timer(); it is analyzed below
    return __mod_timer(timer, expires);
}

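3): add_timer() attaches the timer to the timer softirq queue and activates it

static inline void add_timer(struct timer_list * timer)
{
    __mod_timer(timer, timer->expires);
}

Both mod_timer() and add_timer() end up calling __mod_timer(). Before analyzing that function, we first look at the data structures of the timer subsystem.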

tvec_bases is a per-CPU variable, defined as follows:

static DEFINE_PER_CPU(tvec_base_t, tvec_bases) = { SPIN_LOCK_UNLOCKED };

So the type of tvec_bases is tvec_base_t, which is defined as:

typedef struct tvec_t_base_s tvec_base_t;

The definition of struct tvec_t_base_s:

struct tvec_t_base_s {
    spinlock_t lock;
    // the jiffies value at which the timers were last run
    unsigned long timer_jiffies;
    struct timer_list *running_timer;
    // tv1, tv2, tv3, tv4 and tv5 are five arrays of lists
    tvec_root_t tv1;
    tvec_t tv2;
    tvec_t tv3;
    tvec_t tv4;
    tvec_t tv5;
} ____cacheline_aligned_in_smp;

tvec_root_t and tvec_t are defined as follows:

#define TVN_BITS 6
#define TVR_BITS 8
#define TVN_SIZE (1 << TVN_BITS)
#define TVR_SIZE (1 << TVR_BITS)
#define TVN_MASK (TVN_SIZE - 1)
#define TVR_MASK (TVR_SIZE - 1)

typedef struct tvec_s {
    struct list_head vec[TVN_SIZE];
} tvec_t;

typedef struct tvec_root_s {
    struct list_head vec[TVR_SIZE];
} tvec_root_t;

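The kernel limits a timer's maximum timeout interval to 0xFFFFFFFF, a 32-bit number, even on 64-bit systems. A larger interval is forcibly clamped to 0xFFFFFFFF (this shows up in the code analysis later). The kernel cares most about timers whose interval is between 0 and 255 ticks. Next in importance are timers with an interval from 256 up to (1<<(8+6)) - 1 ticks, then those from 1<<(8+6) up to (1<<(8+6+6)) - 1 ticks, and so on. In other words, the 32-bit interval is split into five parts: one of 8 bits and four of 6 bits. The kernel therefore defines five arrays of lists. The first array has 2^8 entries, which is the TVR_SIZE (1 << TVR_BITS) defined above. The other four arrays have 2^6 entries each, which is TVN_SIZE (1 << TVN_BITS).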
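The five-level split can be sketched as a small function that maps an expiry time to an array and a slot within it. This is a user-space illustration of the scheme just described, not the kernel's own insertion routine. struct toy_slot and toy_pick_slot() are hypothetical names, and the handling of already-expired timers (placing them in the slot that is processed next) is an assumption.

```c
#define TVN_BITS 6
#define TVR_BITS 8
#define TVN_SIZE (1UL << TVN_BITS)
#define TVR_SIZE (1UL << TVR_BITS)
#define TVN_MASK (TVN_SIZE - 1)
#define TVR_MASK (TVR_SIZE - 1)

/* Hypothetical result type: which array (1..5) and which list in it. */
struct toy_slot {
    int level;
    unsigned long index;
};

static struct toy_slot toy_pick_slot(unsigned long expires,
                                     unsigned long timer_jiffies)
{
    unsigned long idx = expires - timer_jiffies;   /* interval in ticks */
    struct toy_slot s;

    if (idx < TVR_SIZE) {
        s.level = 1;
        s.index = expires & TVR_MASK;
    } else if (idx < (1UL << (TVR_BITS + TVN_BITS))) {
        s.level = 2;
        s.index = (expires >> TVR_BITS) & TVN_MASK;
    } else if (idx < (1UL << (TVR_BITS + 2 * TVN_BITS))) {
        s.level = 3;
        s.index = (expires >> (TVR_BITS + TVN_BITS)) & TVN_MASK;
    } else if (idx < (1UL << (TVR_BITS + 3 * TVN_BITS))) {
        s.level = 4;
        s.index = (expires >> (TVR_BITS + 2 * TVN_BITS)) & TVN_MASK;
    } else if ((long)idx < 0) {
        /* Assumption: a timer that already expired goes into the
         * first-level slot that will be processed next. */
        s.level = 1;
        s.index = timer_jiffies & TVR_MASK;
    } else {
        /* Clamp intervals beyond 32 bits (only possible on 64-bit). */
        if (idx > 0xffffffffUL)
            expires = 0xffffffffUL + timer_jiffies;
        s.level = 5;
        s.index = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK;
    }
    return s;
}
```

With timer_jiffies at 1000, a timer expiring at 1005 lands in the first array at slot 237 (1005 & 255), while one expiring 300 ticks later, at 1300, lands in the second array at slot 5 (1300 >> 8).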

When a timer is added, it is simply placed into the appropriate array according to its interval. With that understood, we can look at the code of __mod_timer():

// both modifying a timer and adding a new timer go through this interface

int __mod_timer(struct timer_list *timer, unsigned long expires)
{
    tvec_base_t *old_base, *new_base;
    unsigned long flags;
    int ret = 0;

    // check the arguments
    BUG_ON(!timer->function);
    check_timer(timer);
    spin_lock_irqsave(&timer->lock, flags);
    // get the tvec_bases of the current CPU
    new_base = &__get_cpu_var(tvec_bases);
repeat:
    // the tvec_bases the timer currently belongs to; for a new timer the base field is NULL
    old_base = timer->base;
    /*
     * Prevent deadlocks via ordering by old_base < new_base.
     */
    // before detaching the timer from its current tvec_bases, races must be considered carefully
    if (old_base && (new_base != old_base)) {
        // take the locks in a fixed order
        if (old_base < new_base) {
            spin_lock(&new_base->lock);
            spin_lock(&old_base->lock);
        } else {
            spin_lock(&old_base->lock);
            spin_lock(&new_base->lock);
        }
        /*
         * The timer base might have been cancelled while we were
         * trying to take the lock(s):
         */
        // if timer->base != old_base, another CPU changed it while we were taking the locks,
        // so unlock and check again
        if (timer->base != old_base) {
            spin_unlock(&new_base->lock);
            spin_unlock(&old_base->lock);
            goto repeat;
        }
    } else {
        // old_base == NULL or new_base == old_base
        // take the lock
        spin_lock(&new_base->lock);
        // likewise, the timer may have changed while we were taking the lock
        if (timer->base != old_base) {
            spin_unlock(&new_base->lock);
            goto repeat;
        }
    }
    /*
     * Delete the previous timeout (if there was any), and install
     * the new one:
     */
    // remove it from its previous tvec_bases; note that the locks are already held at this point
    if (old_base) {
        list_del(&timer->entry);
        ret = 1;
    }
    // update the timer's expiry time
    timer->expires = expires;
    // add it to new_base
    internal_add_timer(new_base, timer);
    // update the base field
    timer->base = new_base;
    // done, release the locks
    if (old_base && (new_base != old_base))
        spin_unlock(&old_base->lock);
    spin_unlock(&new_base->lock);
    spin_unlock_irqrestore(&timer->lock, flags);
    return ret;
}
