1. First, here is how pthread_cond_wait is defined:
The pthread_cond_wait() and pthread_cond_timedwait() functions are used to block on a condition variable. They are called with mutex locked by the calling thread or undefined behaviour will result.
These functions atomically release mutex and cause the calling thread to block on the condition variable cond; atomically here means "atomically with respect to access by another thread to the mutex and then the condition variable". That is, if another thread is able to acquire the mutex after the about-to-block thread has released it, then a subsequent call to pthread_cond_signal() or pthread_cond_broadcast() in that thread behaves as if it were issued after the about-to-block thread has blocked.
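2. As the explanation above shows, pthread_cond_wait() must be used together with a pthread_mutex. (Internally, wait unlocks the mutex as soon as it starts waiting and locks it again before it returns.)
pthread_cond_wait() automatically releases the mutex as soon as it enters the wait state.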
In Thread1:
pthread_mutex_lock(&m_mutex);
pthread_cond_wait(&m_cond,&m_mutex);
pthread_mutex_unlock(&m_mutex);
In Thread2:
pthread_mutex_lock(&m_mutex);
pthread_cond_signal(&m_cond);
pthread_mutex_unlock(&m_mutex);
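Why must it be used together with a pthread_mutex? This handles the case where thread 1 has called pthread_cond_wait() but has not yet started waiting on the cond, and thread 2 calls pthread_cond_signal() at that moment. Without the mutex, that signal would be lost. With the mutex, thread 2 cannot call pthread_cond_signal() until the mutex has been released, that is, until pthread_cond_wait() has entered the waiting state and automatically released the mutex (provided that thread 2 also takes the mutex).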
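The mutex closes the window described above, but in the bare Thread1/Thread2 snippet a signal sent before Thread1 has even taken the lock would still go unnoticed. A common remedy is to pair the condition variable with a shared flag that is set and tested under the mutex. The following is only a minimal sketch of that idea; the names ready, ev_mutex, ev_cond and wait_after_signal are made up for this illustration.

```c
#include <pthread.h>

/* Hypothetical names, used only in this sketch. */
static pthread_mutex_t ev_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ev_cond  = PTHREAD_COND_INITIALIZER;
static int ready = 0;               /* shared predicate, protected by ev_mutex */

static void *signaller(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&ev_mutex);
    ready = 1;                      /* record the event under the mutex */
    pthread_cond_signal(&ev_cond);  /* nobody may be waiting yet */
    pthread_mutex_unlock(&ev_mutex);
    return NULL;
}

/* Lets the signal happen first and waits afterwards.
 * Returns the value of the flag seen by the waiter, or -1 on error. */
int wait_after_signal(void)
{
    pthread_t t;
    int seen;

    ready = 0;
    if (pthread_create(&t, NULL, signaller, NULL) != 0)
        return -1;
    pthread_join(t, NULL);          /* the signal has already been sent */

    pthread_mutex_lock(&ev_mutex);
    while (!ready)                  /* flag already set, so no blocking here */
        pthread_cond_wait(&ev_cond, &ev_mutex);
    seen = ready;
    pthread_mutex_unlock(&ev_mutex);
    return seen;
}
```

Because the waiter tests the flag before it ever calls pthread_cond_wait(), the early signal does no harm: the event is remembered in ready rather than in the condition variable.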
3. Once pthread_cond_wait() has been woken on the cond, it automatically locks the mutex again before returning.
This leads to another problem, because
The pthread_cond_wait() and pthread_cond_timedwait() functions are cancellation points.
In Thread3:
pthread_cancel(m_thread);       /* takes the pthread_t of thread 1 itself, not its address */
pthread_join(m_thread, NULL);
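Because pthread_cond_wait() and pthread_cond_timedwait() are cancellation points, Thread3 can call pthread_cancel() to terminate thread 1. Thread 1 then exits between pthread_cond_wait(&m_cond,&m_mutex); and pthread_mutex_unlock(&m_mutex);. When a thread is cancelled inside pthread_cond_wait(), the mutex is re-acquired before the thread terminates, so thread 1 exits with the mutex still locked (it never reaches pthread_mutex_unlock()), and Thread2 can never acquire the lock again.
The usual way to solve this is as follows: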
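void cleanup(void *arg)
{
    /* Runs if the thread is cancelled while it holds the mutex. */
    pthread_mutex_unlock(&mutex);
}
void* thread1(void* arg)
{
    pthread_mutex_lock(&mutex);
    pthread_cleanup_push(cleanup, NULL); // thread cleanup handler, registered while the mutex is held
    pthread_cond_wait(&cond, &mutex);
    pthread_mutex_unlock(&mutex);
    pthread_cleanup_pop(0);              // normal path: remove the handler without running it
    return NULL;
}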
The same approach also works for other threads that may be terminated or exit abnormally.
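To see the cleanup handler at work, the sketch below cancels a thread that is blocked in pthread_cond_wait() and then checks that the mutex can be taken again. It is an illustration under assumptions rather than a definitive implementation: the names cancel_waiter_demo, unlock_on_cancel, waiting and stop are hypothetical, and the handler receives the mutex through its argument instead of a global.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical names, used only in this sketch. */
static pthread_mutex_t cw_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cw_cond  = PTHREAD_COND_INITIALIZER;
static int waiting = 0;     /* set once the waiter holds the mutex */
static int stop = 0;        /* never set here: only cancellation ends the wait */

static void unlock_on_cancel(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&cw_mutex);
    pthread_cleanup_push(unlock_on_cancel, &cw_mutex);
    waiting = 1;
    while (!stop)
        pthread_cond_wait(&cw_cond, &cw_mutex);   /* cancellation point */
    pthread_cleanup_pop(1);                       /* normal exit: run handler to unlock */
    return NULL;
}

/* Returns 0 if the thread was cancelled and the mutex was left unlocked. */
int cancel_waiter_demo(void)
{
    pthread_t t;
    void *res = NULL;
    int started = 0, rc;

    if (pthread_create(&t, NULL, waiter, NULL) != 0)
        return -1;

    /* Once we can see waiting == 1 under the mutex, the waiter has released
     * the mutex, which it only does inside pthread_cond_wait(). */
    while (!started) {
        pthread_mutex_lock(&cw_mutex);
        started = waiting;
        pthread_mutex_unlock(&cw_mutex);
        if (!started)
            sched_yield();
    }

    pthread_cancel(t);
    pthread_join(t, &res);

    rc = pthread_mutex_trylock(&cw_mutex);        /* 0: the handler unlocked it */
    if (rc == 0)
        pthread_mutex_unlock(&cw_mutex);
    return (res == PTHREAD_CANCELED && rc == 0) ? 0 : -1;
}
```

Without the pthread_cleanup_push() call, the trylock in this sketch would be expected to fail with EBUSY, because the cancelled thread would have died holding the mutex.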
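Multithreaded programming on Linux will sooner or later call for condition variables, and that means using pthread_cond_wait(). How this function executes, however, is rather hard to understand.
pthread_cond_wait() works as follows (taking the EXAMPLE from the man page):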
Consider two shared variables x and y, protected by the mutex mut, and a condition variable cond that is to be signaled whenever x becomes greater than y.
int x,y;
pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
Waiting until x is greater than y is performed as follows:
pthread_mutex_lock(&mut);
while (x <= y) {
pthread_cond_wait(&cond, &mut);
}
/* operate on x and y */
pthread_mutex_unlock(&mut);
Modifications on x and y that may cause x to become greater than y should signal the condition if needed:
pthread_mutex_lock(&mut);
/* modify x and y */
if (x > y) pthread_cond_broadcast(&cond);
pthread_mutex_unlock(&mut);
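What this example means: two threads work on x and y. The first thread suspends while x <= y and continues only once x > y (the second thread may modify x and y, and wakes the first thread when x > y). First an ordinary mutex mut and a condition variable cond are initialized. The two threads then run the following bodies, respectively: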
pthread_mutex_lock(&mut);
while (x <= y) {
pthread_cond_wait(&cond, &mut);
}
/* operate on x and y */
pthread_mutex_unlock(&mut);
and:
pthread_mutex_lock(&mut);
/* modify x and y */
if (x > y) pthread_cond_signal(&cond);
pthread_mutex_unlock(&mut);
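The execution is actually quite simple. When the first thread reaches pthread_cond_wait(&cond,&mut) with x <= y, the function atomically unlocks the mutex mut and blocks the thread on the condition variable cond (the condition variable itself is not locked); the first thread is now suspended and uses no CPU cycles.
The second thread, which had been blocked because mut was held by the first thread, can now acquire mut and modify x and y. After the modification, an if statement checks whether x > y; if so, pthread_cond_signal() wakes the first thread, and the next statement releases mut. The first thread then resumes inside pthread_cond_wait(): it must first lock mut again, and once that succeeds it re-tests the condition (why a while loop is used, that is, why the condition is tested again after waking, is analysed later). If the condition holds, it leaves the loop and does its processing, and finally releases mut.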
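The two bodies from the man page example can be put into a small program like the one sketched here. The thread functions, the starting values x = 0 and y = 5, and the helper run_xy_demo are assumptions made for illustration; only the locking pattern comes from the example.

```c
#include <pthread.h>

static int x, y;
static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

/* First thread: sleeps until x is greater than y. */
static void *wait_for_x_gt_y(void *arg)
{
    int *diff = arg;

    pthread_mutex_lock(&mut);
    while (x <= y)
        pthread_cond_wait(&cond, &mut);   /* releases mut while blocked */
    *diff = x - y;                        /* operate on x and y, mut held */
    pthread_mutex_unlock(&mut);
    return NULL;
}

/* Second thread: increments x ten times, signalling once x > y. */
static void *raise_x(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10; i++) {
        pthread_mutex_lock(&mut);
        x++;                              /* modify x and y */
        if (x > y)
            pthread_cond_signal(&cond);
        pthread_mutex_unlock(&mut);
    }
    return NULL;
}

/* Runs both threads and returns x - y as observed by the waiter. */
int run_xy_demo(void)
{
    pthread_t waiter, modifier;
    int diff = 0;

    x = 0;
    y = 5;
    if (pthread_create(&waiter, NULL, wait_for_x_gt_y, &diff) != 0)
        return -1;
    if (pthread_create(&modifier, NULL, raise_x, NULL) != 0)
        return -1;
    pthread_join(waiter, NULL);
    pthread_join(modifier, NULL);
    return diff;
}
```

The observed difference is not fixed: depending on scheduling, the waiter may get the mutex back when x is anywhere between 6 and 10, so run_xy_demo() returns a value from 1 to 5.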
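As for why the condition must be tested again after waking (that is, why a while loop is used to test it): there may be a "thundering herd" effect. Some people assume that a thread that has been woken must find the condition satisfied, but that is not so. If several threads are waiting on the condition and only one of them may do the processing at a time, each must test the condition again so that only one thread goes on to handle it. A passage quoted on this point:
Quoting the POSIX RATIONALE: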
Condition Wait Semantics
It is important to note that when pthread_cond_wait() and pthread_cond_timedwait() return without error, the associated predicate may still be false. Similarly, when pthread_cond_timedwait() returns with the timeout error, the associated predicate may be true due to an unavoidable race between the expiration of the timeout and the predicate state change.
The application needs to recheck the predicate on any return because it cannot be sure there is another thread waiting on the thread to handle the signal, and if there is not then the signal is lost. The burden is on the application to check the predicate.
Some implementations, particularly on a multi-processor, may sometimes cause multiple threads to wake up when the condition variable is signaled simultaneously on different processors.
In general, whenever a condition wait returns, the thread has to re-evaluate the predicate associated with the condition wait to determine whether it can safely proceed, should wait again, or should declare a timeout. A return from the wait does not imply that the associated predicate is either true or false.
It is thus recommended that a condition wait be enclosed in the equivalent of a "while loop" that checks the predicate.
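The same rule applies to pthread_cond_timedwait(). The sketch below waits for a flag with an absolute deadline and rechecks the predicate on every return, including the timeout case. The names flag, set_flag and wait_for_flag are hypothetical, and the deadline is computed from CLOCK_REALTIME on the assumption that the condition variable uses its default clock.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical names, used only in this sketch. */
static pthread_mutex_t flag_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  flag_cond  = PTHREAD_COND_INITIALIZER;
static int flag = 0;                    /* predicate, protected by flag_mutex */

void set_flag(void)
{
    pthread_mutex_lock(&flag_mutex);
    flag = 1;
    pthread_cond_signal(&flag_cond);
    pthread_mutex_unlock(&flag_mutex);
}

/* Waits up to timeout_ms for the flag.
 * Returns 0 if the flag is set, ETIMEDOUT if the deadline passed first. */
int wait_for_flag(long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    /* pthread_cond_timedwait() takes an absolute time, not an interval. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&flag_mutex);
    while (!flag && rc == 0)            /* loop again after a spurious return */
        rc = pthread_cond_timedwait(&flag_cond, &flag_mutex, &deadline);
    if (flag)                           /* the flag may have been set just as we timed out */
        rc = 0;
    pthread_mutex_unlock(&flag_mutex);
    return rc;
}
```

Checking flag once more after the loop covers the race mentioned in the rationale, where the timeout expires at about the same moment the predicate becomes true.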
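From the above we can see:
(1) pthread_cond_signal may wake several threads at once on a multiprocessor. When only one thread may handle a given task, the other woken threads have to go back to waiting; this is where the while loop matters. The specification requires pthread_cond_signal to unblock at least one of the threads blocked in pthread_cond_wait (if any are blocked), and some implementations, for simplicity, wake several threads even on a uniprocessor.
(2) In some applications, such as a thread pool, pthread_cond_broadcast wakes all the threads, but usually only some of them are needed to run tasks, so the rest must go back to waiting. A while loop is therefore strongly recommended here.
Put plainly, it is simple: pthread_cond_signal() may also wake more than one thread, and if only one thread is allowed to handle the work at a time, the condition must be tested in a while loop so that only a thread that still finds the condition true goes on to process it.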
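The thread pool case can be sketched as follows: one task is queued, every worker is woken with pthread_cond_broadcast(), and the while loop makes sure the task is handled exactly once. This is a simplified illustration, not a real pool; the names pending, handled, shutting_down and broadcast_one_task are assumptions, and the "task" is just a counter.

```c
#include <pthread.h>

#define NWORKERS 4

/* Hypothetical names, used only in this sketch. */
static pthread_mutex_t q_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  q_cond  = PTHREAD_COND_INITIALIZER;
static int pending = 0;         /* tasks waiting to be taken */
static int handled = 0;         /* tasks actually processed */
static int shutting_down = 0;   /* tells idle workers to exit */

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&q_mutex);
    for (;;) {
        /* Re-test after every wakeup: another worker may have taken the task. */
        while (pending == 0 && !shutting_down)
            pthread_cond_wait(&q_cond, &q_mutex);
        if (pending == 0)
            break;              /* woken for shutdown, nothing left to do */
        pending--;              /* take the task */
        handled++;              /* "process" it */
    }
    pthread_mutex_unlock(&q_mutex);
    return NULL;
}

/* Queues one task, wakes all workers, and returns how many times it ran. */
int broadcast_one_task(void)
{
    pthread_t t[NWORKERS];
    int i;

    pending = handled = shutting_down = 0;
    for (i = 0; i < NWORKERS; i++)
        if (pthread_create(&t[i], NULL, worker, NULL) != 0)
            return -1;

    pthread_mutex_lock(&q_mutex);
    pending = 1;
    pthread_cond_broadcast(&q_cond);    /* wakes every waiting worker */
    pthread_mutex_unlock(&q_mutex);

    pthread_mutex_lock(&q_mutex);
    shutting_down = 1;
    pthread_cond_broadcast(&q_cond);
    pthread_mutex_unlock(&q_mutex);

    for (i = 0; i < NWORKERS; i++)
        pthread_join(t[i], NULL);
    return handled;
}
```

If the inner while were replaced by an if, a worker woken by the broadcast after the task had already been taken would decrement pending below zero and count a task that does not exist.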