Issue433625
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2001-06-16 02:17 by shihao, last changed 2022-04-10 16:04 by admin. This issue is now closed.
Messages (13) | |||
---|---|---|---|
msg5065 - (view) | Author: Shih-Hao Liu (shihao) | Date: 2001-06-16 02:17 | |
Mutex should be hold when calling pthread_cond_signal(). This function should look like: PyThread_release_lock(PyThread_type_lock lock) { pthread_lock *thelock = (pthread_lock *)lock; int status, error = 0; dprintf(("PyThread_release_lock(%p) called\n", lock)); status = pthread_mutex_lock( &thelock->mut ); CHECK_STATUS("pthread_mutex_lock[3]"); thelock->locked = 0; /* ***** call pthread_cond_signal before unlock mutex */ status = pthread_cond_signal( &thelock->lock_released ); CHECK_STATUS("pthread_cond_signal"); status = pthread_mutex_unlock( &thelock->mut ); CHECK_STATUS("pthread_mutex_unlock[3]"); /* wake up someone (anyone, if any) waiting on the lock */ } |
|||
msg5066 - (view) | Author: Tim Peters (tim.peters) * | Date: 2001-06-17 21:16 | |
Logged In: YES user_id=31435 Why? It's allowed to signal the condition whether or not the mutex is held. Since changing this can have visible effects on thread scheduling, I'm reluctant to change it without a good reason. |
|||
msg5067 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2001-06-17 21:51 | |
Logged In: YES user_id=6380 Set status to closed -- no need to delete it. |
|||
msg5068 - (view) | Author: Tim Peters (tim.peters) * | Date: 2001-06-17 22:01 | |
Logged In: YES user_id=31435 Ack, did I delete this?! I sure didn't intend to -- didn't even intend to close it. Reopened pending more info. |
|||
msg5069 - (view) | Author: Tim Peters (tim.peters) * | Date: 2001-06-17 23:02 | |
Logged In: YES user_id=31435 Closing this again, as it appears the original submitter deleted it. shihao, if you want to pursure this, open it again. |
|||
msg5070 - (view) | Author: Shih-Hao Liu (shihao) | Date: 2001-06-18 06:01 | |
Logged In: YES user_id=246388 I closed it because I thought the thelock->locked variable will ensure that the PyThread_release_lock will help to protect the condition variable and I was wrong. The linuxthread man page on pthread_cond_signal: A condition variable must always be associated with a mutex, to avoid the race condition where a thread prepares to wait on a condition variable and another thread signals the condition just before the first thread actually waits on it. which means you can't call pthread_cond_signal & pthread_cond_wait on the same condition variable at the same time. And using a mutex to protect them is a good idea. Here is how thing might go wrong with current implementation: thread 1 thread 2 |int PyThread_acquire_lock _ |/** assume lock was acquired | by thread 1, hence locked=0 | & success would be 0 **/ |{ | ... | status = pthread_mutex_lo | CHECK_STATUS("pthread_mut | success = thelock->locked | if (success) thelock->loc | status = pthread_mutex_un | /** thread 2 suspended **/ void PyThread_release_lock _| { | ... | status = pthread_mutex_loc| CHECK_STATUS("pthread_mute| | thelock->locked = 0; | | status = pthread_mutex_unl| /** thread 1 suspend **/ | | CHECK_STATUS("pthread_mut | | if ( !success && waitflag | /* continue trying unti | | /* mut must be locked b | * protocol */ | status = pthread_mutex_ | CHECK_STATUS("pthread_m | while ( thelock->locked | status = pthread_cond |/** thread 2 suspended while | updating shared data ** CHECK_STATUS("pthread_mute| | /* wake up someone (anyone| status = pthread_cond_sign| /** thread 1 update shared | data and corrupt it. **/ | Not sure what the effect would be. It's wouldn't be nice anyway. |
|||
msg5071 - (view) | Author: Tim Peters (tim.peters) * | Date: 2001-06-19 20:52 | |
Logged In: YES user_id=31435 It appears you're concerned that the signal will "get lost" in this scenario. I agree that it may, but it doesn't matter: thread 2's "while (thelock->locked)" test fails because thread 1 already set thelock->locked to 0, so thread 2 doesn't execute its pthread_cond_wait, so it doesn't matter that nobody is *waiting* to see thread 1's pthread_cond_signal. IOW, the condition thread 1 will try to signal has *already* been detected by thread 2, and the signal is useless (because redundant) information. Indeed, it's a beauty of the condition protocol that signals can be sloppy. The linuxthread man page is worded strangely here, but note that it does not say the mutex *must* be locked. They can't, either, because POSIX doesn't require it; while POSIX stds aren't available freely online, some derived specs are, and are much clearer about this; e.g., see the Single Unix Specification, here: <http://www.opengroup.org/onlinepubs/7908799/xsh/pthread_con d_signal.html> I'll tell you why I don't *want* to change this: in Python's use of the global interpreter lock, it's almost always the case that someone is waiting on the lock. By releasing the mutex before signaling, this gives a waiter a chance to run immediately upon calling pthread_cond_signal. Else, because pthread_cond_wait (which the waiters are executing) has to lock the mutex, if the signaler is holding the mutex during the signal, pthread_cond_signal can't finish the job -- it was to go back to the signaler and let it unlock the mutex first before a waiter can proceed. This makes the region of exclusion longer than it needs to be. So if there's not an actual race problem here (& I still don't see one), I don't want to change this. |
|||
msg5072 - (view) | Author: Shih-Hao Liu (shihao) | Date: 2001-06-19 21:41 | |
Logged In: YES user_id=246388 Oops. The problem will be arised if there is a thread 3 called PyThread_acquire_lock after thread 1 set thelock->locked to 0 and before thread 2 calling pthread_mutex_lock. "while (thelock->locked)" will success for thread 2 and it will call pthread_cond_wait and might collide with thread 1's pthread_cond_signal. |
|||
msg5073 - (view) | Author: Tim Peters (tim.peters) * | Date: 2001-06-19 22:06 | |
Logged In: YES user_id=31435 > The problem will be arised if there is a thread 3 > called PyThread_acquire_lock You never mentioned thread 3 again, so I don't know what it has to do with this. > after thread 1 set thelock->locked to 0 and before > thread 2 calling pthread_mutex_lock. I understand both of those. Are you assuming that, e.g., thread 3's PyThread_acquire_lock completes in whole during this gap? I don't know what else you could mean, so let's assume that. > "while (thelock->locked)" will success for thread 2 Sure. > and it will call pthread_cond_wait Yup. > and might collide with thread 1's pthread_cond_signal. What does "collide" mean to you? All the pthread_cond_xxx functions must be implemented as if atomic, so there's no meaningful sense (to me) in which they can collide -- unless they're implemented incorrectly. Assuming they are implemented correctly, it again doesn't matter that thread 2 misses thread 1's signal, because thread *3* exploited the information thread 1 was going to signal, by acquiring the lock. It's actually good that thread 2 isn't bothered with it: there's no real info in the signal anymore (at best, if thread 2 got it, it would wake up and go "oops! it's still locked; I'll wait again"). All that matters now is whether thread 2 gets a chance to see thread *3*'s signal, at the time thread 3 releases the lock. And it will, because thread 3 can't release the lock without acquiring the mutex first, and thread 2 holds the mutex at all times except when in its pthread_cond_wait call (so thread 3 can't release the lock except when thread 2 is in pthread_cond_wait). Note that I'm not at all concerned about "fairness" here, only about races. |
|||
msg5074 - (view) | Author: Shih-Hao Liu (shihao) | Date: 2001-06-20 05:14 | |
Logged In: YES user_id=246388 > I understand both of those. Are you assuming that, e.g., > thread 3's PyThread_acquire_lock completes in whole during > this gap? I don't know what else you could mean, so let's > assume that. Yes, I assume thread 3 completed in this gap and it is possible. "Collide" means while thread 2's pthread_cond_wait was modifiying the internal data structure, thread 1 calls pthread_cond_signal. The point here is not missing signals, the problem is the possiblity that pthread_cond_signal preempt the execution of pthread_cond_wait. I can't find any document says pthread_cond_xxx functions must be automic operations. In pthread_cond_xxx man page, it only mentioned: pthread_cond_wait atomically unlocks the mutex (as per ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ pthread_unlock_mutex) and waits for the condition variable ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cond to be signaled. The thread execution is suspended and ^^^^^^^^^^^^^^^^^^^ does not consume any CPU time until the condition variable is signaled. The mutex must be locked by the calling thread on entrance to pthread_cond_wait. Before returning to the calling thread, pthread_cond_wait re-acquires mutex (as per pthread_lock_mutex). Unlocking the mutex and suspending on the condition vari able is done atomically. Thus, if all threads always acquire the mutex before signaling the condition, this guarantees that the condition cannot be signaled (and thus ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ignored) between the time a thread locks the mutex and the ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ time it waits on the condition variable. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ I take it that between the time the mutex is locked by pthread_mutex_lock and unlocked by pthread_cond_wait, you can't call pthread_cond_signal. I also browse through LinuxThread implementation and can't find pthread_cond_xxxx is implemented automically. I found there is another way to fix this without having to call the pthread_cond_signal while holding the mutex. If we do: if (!thelock->locked) status = pthread_cond_signal( &thelock->lock_released ); we guarantee that when pthread_cond_signal is called, the acquire_lock code will not be in the middle of pthread_cond_signal. |
|||
msg5075 - (view) | Author: Shih-Hao Liu (shihao) | Date: 2001-06-20 07:17 | |
Logged In: YES user_id=246388 > I also browse through > LinuxThread implementation and can't find > pthread_cond_xxxx is implemented automically. I found they do call __pthread_lock when updating the priority queue after take a closer look. I guess I have to look at elsewhere to figure out why my Python script is spinning. |
|||
msg5076 - (view) | Author: Tim Peters (tim.peters) * | Date: 2001-10-18 16:18 | |
Logged In: YES user_id=31435 Deleted this bug report, at Shih-Hao Liu's request. |
|||
msg351829 - (view) | Author: Kirill Smelkov (navytux) * | Date: 2019-09-11 11:34 | |
I still believe there is a race here and it can lead to MEMORY CORRUPTION and DEADLOCK: https://bugs.python.org/issue38106. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:04:08 | admin | set | github: 34637 |
2019-09-11 11:34:32 | navytux | set | nosy:
+ navytux messages: + msg351829 |
2001-06-16 02:17:17 | shihao | create |