Author rbcollins
Recipients barry, rbcollins, statik
Date 2009-10-25.21:55:50
Message-id <1256507754.86.0.545134749943.issue7205@psf.upfronthosting.co.za>
In-reply-to
Content
There is a systemic bug in BZ2File: the GIL is released to perform
compression work, and any other thread that calls into the same BZ2File
object in the meantime will deadlock. We noticed it in the write method,
but inspection of the code makes it clear that it's systemic.

The problem is pretty simple. Say you have two threads and one bz2file
object. One calls write(), the other calls (say) seek(), but it could be
write() or other methods too. Now, it's pretty clear that the question
'should these two threads get serialised' could be contentious. So let's
put that aside by saying 'raising an exception or serialising in
arbitrary order would be ok'.

What happens today is:
t1:bz2file.write
   bz2file.lock.acquire
   gil-release
   bz2compression starts
t2:gil-acquired
   bz2file.seek
   bz2file.lock.acquire(wait=1)  <- this thread is stuck now, and has the GIL
t1:bz2compression finishes
   gil.acquire <- this thread is stuck now, waiting for the GIL
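
The trace above can be replayed outside the interpreter as a rough
sketch. Two plain threading.Lock objects stand in for the GIL and for
the bz2file object lock; that substitution is an assumption made for
illustration, not the code in the bz2 module. The timeouts and the
barrier exist only so the demonstration terminates; the real waits have
no timeout, so both threads would hang forever. The names
simulate_trace, t1_write and t2_seek are hypothetical.

```python
import threading
import time


def simulate_trace(timeout=0.2):
    """Replay the trace with two plain locks; return the threads that got stuck."""
    gil = threading.Lock()        # stand-in for the GIL (assumption)
    file_lock = threading.Lock()  # stand-in for the bz2file object lock
    t1_compressing = threading.Event()
    both_tried = threading.Barrier(2)
    stuck = []

    def t1_write():
        gil.acquire()
        file_lock.acquire()       # bz2file.lock.acquire
        gil.release()             # gil-release
        t1_compressing.set()      # bz2compression starts
        time.sleep(0.05)          # bz2compression finishes
        # The real wait has no timeout; one is used here so the demo ends.
        got_gil = gil.acquire(timeout=timeout)
        if not got_gil:
            stuck.append("t1")
        both_tried.wait()         # keep file_lock held until t2 has given up
        file_lock.release()
        if got_gil:
            gil.release()

    def t2_seek():
        t1_compressing.wait()
        gil.acquire()             # gil-acquired
        # Blocking wait on the object lock while still holding the "GIL".
        got_lock = file_lock.acquire(timeout=timeout)
        if not got_lock:
            stuck.append("t2")
        both_tried.wait()         # keep the "GIL" held until t1 has given up
        if got_lock:
            file_lock.release()
        gil.release()

    threads = [threading.Thread(target=t1_write),
               threading.Thread(target=t2_seek)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(stuck)


if __name__ == "__main__":
    print(simulate_trace())
```

Each thread ends up waiting for the lock the other one holds, which is
the cycle described in the trace.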

If any holder of the bz2file object lock can release the GIL, then *all*
routines that attempt to lock the bz2file object have to release the GIL
if they can't get the lock immediately - blocking while holding the GIL
won't work. I'm not familiar enough with the Python threading API to
know whether it's safe to call without the GIL. If it's not, then
clearly it can't be used in combination with the GIL, and lower-layer
locks should be used instead.
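
One possible shape for "release the GIL if you can't get the lock" is
sketched below: try the object lock without blocking, and only if that
fails drop the GIL, block on the object lock, and then take the GIL
back. As before, two threading.Lock objects stand in for the GIL and
the bz2file object lock, which is an assumption for illustration; the
names acquire_object_lock and run_write_and_seek are hypothetical, and
whether the equivalent is safe with the C-level threading API is
exactly the open question above.

```python
import threading
import time


def acquire_object_lock(gil, obj_lock):
    """Call with `gil` held; returns with both `gil` and `obj_lock` held."""
    if obj_lock.acquire(blocking=False):
        return                    # uncontended: never let go of the GIL
    gil.release()                 # contended: let the lock holder make progress
    obj_lock.acquire()            # block without holding the GIL
    gil.acquire()                 # take the GIL back before continuing


def run_write_and_seek():
    """Run the write/seek scenario with the fix; return the completion order."""
    gil = threading.Lock()        # stand-in for the GIL (assumption)
    file_lock = threading.Lock()  # stand-in for the bz2file object lock
    t1_compressing = threading.Event()
    finished = []

    def t1_write():
        gil.acquire()
        acquire_object_lock(gil, file_lock)
        gil.release()             # release the GIL for the compression work
        t1_compressing.set()
        time.sleep(0.05)          # "compression"
        gil.acquire()             # succeeds: t2 is not sitting on the GIL
        finished.append("write")
        file_lock.release()
        gil.release()

    def t2_seek():
        t1_compressing.wait()
        gil.acquire()
        acquire_object_lock(gil, file_lock)
        finished.append("seek")
        file_lock.release()
        gil.release()

    threads = [threading.Thread(target=t1_write, daemon=True),
               threading.Thread(target=t2_seek, daemon=True)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=2)
    return finished


if __name__ == "__main__":
    print(run_write_and_seek())
```

With this ordering the two calls are simply serialised, which matches
the "serialising in arbitrary order would be ok" position taken earlier.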
History
Date                 User       Action  Args
2009-10-25 21:55:55  rbcollins  set     recipients: + rbcollins, barry, statik
2009-10-25 21:55:54  rbcollins  set     messageid: <1256507754.86.0.545134749943.issue7205@psf.upfronthosting.co.za>
2009-10-25 21:55:52  rbcollins  link    issue7205 messages
2009-10-25 21:55:50  rbcollins  create