
Author loewis
Recipients Christophe Simonis, Garen, Nam.Nguyen, amaury.forgeotdarc, arekm, asvetlov, barry, doko, eric.araujo, georg.brandl, jcea, jeremybanks, lars.gustaebel, leonov, loewis, nadeem.vawda, nicdumz, nikratio, ockham-razor, pitrou, proyvind, rcoyner, shirish, strombrg, thedjatclubrock, tshepang, vstinner, ysj.ray
Date 2011-10-12.16:11:25
Message-id <4E95BC2C.2040608@v.loewis.de>
In-reply-to <1318327413.31.0.63298674751.issue6715@psf.upfronthosting.co.za>
Content
>>> Modules/_lzmamodule.c:364: Py_BEGIN_ALLOW_THREADS
>> It seems that the Windows version at least is not thread-safe. If so, you
>> would need an LZMA lock when releasing the GIL.
>
> Does the class need to be thread-safe, though?

As a matter of principle, Python code must not be able to crash the
interpreter or corrupt memory. There are known bugs in this area,
but if it's known in advance that an issue exists, we should avoid
it.

> ISTM that there isn't any
> sensible use case for having two threads feeding data through the same
> compressor concurrently.

Right. So having a per-compressor mutex lock would be entirely
reasonable. I could also accept a per-module lock. I could even
accept the GIL, and if no other code is forthcoming, I would
prefer to keep holding the GIL during compression over risking
crashes.
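
As an illustration of the per-compressor lock idea, here is a minimal Python-level sketch. It is not the code of _lzmamodule.c; the class name LockedCompressor and its write/finish methods are hypothetical, and the sketch simply wraps the standard library's lzma.LZMACompressor with one threading.Lock per object. It does not depend on whether the wrapped object is itself thread-safe. The output is collected while the lock is held, so pieces are stored in the same order in which the input was compressed.

```python
import lzma
import threading


class LockedCompressor:
    """Hypothetical wrapper: one mutex per compressor object."""

    def __init__(self):
        self._lock = threading.Lock()
        self._comp = lzma.LZMACompressor()
        self._out = bytearray()

    def write(self, data):
        # Only one thread at a time may touch the compressor state.
        # Appending under the same lock keeps output order consistent
        # with the order of the compress() calls.
        with self._lock:
            self._out += self._comp.compress(data)

    def finish(self):
        # Flush the stream and return everything produced so far.
        with self._lock:
            self._out += self._comp.flush()
            return bytes(self._out)


def feed_from_threads(chunks):
    """Feed each chunk from its own thread; return the compressed stream."""
    comp = LockedCompressor()
    threads = [threading.Thread(target=comp.write, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return comp.finish()
```

With this arrangement the order in which chunks end up in the stream is unspecified, but the stream stays valid: decompressing it yields the chunks concatenated in some order.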

> (If we *do* want thread-safety, then it doesn't matter whether the
> underlying lib is internally thread-safe or not. We would still need to
> guard against the possibility of the _lzmamodule.c code in one thread
> modifying the lzma_stream's input or output pointer while lzma_code is
> operating on the stream's data in another thread.)

I haven't reviewed the module in this respect. If you say that it
wouldn't be thread-safe even if LZMA was compiled as thread-safe,
then this definitely must be fixed.

To elaborate on the policy: giving bogus data in cases of race
conditions is ok; crashing the interpreter or corrupting memory
is not. If bogus data is given, it would be useful if the bogosity
can be specified (e.g. when multiple threads read from the same
POSIX file concurrently, they also get bogus data, but in a manner
where each input byte is given to exactly one thread).
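
The POSIX file example can be sketched as follows, assuming a POSIX system where read() on a shared file descriptor advances the shared file offset atomically. The function name read_concurrently and its parameters are made up for this illustration. Several threads call os.read on the same descriptor; which thread receives which block is unpredictable, but each byte of the file should be handed to exactly one thread.

```python
import os
import threading


def read_concurrently(path, n_threads=4, block_size=4096):
    """Read one file through a single shared descriptor from several threads.

    Returns one list of blocks per thread. The split between threads is
    arbitrary; the expectation is only that no byte is lost or duplicated.
    """
    fd = os.open(path, os.O_RDONLY)
    results = [[] for _ in range(n_threads)]

    def worker(idx):
        while True:
            # Each call consumes the next block at the shared file offset.
            block = os.read(fd, block_size)
            if not block:  # empty result means end of file
                break
            results[idx].append(block)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    try:
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        os.close(fd)
    return results
```

Each thread sees "bogus" data in the sense that it gets an arbitrary subset of the file, yet the bogosity is well specified: the union of all blocks is the file's content.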
History
Date                 User    Action  Args
2011-10-12 16:11:27  loewis  set     recipients: + loewis, barry, georg.brandl, doko, jcea, amaury.forgeotdarc, arekm, lars.gustaebel, pitrou, vstinner, nadeem.vawda, nicdumz, eric.araujo, Christophe Simonis, rcoyner, proyvind, asvetlov, nikratio, leonov, Garen, ysj.ray, thedjatclubrock, ockham-razor, strombrg, shirish, tshepang, jeremybanks, Nam.Nguyen
2011-10-12 16:11:26  loewis  link    issue6715 messages
2011-10-12 16:11:25  loewis  create