Author nadeem.vawda
Recipients Christophe Simonis, Garen, Nam.Nguyen, amaury.forgeotdarc, arekm, asvetlov, barry, doko, eric.araujo, georg.brandl, jcea, jeremybanks, lars.gustaebel, leonov, loewis, nadeem.vawda, nicdumz, nikratio, ockham-razor, pitrou, proyvind, rcoyner, shirish, strombrg, thedjatclubrock, tshepang, vstinner, ysj.ray
Date 2011-10-11.10:03:31
Awesome stuff! I'll post an updated patch during the course of the day.

Martin: I've been having problems with Rietveld lately, so I'm posting
my replies to your comments here instead.

>> Modules/_lzmamodule.c:115: return _PyBytes_Resize(buf, size + BIGCHUNK);
> This has quadratic performance.

Correct. I copied the algorithm from _io.FileIO, under the assumption
that there was a reason for not using a simpler doubling strategy
(which is amortized O(n)). Do you know of any reason for this? Or is it
safe to switch to doubling?
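
To put rough numbers on the difference, here is a small Python sketch that
counts worst-case bytes copied under each growth policy. It assumes every
resize relocates the existing contents (a real realloc often does better),
and the BIGCHUNK value and helper names are illustrative assumptions, not
taken from the patch.

```python
BIGCHUNK = 512 * 1024  # assumed chunk size, for illustration only


def bytes_copied(target_size, grow):
    """Worst-case bytes copied while growing a buffer to target_size.

    Assumes every resize moves the existing contents to a new block.
    """
    size = grow(0)
    copied = 0
    while size < target_size:
        copied += size      # old contents are copied on each resize
        size = grow(size)
    return copied


def grow_fixed(size):
    # Policy under review: always add one fixed-size chunk.
    return size + BIGCHUNK


def grow_doubling(size):
    # Alternative: double the buffer, starting from one chunk.
    return max(size * 2, BIGCHUNK)


if __name__ == "__main__":
    target = 64 * BIGCHUNK
    print("fixed:   ", bytes_copied(target, grow_fixed))
    print("doubling:", bytes_copied(target, grow_doubling))
```

Under these assumptions the fixed-increment policy copies a number of bytes
proportional to the square of the final size, while doubling stays below
twice the final size.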

>> Modules/_lzmamodule.c:364: Py_BEGIN_ALLOW_THREADS
> It seems that the Windows version at least is not thread-safe. If so, you
> would need an LZMA lock when releasing the GIL.

Does the class need to be thread-safe, though? ISTM that there isn't any
sensible use case for having two threads feeding data through the same
compressor concurrently.

(If we *do* want thread-safety, then it doesn't matter whether the
underlying lib is internally thread-safe or not. We would still need to
guard against the possibility of the _lzmamodule.c code in one thread
modifying the lzma_stream's input or output pointer while lzma_code is
operating on the stream's data in another thread.)
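
For illustration, the kind of per-object lock I have in mind looks roughly
like this at the Python level. This is only a sketch: LockedCompressor and
compress_from_threads are hypothetical names, and it wraps the stdlib
lzma.LZMACompressor rather than showing the C code in _lzmamodule.c, where
the lock would be held around the region that releases the GIL.

```python
import lzma
import threading


class LockedCompressor:
    """Hypothetical wrapper: one lock per compressor object."""

    def __init__(self):
        self._lock = threading.Lock()
        self._comp = lzma.LZMACompressor()
        self._parts = []

    def feed(self, data):
        # Only one thread at a time may touch the stream state. Output is
        # stored under the same lock so it stays in compression order.
        with self._lock:
            self._parts.append(self._comp.compress(data))

    def finish(self):
        with self._lock:
            self._parts.append(self._comp.flush())
            return b"".join(self._parts)


def compress_from_threads(chunks):
    """Feed each chunk from its own thread through one shared compressor."""
    comp = LockedCompressor()
    threads = [threading.Thread(target=comp.feed, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return comp.finish()
```

The lock makes each call atomic with respect to the stream, but the order
in which the chunks end up in the output still depends on thread
scheduling, which is part of why I doubt the use case is sensible.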
