Author nadeem.vawda
Recipients nadeem.vawda, serhiy.storchaka
Date 2012-11-05.00:29:23

When calling zlib.Decompress.decompress() with a max_length argument,
if the input data is not fully consumed, the next_in pointer in the
z_stream struct is left pointing into the data object, but the
decompressor does not hold a reference to this object. This same
pointer is reused (perhaps unintentionally) if flush() is called
without calling decompress() again.

If the data object gets deallocated between the calls to decompress()
and to flush(), zlib will then try to access this deallocated memory,
and most likely return bogus output (or segfault). See the attached
script for a demonstration.
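
The attached script is not reproduced here. As a rough sketch of the
call sequence described above (the helper name, payload, and max_length
value are illustrative assumptions, not taken from the attachment):

```python
import zlib
from typing import Tuple


def decompress_then_flush(payload: bytes, max_length: int) -> Tuple[bytes, bytes]:
    """Hypothetical helper: decompress a prefix, drop the input, then flush."""
    compressed = zlib.compress(payload)
    d = zlib.decompressobj()

    # Limit the output so that part of `compressed` is left unconsumed.
    head = d.decompress(compressed, max_length)

    # Drop the only reference to the input object. On an affected
    # interpreter, next_in may still point into this freed buffer.
    del compressed

    # flush() is called without another decompress() call in between.
    tail = d.flush()
    return head, tail


if __name__ == "__main__":
    data = b"abc" * 1000
    head, tail = decompress_then_flush(data, 10)
    print(len(head), head + tail == data)
```

On an interpreter affected by this issue, the flush() call may return
bogus data or crash. On versions where it has been fixed, head + tail
should equal the original payload.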

I see two potential solutions:

  1. Set avail_in to zero in flush(), so that it does not try to use
     leftover data (or whatever else now occupies the memory where that
     data used to be).

  2. Have decompress() check if there is leftover data, and if so,
     save a reference to the object until a) we consume the rest of
     the data in flush(), or b) discard it in a subsequent call to

Solution 2 would be less disruptive to code that depends on the existing
behavior (in non-pathological cases), but I don't like the maintenance
burden of adding yet another thing to keep track of to the decompressor
state. The PyZlib_objdecompress function is complex enough as it is, and
we can expect more bugs like this to creep in the more we cram additional
logic into it. So I'm more in favor of solution 1.
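
Independent of which solution is chosen, calling code can avoid relying
on flush() to pick up leftover input by feeding unconsumed_tail back
into decompress() itself. This is a caller-side sketch only, not either
of the fixes proposed above; the function name and chunk size are
assumptions made for illustration:

```python
import zlib


def decompress_in_chunks(compressed: bytes, max_length: int) -> bytes:
    """Hypothetical helper: never leave unconsumed input for flush()."""
    d = zlib.decompressobj()
    pieces = []
    buf = compressed
    while buf:
        # Produce at most max_length bytes of output per call.
        pieces.append(d.decompress(buf, max_length))
        # Input that did not fit is handed back as a bytes object that
        # the caller keeps alive until the next decompress() call.
        buf = d.unconsumed_tail
    # By now all input has been passed through decompress(), so flush()
    # only needs to return output that is still buffered internally.
    pieces.append(d.flush())
    return b"".join(pieces)


if __name__ == "__main__":
    data = bytes(range(256)) * 50
    print(decompress_in_chunks(zlib.compress(data), 64) == data)
```

Because the loop holds its own reference to every buffer it passes in,
it does not depend on the decompressor keeping the input object alive.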

Any thoughts?

Date                 User          Action  Args
2012-11-05 00:29:24  nadeem.vawda  set     recipients: + nadeem.vawda, serhiy.storchaka
2012-11-05 00:29:24  nadeem.vawda  set     messageid: <>
2012-11-05 00:29:24  nadeem.vawda  link    issue16411 messages
2012-11-05 00:29:23  nadeem.vawda  create