This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: zlib.Decompress.decompress() retains pointer to input buffer without acquiring reference to it
Type: behavior Stage: resolved
Components: Versions: Python 3.2, Python 3.3, Python 3.4, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: nadeem.vawda, python-dev, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2012-11-05 00:29 by nadeem.vawda, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
zlib_stale_ptr.py nadeem.vawda, 2012-11-05 00:29
issue16411.patch serhiy.storchaka, 2012-11-05 11:14 review
Messages (5)
msg174853 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-11-05 00:29
When calling zlib.Decompress.decompress() with a max_length argument,
if the input data is not full consumed, the next_in pointer in the
z_stream struct are left pointing into the data object, but the
decompressor does not hold a reference to this object. This same
pointer is reused (perhaps unintentionally) if flush() is called
without calling decompress() again.

If the data object gets deallocated between the calls to decompress()
and to flush(), zlib will then try to access this deallocated memory,
and most likely return bogus output (or segfault). See the attached
script for a demonstration.

I see two potential solutions:

  1. Set avail_in to zero in flush(), so that it does not try to use
     leftover data (or whatever is else where that data used to be).

  2. Have decompress() check if there is leftover data, and if so,
     save a reference to the object until a) we consume the rest of
     the data in flush(), or b) discard it in a subsequent call to
     decompress().

Solution 2 would be less disruptive to code that depends on the existing
behavior (in non-pathological cases), but I'm don't like the maintenance
burden of adding yet another thing to keep track of to the decompressor
state. The PyZlib_objdecompress function is complex enough as it is, and
we can expect more bugs like this to creep in the more we cram additional
logic into it. So I'm more in favor of solution 1.

Any thoughts?
msg174889 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-05 11:14
The decompressor does not hold a reference to the data object, but it holds a reference to the data. It's the unconsumed_tail attribute.

The patch is simple.
msg175310 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-11-11 02:20
New changeset c3828831861c by Nadeem Vawda in branch '2.7':
Issue #16411: Fix a bug where zlib.decompressobj().flush() might try to access previously-freed memory.
http://hg.python.org/cpython/rev/c3828831861c

New changeset a1db815d0829 by Nadeem Vawda in branch '3.2':
Issue #16411: Fix a bug where zlib.decompressobj().flush() might try to access previously-freed memory.
http://hg.python.org/cpython/rev/a1db815d0829

New changeset a7934fe2927e by Nadeem Vawda in branch '3.3':
Issue #16411: Fix a bug where zlib.decompressobj().flush() might try to access previously-freed memory.
http://hg.python.org/cpython/rev/a7934fe2927e

New changeset d63c751e9f01 by Nadeem Vawda in branch 'default':
Issue #16411: Fix a bug where zlib.decompressobj().flush() might try to access previously-freed memory.
http://hg.python.org/cpython/rev/d63c751e9f01
msg175311 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-11-11 02:28
New changeset be40a10d553a by Nadeem Vawda in branch '2.7':
Fix typo in backporting fix of issue #16411 to 2.7.
http://hg.python.org/cpython/rev/be40a10d553a
msg175312 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-11-11 02:37
Ah, that's much nicer than either of my ideas. Patch committed. Thanks!
History
Date User Action Args
2022-04-11 14:57:38adminsetgithub: 60615
2012-11-11 02:37:51nadeem.vawdasetstatus: open -> closed
resolution: fixed
messages: + msg175312

stage: patch review -> resolved
2012-11-11 02:28:41python-devsetmessages: + msg175311
2012-11-11 02:20:23python-devsetnosy: + python-dev
messages: + msg175310
2012-11-05 11:14:45serhiy.storchakasetstage: needs patch -> patch review
2012-11-05 11:14:01serhiy.storchakasetfiles: + issue16411.patch
keywords: + patch
messages: + msg174889
2012-11-05 00:29:24nadeem.vawdacreate