Title: Unpickler failing with PicklingError at frame end on readline due to a broken comparison
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.5, Python 3.4
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: CensoredUsername, pitrou, python-dev, serhiy.storchaka
Priority: normal Keywords: needs review, patch

Created on 2014-12-20 23:24 by CensoredUsername, last changed 2015-01-26 10:27 by serhiy.storchaka. This issue is now closed.

Files
pickle_frame_readline.patch (uploaded by serhiy.storchaka, 2014-12-21 08:45)
Messages (6)
msg232984 - Author: CensoredUsername (CensoredUsername) Date: 2014-12-20 23:24
If a pickle frame ends exactly where a pickle._Unframer.readline() call ends, an UnpicklingError("pickle exhausted before end of frame") is raised unconditionally, because the check for whether the frame ended before the line did is faulty.

It concerns this conditional in pickle._Unframer.readline, line 245 in

if data[-1] != b'\n':
    raise UnpicklingError(
        "pickle exhausted before end of frame")

This comparison always evaluates to True, even if data ends in a newline. Because data is a bytes object, data[-1] is an int, and it evaluates to 10 when data ends in a newline. 10 != b'\n' is always True because of the type mismatch, so the UnpicklingError is raised.
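The type mismatch can be checked in isolation. The following is a minimal sketch rather than code from pickle.py; the variable names are illustrative only.

```python
# A line as readline() would return it, ending in a newline.
data = b"42\n"

last_index = data[-1]    # indexing a bytes object yields an int
last_slice = data[-1:]   # slicing a bytes object yields a bytes object

# The comparison from the report: an int is never equal to a bytes object.
buggy_check = last_index != b"\n"

# Comparing like with like gives the intended answer.
slice_check = last_slice != b"\n"      # bytes vs. bytes
index_check = last_index != b"\n"[0]   # int vs. int

print(type(last_index).__name__, last_index)   # int 10
print(type(last_slice).__name__, last_slice)   # bytes b'\n'
print(buggy_check, slice_check, index_check)   # True False False
```

Only the first comparison reports a mismatch, even though the data does end in a newline.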

This error can be corrected by slicing out an actual one-byte bytes object:

if data[-1:] != b'\n':
    raise UnpicklingError(
        "pickle exhausted before end of frame")

Or by comparing against the numeric representation of b'\n':

if data[-1] != b'\n'[0]:
    raise UnpicklingError(
        "pickle exhausted before end of frame")
msg232988 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2014-12-21 02:32
Thanks for the report. Do you have actual data that can exhibit the problem?
msg232991 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2014-12-21 07:57
readline() is used only when unpickling the opcodes PERSID, INT, LONG, FLOAT, STRING, UNICODE, INST, GLOBAL, GET and PUT. These opcodes are not used with protocol 4 (all of them except GLOBAL are used only with protocol 0, and GLOBAL is used with protocols <= 3). Frames are used only with protocol 4. So there is a very small chance of meeting this bug in real data. But it is not zero: artificial pickled data that mixes FRAME with protocol 0 opcodes can be constructed by third-party software for some reason.

Artificial example:

>>> pickletools.dis(b"\x80\x04\x95\x05\x00\x00\x00\x00\x00\x00\x00I42\n.")
    0: \x80 PROTO      4
    2: \x95 FRAME      5
   11: I    INT        42
   15: .    STOP
highest protocol among opcodes = 4
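To see how such data behaves when loaded, the artificial pickle can be fed to both unpickler implementations. This is a sketch under one assumption: pickle._Unpickler is the private name under which the pure-Python class is kept in Lib/pickle.py, and it is not a documented API. On an interpreter that includes the fix, both calls are expected to return 42; on an affected version, the report above says the pure-Python call raises UnpicklingError.

```python
import io
import pickle

# PROTO 4, FRAME of length 5, then the protocol 0 opcode INT and STOP.
data = b"\x80\x04\x95\x05\x00\x00\x00\x00\x00\x00\x00I42\n."

# pickle.loads normally uses the C accelerator when it is available.
result_default = pickle.loads(data)

# pickle._Unpickler (private, assumed here) is the pure-Python
# implementation, which is the code path that goes through
# _Unframer.readline().
result_python = pickle._Unpickler(io.BytesIO(data)).load()

print(result_default, result_python)
```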
msg232996 - Author: CensoredUsername (CensoredUsername) Date: 2014-12-21 10:38
Indeed. In my case the problem was caused by a subclassed Pickler which still used GLOBAL instead of STACK_GLOBAL in protocol 4.

My own minimized test case was:

>>> data = b"\x80\x04\x95\x11\x00\x00\x00\x00\x00\x00\x00cpickle\nPickler\n."
>>> pickletools.dis(data)
    0: \x80 PROTO      4
    2: \x95 FRAME      17
   11: c    GLOBAL     'pickle Pickler'
   27: .    STOP
highest protocol among opcodes = 4

which should return pickle.Pickler
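For anyone building similar test data by hand, the FRAME header is the opcode byte followed by the payload length as an 8-byte little-endian unsigned integer. The sketch below rebuilds the test case that way and loads it with the pure-Python unpickler. The names body and framed are illustrative, and pickle._Unpickler is a private name that is assumed to be available.

```python
import io
import pickle
import struct

# Frame payload: GLOBAL 'pickle Pickler' followed by STOP.
body = b"cpickle\nPickler\n."

# PROTO 4, then FRAME (0x95) with the payload length packed as "<Q".
framed = b"\x80\x04" + b"\x95" + struct.pack("<Q", len(body)) + body

# Load with the pure-Python unpickler (private name, assumed).
obj = pickle._Unpickler(io.BytesIO(framed)).load()

print(len(body))   # 17
print(obj is pickle.Pickler)
```

On an interpreter with the fix applied, obj is expected to be pickle.Pickler.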
msg234234 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2015-01-18 09:57
If there are no objections I'm going to commit the patch.
msg234724 - Author: Roundup Robot (python-dev) Date: 2015-01-26 08:38
New changeset d5e13b74d377 by Serhiy Storchaka in branch '3.4':
Issue #23094: Fixed readline with frames in Python implementation of pickle.

New changeset c347c21e5afa by Serhiy Storchaka in branch 'default':
Issue #23094: Fixed readline with frames in Python implementation of pickle.
Date                 User              Action  Args
2015-01-26 10:27:29  serhiy.storchaka  set     status: open -> closed
                                               resolution: fixed
                                               stage: patch review -> resolved
2015-01-26 08:38:48  python-dev        set     nosy: + python-dev
                                               messages: + msg234724
2015-01-18 09:57:55  serhiy.storchaka  set     assignee: serhiy.storchaka
                                               messages: + msg234234
2014-12-30 09:34:24  serhiy.storchaka  set     keywords: + needs review
                                               stage: needs patch -> patch review
2014-12-21 10:38:05  CensoredUsername  set     messages: + msg232996
2014-12-21 08:45:51  serhiy.storchaka  set     files: + pickle_frame_readline.patch
                                               keywords: + patch
2014-12-21 07:57:26  serhiy.storchaka  set     messages: + msg232991
                                               components: + Library (Lib)
                                               stage: needs patch
2014-12-21 02:32:38  pitrou            set     nosy: + serhiy.storchaka, pitrou
                                               messages: + msg232988
                                               versions: + Python 3.5
2014-12-20 23:24:32  CensoredUsername  create