Unpickler failing with PicklingError at frame end on readline due to a broken comparison #67283
If a pickle frame ends at the end of a pickle._Unframer.readline() call, an UnpicklingError("pickle exhausted before end of frame") is raised unconditionally, because the check for whether the frame ended before the line did is faulty. The conditional in question is in pickle._Unframer.readline, line 245 of pickle.py:

    if data[-1] != b'\n':
        raise UnpicklingError(
            "pickle exhausted before end of frame")

This comparison always evaluates to True, even when data ends in a newline. data is a bytes object, so data[-1] is the integer 10 when data ends in a newline. 10 != b'\n' is always True because an int never compares equal to a bytes object, so the UnpicklingError is raised.

The check can be corrected by slicing out a one-byte bytes object:

    if data[-1:] != b'\n':
        raise UnpicklingError(
            "pickle exhausted before end of frame")

Or by comparing against the integer value of b'\n':

    if data[-1] != b'\n'[0]:
        raise UnpicklingError(
            "pickle exhausted before end of frame")
|
Thanks for the report. Do you have actual data that can exhibit the problem? |
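Independent of any pickle data, the type mismatch described in the report can be checked directly in the interpreter. The following is a minimal sketch; the three helper names are hypothetical and exist only to put the faulty check next to the two proposed corrections.

```python
def ends_with_newline_buggy(data: bytes) -> bool:
    # Mirrors the faulty check: data[-1] is an int, so it never equals b"\n".
    return not (data[-1] != b"\n")


def ends_with_newline_slice(data: bytes) -> bool:
    # First proposed fix: a one-byte slice is itself a bytes object.
    return data[-1:] == b"\n"


def ends_with_newline_int(data: bytes) -> bool:
    # Second proposed fix: compare two integers.
    return data[-1] == b"\n"[0]


line = b"42\n"
print(type(line[-1]), line[-1])       # <class 'int'> 10
print(ends_with_newline_buggy(line))  # False, although the line ends in a newline
print(ends_with_newline_slice(line))  # True
print(ends_with_newline_int(line))    # True
```

The slice form also tolerates an empty bytes object, since b""[-1:] is b"" rather than an IndexError. Running the interpreter with the -b option makes the bytes/int comparison in the buggy helper emit a BytesWarning.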
readline() is used only when unpickling the opcodes PERSID, INT, LONG, FLOAT, STRING, UNICODE, INST, GLOBAL, GET and PUT. These opcodes are not used with protocol 4 (all of them except GLOBAL are used only with protocol 0, and GLOBAL is used with protocols <= 3). Frames are used only with protocol 4. So the chance of hitting this bug with real data is very small. But it is not zero: third-party software could, for some reason, construct artificial pickle data that mixes FRAME with protocol 0 opcodes. Artificial example:

    >>> pickletools.dis(b"\x80\x04\x95\x05\x00\x00\x00\x00\x00\x00\x00I42\n.")
        0: \x80 PROTO      4
        2: \x95 FRAME      5
       11: I    INT        42
       15: .    STOP
    highest protocol among opcodes = 4
|
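The artificial payload above can be assembled by hand, which makes the frame layout explicit: a PROTO opcode with the protocol number, a FRAME opcode followed by the frame length as an 8-byte little-endian unsigned integer, then the framed opcodes. This is a sketch only; frame_payload is a hypothetical helper, not part of the pickle module.

```python
import pickle
import struct


def frame_payload(body: bytes) -> bytes:
    """Hypothetical helper: wrap raw opcodes in a protocol 4 header and a single FRAME."""
    header = pickle.PROTO + bytes([4])
    frame = pickle.FRAME + struct.pack("<Q", len(body))
    return header + frame + body


# INT opcode ("I"), its newline-terminated argument, then STOP (".").
payload = frame_payload(b"I42\n.")
assert payload == b"\x80\x04\x95\x05\x00\x00\x00\x00\x00\x00\x00I42\n."

# On an interpreter that contains the fix this is expected to load as 42.
print(pickle.loads(payload))
```

Because the frame length is computed from the body, the same helper can wrap other newline-terminated opcodes (for example GLOBAL) for testing.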
Indeed. In my case the problem was caused by a subclassed Pickler which still used GLOBAL instead of STACK_GLOBAL in protocol 4. My own minimized test case was:

    >>> data = b"\x80\x04\x95\x11\x00\x00\x00\x00\x00\x00\x00cpickle\nPickler\n."
    >>> pickletools.dis(data)
        0: \x80 PROTO      4
        2: \x95 FRAME      17
       11: c    GLOBAL     'pickle Pickler'
       27: .    STOP
    highest protocol among opcodes = 4

Unpickling this data should return pickle.Pickler. |
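A rough way to run the minimized test case against the pure-Python code path, where _Unframer lives, is sketched below. It assumes CPython, where the pure-Python unpickler is reachable under the private name pickle._Unpickler; that name is an implementation detail and may change. load_pure_python is a hypothetical helper that returns the exception instead of raising it, so affected and fixed interpreters can be compared.

```python
import io
import pickle

data = b"\x80\x04\x95\x11\x00\x00\x00\x00\x00\x00\x00cpickle\nPickler\n."


def load_pure_python(payload: bytes):
    """Hypothetical helper: load via the private pure-Python unpickler (assumes CPython)."""
    try:
        return pickle._Unpickler(io.BytesIO(payload)).load()
    except pickle.UnpicklingError as exc:
        # An interpreter with the faulty comparison is expected to end up here.
        return exc


default_result = pickle.loads(data)  # uses the C accelerator when it is available
pure_result = load_pure_python(data)

print(default_result is pickle.Pickler)
print(pure_result)
```

On an interpreter that contains the fix, both results should be pickle.Pickler; on an affected one, pure_result would be the UnpicklingError from the report.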
If there are no objections I'm going to commit the patch. |
New changeset d5e13b74d377 by Serhiy Storchaka in branch '3.4':
New changeset c347c21e5afa by Serhiy Storchaka in branch 'default': |