This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: marshal.dumps represents the same list object differently
Type: behavior Stage: resolved
Components: C API Versions: Python 3.11
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Dreeseaw, eric.smith, rhettinger
Priority: normal Keywords:

Created on 2022-03-02 14:40 by Dreeseaw, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (5)
msg414362 - Author: William Dreese (Dreeseaw) Date: 2022-03-02 14:40
Hello,

I've been working with the marshal package and came across this issue (I think) -

Python 3.9.10 Interpreter
(Same output in 3.11.0a5+ (heads/master:b6b711a1aa) on darwin)
>>> import marshal
>>> var_example = [(1,2,3),(4,5,6)]
>>> var_marshaled = marshal.dumps(var_example)
>>> raw_marshaled = marshal.dumps([(1,2,3),(4,5,6)])
>>> def pp(to_print):
...     for byt in to_print:
...         print(byt)
...
>>> pp(var_marshaled)
219
2
0
0
0
41
3
233
1
0
0
0
233
2
0
0
0
233
3
0
0
0
169
3
233
4
0
0
0
233
5
0
0
0
233
6
0
0
0
91
2
0
0
0
41
3
233
1
0
0
0
233
2
0
0
0
233
3
0
0
0
169
3
233
4
0
0
0
233
5
0
0
0
233
6
0
0
0
>>> pp(raw_marshaled)
219
2
0
0
0
169
3
233
1
0
0
0
233
2
0
0
0
233
3
0
0
0
169
3
233
4
0
0
0
233
5
0
0
0
233
6
0
0
0
91
2
0
0
0
169
3
233
1
0
0
0
233
2
0
0
0
233
3
0
0
0
169
3
233
4
0
0
0
233
5
0
0
0
233
6
0
0
0

The difference above lies in the byte representation of the tuple type (41 in the variable version and 169 in the raw version). Is this intended behavior?
msg414363 - Author: William Dreese (Dreeseaw) Date: 2022-03-02 14:46
I made a bad copy-and-paste error in the terminal output above; I apologize. The corrected output is:

>>> pp(var_marshaled)
91
2
0
0
0
41
3
233
1
0
0
0
233
2
0
0
0
233
3
0
0
0
41
3
233
4
0
0
0
233
5
0
0
0
233
6
0
0
0

>>> pp(raw_marshaled)
91
2
0
0
0
169
3
233
1
0
0
0
233
2
0
0
0
233
3
0
0
0
169
3
233
4
0
0
0
233
5
0
0
0
233
6
0
0
0
msg414366 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2022-03-02 15:47
The difference is the FLAG_REF which is set to 128 (0x80).

>>> import marshal
>>> var_example = [(1,2,3),(4,5,6)]
>>> vm = marshal.dumps(var_example)
>>> rm = marshal.dumps([(1,2,3),(4,5,6)])
>>> [v ^ r for v, r in zip(vm, rm)]
[128, 0, 0, 0, 0, 128, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 128, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Whether a flag for possible reuse is generated depends on the reference count of an object.

When the list is passed in as a variable, its reference count is higher than when it is passed as a literal.

That flag tells marshal whether to generate an index entry. Whether that occurs re
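
As a hedged illustration of the point above: if byte-for-byte stable output matters, one option is to request an older marshal format version. The sketch below assumes that format versions below 3 do not emit object references at all (the marshal documentation describes version 3 as the one that adds object instancing), so FLAG_REF should never be set. The helper name stable_dumps is hypothetical and not part of the marshal module.

```python
import marshal


def stable_dumps(obj):
    """Hypothetical helper: marshal with format version 2.

    Assumption: versions below 3 never write object references, so the
    output does not depend on the reference counts of the objects.
    """
    return marshal.dumps(obj, 2)


var_example = [(1, 2, 3), (4, 5, 6)]

# Same data, once through a variable and once as a literal.
from_variable = stable_dumps(var_example)
from_literal = stable_dumps([(1, 2, 3), (4, 5, 6)])

print(from_variable == from_literal)
# Either way, the data round-trips to an equal object.
print(marshal.loads(from_variable) == var_example)
```

With the default version, both byte strings still load back to equal lists; only the bytes themselves can differ.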
msg414367 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2022-03-02 15:49
From https://github.com/python/cpython/blob/main/Python/marshal.c:

41 is:

#define TYPE_SMALL_TUPLE        ')'

The difference between 41 and 169 is 128:

#define FLAG_REF                '\x80' /* with a type, add obj to index */

So the difference is the FLAG_REF bit being set. I'm not sure if that helps you or not.

In any event, this doesn't look like a bug. You might want to ask on python-list or Stack Overflow for more help.
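
The arithmetic above can be checked with a small sketch that splits a marshal type byte into its type code and the FLAG_REF bit. The function name split_type_byte is hypothetical; the 0x80 constant mirrors the FLAG_REF define quoted from marshal.c.

```python
FLAG_REF = 0x80  # mirrors the FLAG_REF #define quoted from Python/marshal.c


def split_type_byte(byte):
    """Hypothetical helper: return (type_code, has_flag_ref) for a type byte."""
    type_code = chr(byte & ~FLAG_REF)  # clear the high bit to get the type
    has_flag_ref = bool(byte & FLAG_REF)  # the high bit is the reference flag
    return type_code, has_flag_ref


# Type bytes taken from the outputs posted in this issue.
for byte in (41, 169, 91, 233):
    print(byte, split_type_byte(byte))
```

Here 41 and 169 both decode to the ')' type code, differing only in whether the flag is set.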
msg414369 - Author: William Dreese (Dreeseaw) Date: 2022-03-02 15:54
You are both correct: this is not a bug, and it is the intended behavior.

> The difference between 41 and 169 is 128:

This realization helps a ton. Thanks.
History
Date User Action Args
2022-04-11 14:59:56 admin set github: 91056
2022-03-02 15:55:38 Dreeseaw set status: open -> closed; resolution: not a bug; stage: resolved
2022-03-02 15:54:09 Dreeseaw set messages: + msg414369
2022-03-02 15:49:34 eric.smith set nosy: + eric.smith; messages: + msg414367
2022-03-02 15:47:07 rhettinger set nosy: + rhettinger; messages: + msg414366
2022-03-02 14:46:16 Dreeseaw set messages: + msg414363
2022-03-02 14:40:17 Dreeseaw create