Author mark.dickinson
Recipients mark.dickinson
Date 2009-02-18.22:45:01
Two closely related issues in Python/marshal.c, both involving the writing
and reading of variable-length objects (lists, strings, long integers, ...).

(1) The w_object function in marshal contains many instances of code 
like the following:

else if (PyList_CheckExact(v)) {
	w_byte(TYPE_LIST, p);
	n = PyList_GET_SIZE(v);		/* n is a Py_ssize_t */
	w_long((long)n, p);		/* only 4 bytes are written */
	for (i = 0; i < n; i++) {
		w_object(PyList_GET_ITEM(v, i), p);
	}
}

On a 64-bit platform there's potential loss of information here
either in the cast "(long)n" (if sizeof(long) is 4), or in
w_long itself (if sizeof(long) is 8).  Note that w_long, despite
its name, always writes exactly 4 bytes.
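
The truncation can be illustrated outside of CPython with a minimal sketch.
The names sketch_w_long, sketch_r_long and round_trips are hypothetical
stand-ins, not the functions in marshal.c, and the least-significant-byte-first
order is an assumption made for the illustration:

```c
#include <stdint.h>

/* Hypothetical stand-in for a 4-byte writer: keeps only the low 32 bits
   of x, least significant byte first (byte order is an assumption). */
static void sketch_w_long(int64_t x, unsigned char out[4])
{
    uint64_t u = (uint64_t)x;
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)((u >> (8 * i)) & 0xff);
}

/* Hypothetical stand-in for the matching reader: reassembles the 4 bytes
   and sign-extends the result from 32 bits. */
static int64_t sketch_r_long(const unsigned char in[4])
{
    uint32_t u = 0;
    for (int i = 0; i < 4; i++)
        u |= (uint32_t)in[i] << (8 * i);
    return (u & 0x80000000u) ? (int64_t)u - 4294967296LL : (int64_t)u;
}

/* Returns 1 if n survives a write/read round trip unchanged, else 0. */
static int round_trips(int64_t n)
{
    unsigned char buf[4];
    sketch_w_long(n, buf);
    return sketch_r_long(buf) == n;
}
```

With this sketch, any size outside [-2**31, 2**31) comes back as a different
value, which is the silent corruption described above.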

There should at least be an exception raised here if n is not
in the range [-2**31, 2**31).  This would make marshalling of
large objects an explicit error (rather than silently writing
wrong data).

A more involved fix would allow marshalling of objects of size >= 2**31.  
This would obviously involve changing the marshal format, and would make 
it impossible to marshal a large object on a 64-bit platform and then 
unmarshal it on a 32-bit platform.  The latter may not really be a 
problem, since memory considerations ought to rule that out anyway.

(2) In r_object (and possibly elsewhere) there are corresponding checks 
of the form:

	n = r_long(p);
	if (n < 0 || n > INT_MAX) {
		PyErr_SetString(PyExc_ValueError, "bad marshal data");
		retval = NULL;
		break;
	}

If we allow marshalling of objects with more than 2**31-1 elements, then
these error checks can be relaxed.  (And as a matter of principle,
INT_MAX isn't really right here: an int might be only 16 bits long on
some strange platforms...)
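
One possible shape for a reader-side check that does not depend on the width
of int is sketched below.  SKETCH_SIZE_MAX and valid_marshal_size are
hypothetical names, and the bound shown is the limit of the existing 4-byte
format, not a proposal for the relaxed one:

```c
/* Hypothetical fixed bound: the largest size a signed 4-byte field can
   hold, independent of how wide int happens to be on the platform. */
#define SKETCH_SIZE_MAX 0x7fffffffLL

/* Returns 1 if n is an acceptable size read from the stream, else 0. */
static int valid_marshal_size(long long n)
{
    return n >= 0 && n <= SKETCH_SIZE_MAX;
}
```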
Date                 User            Action  Args
2009-02-18 22:45:06  mark.dickinson  set     recipients: + mark.dickinson
2009-02-18 22:45:06  mark.dickinson  set     messageid: <>
2009-02-18 22:45:03  mark.dickinson  link    issue5308 messages
2009-02-18 22:45:01  mark.dickinson  create