Issue 12291: file written using marshal in 3.2 can be read by 2.7, but not 3.2 or 3.3

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/56500

classification

Title:	file written using marshal in 3.2 can be read by 2.7, but not 3.2 or 3.3
Type:	behavior	Stage:	resolved
Components:		Versions:	Python 3.2, Python 3.3

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:		Nosy List:	Arfrever, amaury.forgeotdarc, benjamin.peterson, djc, georg.brandl, grubert, pitrou, python-dev, vinay.sajip, vstinner
Priority:	release blocker	Keywords:	needs review, patch

Created on 2011-06-09 09:10 by vinay.sajip, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
data.bin	vinay.sajip, 2011-06-09 09:10	Test data - objects written using marshal in Python 3.2
tried_renaming.diff	vinay.sajip, 2011-06-20 06:57		review
ab1c38ffb8d4.diff	vinay.sajip, 2011-06-24 14:28		review
multiple_dump_load_test.patch	grubert, 2011-06-25 16:17	failing test to test_marshal.py	review
multiple_dump_load_read_write_test.patch	grubert, 2011-06-26 08:23	multiple dump write, load read	review
0feab4e7b27f.diff	vinay.sajip, 2011-06-26 19:49	Merged Engelbert's test patches into fix branch.	review
from_antoine_comments.diff	vinay.sajip, 2011-07-01 00:16	Updates in response to Antoine's review comments.	review
update_to_test_importlib.diff	vinay.sajip, 2011-07-01 01:12	Change to test_importlib added	review

Repositories containing patches
http://hg.python.org/sandbox/vsajip#fix12291

Messages (40)
msg137943 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-09 09:10
The attached file 'data.bin' was written using Python 3.2. It can be read by Python 2.7, but in 3.2 and 3.3, after the first object is read, the file pointer is positioned at EOF, causing an error on subsequent reads. A simple test script 'marshtest.py' is below: import marshal import sys f = open('data.bin', 'rb') t = marshal.load(f) print('Python version:', sys.version) print('Object just loaded was:\n%r' % (t,)) print('File position is now at %d' % f.tell()) t = marshal.load(f) print('Object just loaded was:\n%r' % (t,)) print('File position is now at %d' % f.tell()) Results of running it under various Python versions: vinay@eta-natty:~/projects/scratch$ python marshtest.py ('Python version:', '2.7.1+ (r271:86832, Apr 11 2011, 18:05:24) \n[GCC 4.5.2]') Object just loaded was: (u'text', u'alfa', 202, 1.0, '\x00\x00\x00\x01]q\x00K\x03a') File position is now at 52 Object just loaded was: (u'text', u'alfa', 212, 1.0, '\x00\x00\x00\x01]q\x00K\x03a') File position is now at 104 vinay@eta-natty:~/projects/scratch$ python3.2 marshtest.py Python version: 3.2 (r32:88445, Mar 25 2011, 19:28:28) [GCC 4.5.2] Object just loaded was: ('text', 'alfa', 202, 1.0, b'\x00\x00\x00\x01]q\x00K\x03a') File position is now at 53617 Traceback (most recent call last): File "marshtest.py", line 9, in <module> t = marshal.load(f) EOFError: EOF read where object expected vinay@eta-natty:~/projects/scratch$ python3.3 marshtest.py Python version: 3.3a0 (default:8d4d87dd73ae, Apr 2 2011, 14:25:31) [GCC 4.5.2] Object just loaded was: ('text', 'alfa', 202, 1.0, b'\x00\x00\x00\x01]q\x00K\x03a') File position is now at 53617 Traceback (most recent call last): File "marshtest.py", line 9, in <module> t = marshal.load(f) EOFError: EOF read where object expected Note the size of the file is 53617 bytes. vinay@eta-natty:~/projects/scratch$ ls -l data.bin -rw------- 1 vinay vinay 53617 2011-06-09 09:33 data.bin
msg137944 - (view)	Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *	Date: 2011-06-09 09:31
Sadly, marshal.load() looks broken: - The function starts with the comment /* XXX Quick hack -- need to do this differently */ - It starts by calling f.read() which consumes the whole file (and explains the issue reported here) - The code was probably converted too quickly: if (PyBytes_Check(data)) { ... else if (PyBytes_Check(data)) {
msg137966 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-09 13:49
I've added a patch which I think fixes marshal.load(), and would be grateful for a review - it's my first C patch for Python. The test_marshal test passes, though I haven't yet added a test for the failing scenario.
msg137969 - (view)	Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *	Date: 2011-06-09 13:55
- Please replace tabs characters by space - "//" comments are not accepted by some picky C89 compilers Also, calling f.read(1) for each character may have a large performance impact. I don't know how this can be solved though.
msg137973 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2011-06-09 14:35
It should probably be buffered at 512 bytes or some other reasonable number.
msg137987 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2011-06-09 15:24
Sounds blocker-ish enough to me.
msg138012 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-09 16:36
@Amaury: Re formatting and comment style - I will address this. The diff I posted is to the default branch rather than 3.2, though it won't change the main substance of the patch. Re. performance and buffering: I agree this could have an impact (although strings are not actually read a byte at a time), but it's not easy to address because even if you e.g. wrapped the incoming stream with a BufferedReader, you have nowhere to keep the resulting instance in between multiple calls to marshal.load() from external code. For best performance ISTM you'd have to implement your own buffering - but surely this is to be avoided? It can't be the only place where you need good I/O performance from file-like objects. A quick scan of the pickle module didn't show me anything I could easily use, and a key difference from pickle is that you don't have a corresponding "Unmarshaller" object which could hold the state that an Unpickler does, and which would be needed to hold the buffering state. Of course one could implement limited buffering, i.e. for just the duration of a single marshal.load() call, but perhaps this could be done as a later step after evaluating the actual performance of this implementation. Having something that works correctly, albeit slowly, is better than having the current situation ...
msg138024 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-09 19:50
Another thought about buffering, which might be a bit of a show-stopper: if in a particular load() call we read ahead for buffering purposes, we've altered the underlying stream (which might not be seekable to allow position restoring). The next call to load() will be likely to fail, since the underlying stream has been positioned further than expected by the earlier load() call, and the buffer state has been lost. Not sure how easy it is to get around this ...
msg138070 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-10 09:59
Attached is an improved patch: 1. It's against the 3.2 branch, rather than default. 2. It has an added multiple load test in test_marshal.py. 3. There's more error checking for an EOF condition. 4. I've removed tab chars and used /* C89 comments */. Of course it's not as fast as the 2.x version, since that had a limitation on only being able to load from file objects, and a fast path for PyFile using stdio calls directly. Hard to compete with that! Suggestions for improvements gratefully received :-)
msg138160 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-11 15:24
Patch is now in my public sandbox on hg.python.org.
msg138162 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-11 15:38
For some reason, "Create Patch" is failing with a [Errno 2] No such file or directory: '/home/roundup/trackers/tracker/cpython/Doc/Makefile' I've logged an issue on the meta tracker: http://psf.upfronthosting.co.za/roundup/meta/issue405
msg138571 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2011-06-18 08:15
Any reviewers?
msg138585 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2011-06-18 14:42
Why can't you just call fileno() on the file object?
msg138617 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-18 22:41
Sorry I'm being dense, but which file object do you mean?
msg138618 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2011-06-18 22:59
2011/6/18 Vinay Sajip <report@bugs.python.org>: > > Vinay Sajip <vinay_sajip@yahoo.co.uk> added the comment: > > Sorry I'm being dense, but which file object do you mean? The python file object.
msg138623 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-19 00:42
> Benjamin Peterson <benjamin@python.org> added the comment: > > Vinay Sajip <vinay_sajip@yahoo.co.uk> added the comment: > > > > Sorry I'm being dense, but which file object do you mean? > > The python file object. Do you mean special-case handling of the circumstance when the file-like object being marshalled from is actually a file? The existing code paths (from when marshal.load expected only to work with Python file objects) use FILE *, and these are used by other code to read magic numbers etc. I believe. Or perhaps I'm still misunderstanding what you're getting at.
msg138624 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2011-06-19 01:02
2011/6/18 Vinay Sajip <report@bugs.python.org>: > > Vinay Sajip <vinay_sajip@yahoo.co.uk> added the comment: > >> Benjamin Peterson <benjamin@python.org> added the comment: >> > Vinay Sajip <vinay_sajip@yahoo.co.uk> added the comment: >> > >> > Sorry I'm being dense, but which file object do you mean? >> >> The python file object. > > Do you mean special-case handling of the circumstance when the file-like object > being marshalled from is actually a file? The existing code paths (from when > marshal.load expected only to work with Python file objects) use FILE *, and > these are used by other code to read magic numbers etc. I believe. Or perhaps > I'm still misunderstanding what you're getting at. When python uses dump() or load() with a file object, you can call fileno() (and then fopen?) to use it like a C-file object.
msg138625 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-19 11:39
The problem with calling fileno() and fdopen() is that you bypass the buffering information held in BufferedIOReader. The first call works, but the FILE * pointer is now positioned at 4K, rather than just past the end of the object just read. The next call fails. I verified that calling f.tell() after marshal.load(f) returns 4096, rather than just the size of the object read by the load(). Just to be clear, here's what I did in marshal_load: int is_file = 0; int fd; data = PyObject_CallMethod(f, "fileno", ""); if (data == NULL) PyErr_Clear(); else { fd = PyLong_AsLong(data); Py_DECREF(data); is_file = 1; } if (is_file) { rf.readable = NULL; rf.fp = fdopen(fd, "rb"); } else { /* what I was doing before to set up rf / } / and on to the read_object call */
msg138631 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2011-06-19 14:54
2011/6/19 Vinay Sajip <report@bugs.python.org>: > > Vinay Sajip <vinay_sajip@yahoo.co.uk> added the comment: > > The problem with calling fileno() and fdopen() is that you bypass the buffering information held in BufferedIOReader. The first call works, but the FILE * pointer is now positioned at 4K, rather than just past the end of the object just read. The next call fails. But presumably once you have the fd, you can llseek().
msg138636 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-19 15:43
This seems a bit hacky, and I'm not sure how reliable it is. I added this after the read_object call: if (is_file) { PyObject * newpos; int cp, np; cp = ftell(rf.fp); newpos = PyObject_CallMethod(f, "seek", "ii", cp, SEEK_SET); assert(newpos != NULL); np = PyLong_AsLong(newpos); Py_DECREF(newpos); assert(cp == np); } When I run the code lots of times, I sometimes get assertion failures at the assert(newpos != NULL) line. It's not an EOF condition thing, necessarily: I do get expected behaviour at least sometimes when seeking to the end of file.
msg138637 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2011-06-19 15:45
2011/6/19 Vinay Sajip <report@bugs.python.org>: > > Vinay Sajip <vinay_sajip@yahoo.co.uk> added the comment: > > This seems a bit hacky, and I'm not sure how reliable it is. I added this after the read_object call: > > if (is_file) { > PyObject * newpos; > int cp, np; > > cp = ftell(rf.fp); > newpos = PyObject_CallMethod(f, "seek", "ii", cp, SEEK_SET); Why not 0? > assert(newpos != NULL); > np = PyLong_AsLong(newpos); > Py_DECREF(newpos); > assert(cp == np); > } > > When I run the code lots of times, I sometimes get assertion failures at the > > assert(newpos != NULL) That's because the call is failing. Why? > > line. It's not an EOF condition thing, necessarily: I do get expected behaviour at least sometimes when seeking to the end of file. >
msg138661 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-19 20:48
> Benjamin Peterson <benjamin@python.org> added the comment: > > assert(newpos != NULL) > > That's because the call is failing. Why? > It's seemingly because the Python code did a seek (in Python) which was not communicated to the FILE object; after reading all objects from the file, a seek was done to the beginning to start reading again. On the previous call, the FILE would have been at EOF (at a C level). So it seems as if we have to call ftell on the Python object at the beginning of the read, and do an fseek to sync the positions. It's doable, of course, so it's the next thing I'll try, but what I'm worried about is the thing turning into a bit of a rabbit-hole. I haven't looked at the io implementation, but for this sort of synchronisation, it would be better if when based on a FILE object, the io implementation delegated all its operations to the FILE object - but it doesn't look like that's the case, and I'm uncomfortable that there may be some undesirable consequences of that.
msg138665 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-19 21:10
@Benjamin: I missed commenting on your "Why not 0?", but here's the reasoning: one can't assume that the file only contains one object to be read, at the beginning of the file. It may be that some data is being written to file using marshal.dump, interspersed with other data, and then the stream is being read in at a later time, with marshal.load called to load a previously saved object. In that scenario, why would 0 be always the correct offset value to pass to fseek? I am synchronising the Python object with the FILE *, but not the other way around - in the failing case I mentioned, external Python code has positioned the io object to zero, but that of course will be overwritten to point the io object back to where the FILE object is positioned. Perhaps I just need to go the other way ... I'm not sure how reliable a solution will be which tries to work around there apparently being two buffering implementations - at the io level and at the FILE level, and trying to keep them synced in a seemingly ad hoc fashion. It's perfectly possible that I don't know what I'm talking about, so I'd welcome an improved patch using the ideas you've suggested. It's been a very long time since I wrote C stdio code regularly ;-)
msg138669 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2011-06-19 22:10
I think you're right about playing with the bare fd being too fragile. I think a simpler solution is to read say 1024 bytes at a time and buffer it internally.
msg138672 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-19 23:04
> Benjamin Peterson <benjamin@python.org> added the comment: > > I think you're right about playing with the bare fd being too fragile. I think >a simpler solution is to read say 1024 bytes at a time and buffer it >internally. Doesn't this suffer from a similar problem? Namely, external Python code expecting the stream pointer to always be just past the object just read. See my earlier comments pointing out that there's nowhere to store the buffer state between successive calls to marshal.load. And the synchronising can be a problem to achieve with non-seekable streams (including, but not limited to, sockets). I'm not trying to be difficult. No, really! :-)
msg139095 - (view)	Author: engelbert gruber (grubert) *	Date: 2011-06-25 16:17
patch to test_marshal.py that obviously fails in current implementation. IMHO is file might not be seekable one can not cache so maybe do not do it.
msg139151 - (view)	Author: engelbert gruber (grubert) *	Date: 2011-06-26 08:23
add interleaved writing to the same file (which might happen or not?) the tests pass when vinay's marshal.c is applied. sorry for another patch
msg139204 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-26 19:53
I've incorporated Engelbert's two patches into test_marshal.py in my sandbox fix branch. I coalesced them into a single additional test in the existing BugsTestCase.
msg139247 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-06-27 08:35
Le Sun, 26 Jun 2011 19:49:03 +0000, Vinay Sajip <report@bugs.python.org> a écrit : > > Added file: http://bugs.python.org/file22487/0feab4e7b27f.diff Just a nit, could you give descriptive file names to your patches? Hex numbers quickly get confusing.
msg139262 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-27 12:41
> Just a nit, could you give descriptive file names to your patches? > Hex numbers quickly get confusing. Ok - I was under the impression that those names were generated automatically from the changeset hash, and that changing the name arbitrarily would break something. Is it not better/sufficient if I just update the description?
msg139508 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-06-30 18:22
I think the question is: will the slowdown apply to module import, or only to explicit uses of the marshal module? If the latter, then I think we can live with it - we discourage using marshal as a general-purpose serialization scheme anyway. As for the patch: - why do the tests have to carry a large chunk of binary data? if some data is needed, at least it would be nice if it were kept small - not sure either why it needs zlib and base64; just use the repr() of the bytestring - "assertEqual = self.assertEqual" is more of a nuisance than anything else; you're making the code more complicated to read just to save a few keystrokes; same for "assertIsInstance = self.assertIsInstance" - can you add a comment next to the fields you're adding to the marshal C struct? - if the "r_byte" macro isn't used anymore, it should be removed, not commented out - in r_string(), what happens if PyBytes_Check(data) is false? there should probably be a TypeError of some sort - in r_string() again, if data is NULL after the read() call, the exception shouldn't be shadowed by an EOFError - why do you call read(1) at the beginning? it seems to make the code more complicated for no useful purpose
msg139510 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-06-30 19:12
> Antoine Pitrou <pitrou@free.fr> added the comment: > > I think the question is: will the slowdown apply to module import, or only to >explicit uses of the marshal module? If the latter, then I think we can live >with it - we discourage using marshal as a general-purpose serialization scheme >anyway. Thanks for reviewing the patch. As to your point - agreed, and as the marshal code is completely broken now, something that works is an improvement, even if it's slower than optimal. If that turns out to be a problem in practice, it can be fixed. > As for the patch: > - why do the tests have to carry a large chunk of binary data? if some data is >needed, at least it would be nice if it were kept small Agreed, I just used the initial file where I had the problem - didn't know where the problem was, initially. Will look at reducing this. > - not sure either why it needs zlib and base64; just use the repr() of the >bytestring The original data was 53K, so just the repr of the bytestring of the raw data is around 150K. If I zip the data, it's about 4K, but the repr of that data is around 12K. The base64/zlibbed version is 5K. This will be less of an issue when the test data size is reduced. > - "assertEqual = self.assertEqual" is more of a nuisance than anything else; >you're making the code more complicated to read just to save a few keystrokes; >same for "assertIsInstance = self.assertIsInstance" I'm not sure how it's more complicated or harder to understand. I didn't do it just to save keystrokes when writing, it's also to avoid noise when reading. IMO this is a question of personal taste, or is this approach proscribed somewhere? I've certainly seen it in other Python code. > - can you add a comment next to the fields you're adding to the marshal C >struct? Yes, will do. > - if the "r_byte" macro isn't used anymore, it should be removed, not >commented out The commenting out is a temporary measure, the comment will be removed before commit. > - in r_string(), what happens if PyBytes_Check(data) is false? there should >probably be a TypeError of some sort There is a PyBytes_Check in marshal_load which raises a TypeError if it fails. The PyBytes_Check in r_string is only called in the path from marshal_load (because p->readable is only non-NULL in that path) - so for the failure to occur one would need an initial read to deliver bytes, and a subsequent read to deliver non-bytes. Still, I have no problem adding a TypeError raising in r_string if PyBytes_Check fails. > - in r_string() again, if data is NULL after the read() call, the exception >shouldn't be shadowed by an EOFError Good point, I aim to fix this by changing the lower condition to if (!PyErr_Occurred() && (read < n)) { ... } which should be sufficient - do you agree? > - why do you call read(1) at the beginning? it seems to make the code more >complicated for no useful purpose I gave the reasoning in the comment just above the read(1). I agree it makes it a little more complicated, but not onerously so - and the approach fails fast as well as allowing operation with any stream, not just random-seekable ones.
msg139513 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-06-30 19:26
> > - "assertEqual = self.assertEqual" is more of a nuisance than anything else; > >you're making the code more complicated to read just to save a few keystrokes; > >same for "assertIsInstance = self.assertIsInstance" > > I'm not sure how it's more complicated or harder to understand. I didn't do it > just to save keystrokes when writing, it's also to avoid noise when reading. IMO > this is a question of personal taste, or is this approach proscribed somewhere? It's not proscribed, but trying to remove the "self." because it's supposed to be more readable is a bit of a strange thing to do. Also, people reading the test suite should be accustomed to "self.assertEqual" anyway, so there's no point trying to hide it. > > - in r_string(), what happens if PyBytes_Check(data) is false? there should > >probably be a TypeError of some sort > > There is a PyBytes_Check in marshal_load which raises a TypeError if it fails. > The PyBytes_Check in r_string is only called in the path from marshal_load > (because p->readable is only non-NULL in that path) - so for the failure to > occur one would need an initial read to deliver bytes, and a subsequent read to > deliver non-bytes. Still, I have no problem adding a TypeError raising in > r_string if PyBytes_Check fails. Error checking can't just be probabilistic. Perhaps there's a bug in the file-like object; or perhaps it is a non-blocking IO object and read() will return None at times. > > - why do you call read(1) at the beginning? it seems to make the code more > >complicated for no useful purpose > > I gave the reasoning in the comment just above the read(1). I agree it makes it > a little more complicated, but not onerously so - and the approach fails fast as > well as allowing operation with any stream, not just random-seekable ones. Well, it wouldn't fail any slower if you didn't do it, since you need to call read() very soon anyway (presumably as part of the same call to marshal.load()). Failing "fast" doesn't seem to bring anything here. My vote is for removing the complication.
msg139537 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-07-01 00:35
> Antoine Pitrou <pitrou@free.fr> added the comment: > It's not proscribed, but trying to remove the "self." because it's > supposed to be more readable is a bit of a strange thing to do. > Also, people reading the test suite should be accustomed to > "self.assertEqual" anyway, so there's no point trying to hide it. It wasn't particularly about self - I'm not against it. Anyway, it's not a big deal for me, so I've added the selves back :-) > Error checking can't just be probabilistic. Perhaps there's a bug in the > file-like object; or perhaps it is a non-blocking IO object and read() > will return None at times. You're right, so I've raised a TypeError if PyBytes_Check fails in r_string. > Well, it wouldn't fail any slower if you didn't do it, since you need to > call read() very soon anyway (presumably as part of the same call to > marshal.load()). Failing "fast" doesn't seem to bring anything here. My > vote is for removing the complication. Actually I misremembered the complete reason for the call - it was there to additionally check that the passed object has a read method. I also realised - duh - that I can read zero bytes and still get an empty bytes object back, so I've done that, and it does look cleaner. I've also reorganised the marshal_load function a little so it flows better. Your other points have also all been addressed: I've seen that the original test with the long binary data was exercising the same functionality as Engelbert Gruber's later patch was testing. So I've removed my earlier test entirely and added tuples and lists into the data being marshalled in Engelbert's version of the test. As a result of these changes, the zlib/base64 dependency goes away. The new "readable" field of the C struct has a comment saying what it is. The commented out r_byte macro has been removed. Thanks, again, for the review!
msg139539 - (view)	Author: Vinay Sajip (vinay.sajip) *	Date: 2011-07-01 01:17
As a result of the changes to marshal.c, test_importlib needs a small change: the test_bad_marshal raises an EOFError now, whereas it raised ValueError before. I think it's because the earlier code in marshal didn't properly check for EOF conditions in some places. So I've changes the assertRaises(ValueError) to assertRaises(EOFError). All tests now pass, other than test_ftplib (not related).
msg139588 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-07-01 16:55
Latest patch looks ok to me.
msg139628 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2011-07-02 09:10
Can we get this committed for 3.2.1 then?
msg139661 - (view)	Author: Roundup Robot (python-dev)	Date: 2011-07-02 15:43
New changeset edba722f3b02 by Vinay Sajip in branch '3.2': Closes #12291: Fixed bug which was found when doing multiple loads from one stream. http://hg.python.org/cpython/rev/edba722f3b02
msg139662 - (view)	Author: Roundup Robot (python-dev)	Date: 2011-07-02 16:16
New changeset 42dd11028e94 by Vinay Sajip in branch 'default': Closes #12291 for 3.3 - merged fix from 3.2. http://hg.python.org/cpython/rev/42dd11028e94
msg139663 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2011-07-02 17:52
Thanks!

History
Date	User	Action	Args
2022-04-11 14:57:18	admin	set	github: 56500
2012-03-30 01:25:24	r.david.murray	link	issue14447 superseder
2011-07-02 17:52:23	georg.brandl	set	messages: + msg139663
2011-07-02 16:16:11	python-dev	set	messages: + msg139662
2011-07-02 15:43:18	python-dev	set	status: open -> closed nosy: + python-dev messages: + msg139661 resolution: fixed stage: patch review -> resolved
2011-07-02 09:10:33	georg.brandl	set	messages: + msg139628
2011-07-01 16:55:20	pitrou	set	messages: + msg139588
2011-07-01 16:25:55	vstinner	set	nosy: + vstinner
2011-07-01 01:17:12	vinay.sajip	set	messages: + msg139539
2011-07-01 01:12:51	vinay.sajip	set	files: + update_to_test_importlib.diff
2011-07-01 00:35:11	vinay.sajip	set	messages: + msg139537
2011-07-01 00:16:05	vinay.sajip	set	files: + from_antoine_comments.diff
2011-06-30 19:26:09	pitrou	set	messages: + msg139513
2011-06-30 19:12:50	vinay.sajip	set	messages: + msg139510
2011-06-30 18:22:22	pitrou	set	messages: + msg139508
2011-06-27 12:41:06	vinay.sajip	set	messages: + msg139262
2011-06-27 08:48:26	djc	set	nosy: + djc
2011-06-27 08:35:33	pitrou	set	messages: + msg139247
2011-06-26 19:53:21	vinay.sajip	set	type: behavior messages: + msg139204
2011-06-26 19:49:03	vinay.sajip	set	files: + 0feab4e7b27f.diff
2011-06-26 08:23:14	grubert	set	files: + multiple_dump_load_read_write_test.patch messages: + msg139151
2011-06-25 16:17:25	grubert	set	files: + multiple_dump_load_test.patch nosy: + grubert messages: + msg139095
2011-06-25 00:35:35	terry.reedy	set	versions: - Python 3.4
2011-06-24 14:28:15	vinay.sajip	set	files: + ab1c38ffb8d4.diff
2011-06-20 06:58:27	vinay.sajip	set	files: - marshal-patch2.diff
2011-06-20 06:58:15	vinay.sajip	set	files: - marshal-patch.diff
2011-06-20 06:57:27	vinay.sajip	set	files: + tried_renaming.diff
2011-06-20 03:19:02	r.david.murray	set	nosy: + pitrou
2011-06-19 23:04:53	vinay.sajip	set	messages: + msg138672
2011-06-19 22:10:17	benjamin.peterson	set	messages: + msg138669
2011-06-19 21:10:21	vinay.sajip	set	messages: + msg138665
2011-06-19 20:48:40	vinay.sajip	set	messages: + msg138661
2011-06-19 15:45:46	benjamin.peterson	set	messages: + msg138637
2011-06-19 15:43:11	vinay.sajip	set	messages: + msg138636
2011-06-19 14:54:54	benjamin.peterson	set	messages: + msg138631
2011-06-19 11:39:10	vinay.sajip	set	messages: + msg138625
2011-06-19 01:02:49	benjamin.peterson	set	messages: + msg138624
2011-06-19 00:42:46	vinay.sajip	set	messages: + msg138623
2011-06-18 22:59:22	benjamin.peterson	set	messages: + msg138618
2011-06-18 22:41:57	vinay.sajip	set	messages: + msg138617
2011-06-18 14:42:53	benjamin.peterson	set	messages: + msg138585
2011-06-18 08:15:07	georg.brandl	set	messages: + msg138571
2011-06-11 15:38:56	vinay.sajip	set	messages: + msg138162
2011-06-11 15:24:36	vinay.sajip	set	hgrepos: + hgrepo26 messages: + msg138160
2011-06-10 10:00:00	vinay.sajip	set	stage: patch review
2011-06-10 09:59:37	vinay.sajip	set	keywords: + needs review files: + marshal-patch2.diff messages: + msg138070
2011-06-09 19:50:07	vinay.sajip	set	messages: + msg138024
2011-06-09 16:36:30	vinay.sajip	set	messages: + msg138012
2011-06-09 15:24:18	georg.brandl	set	messages: + msg137987
2011-06-09 15:21:41	Arfrever	set	nosy: + Arfrever
2011-06-09 15:00:48	barry	set	priority: normal -> release blocker nosy: + georg.brandl
2011-06-09 14:35:53	benjamin.peterson	set	nosy: + benjamin.peterson messages: + msg137973
2011-06-09 13:55:08	amaury.forgeotdarc	set	messages: + msg137969
2011-06-09 13:49:09	vinay.sajip	set	files: + marshal-patch.diff keywords: + patch messages: + msg137966
2011-06-09 09:31:47	amaury.forgeotdarc	set	nosy: + amaury.forgeotdarc messages: + msg137944
2011-06-09 09:10:09	vinay.sajip	create