classification
Title: pickle not 64-bit ready
Type: crash
Stage: resolved
Components: Interpreter Core
Versions: Python 3.3, Python 3.2
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: alexandre.vassalotti, amaury.forgeotdarc, belopolsky, jcea, jorgsk, nadeem.vawda, nyevik, pitrou, python-dev, santa4nt
Priority: normal
Keywords: patch

Created on 2011-03-15 22:58 by nyevik, last changed 2011-12-11 01:22 by jcea. This issue is now closed.

Files
File name            Uploaded
pickle64.patch       pitrou, 2011-08-12 18:03
pickle64-3.3.patch   nadeem.vawda, 2011-08-16 19:19
pickle64-4.patch     pitrou, 2011-08-27 12:56
Messages (28)
msg131060 - Author: Nik Galitsky (nyevik) Date: 2011-03-15 22:58
Python 3.2 on linux (RHEL 5.3) x86_64 build from source code.
Configure options:
./configure --prefix=/scratch/Python-3.2 --enable-big-digits=30 --with-universal-archs=all --with-fpectl --enable-shared
Built with GCC 4.3.3 with major options 
-g3 -O3 -m64 -fPIC.

Testcase that shows the issue:

#import numpy

import pickle
print("begin")
#a = numpy.zeros((2.5e9 / 8,), dtype = numpy.float64)
a = ('a' * (2 ** 31))
print("allocated")
#print(a);
pickle.dumps(a, pickle.DEFAULT_PROTOCOL)
print("end")

The problem, as far as I can see, is that pickle.py writes most length fields as either 2 or 4 bytes.
For example, it is peppered with lines like:
self.write(SOMETYPE + pack("<i", n) + obj)
when pickling, and, when unpickling:
len = mloads('i' + self.read(4))

This limits the range and the size of the data that can be pickled, if I understand correctly.
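
A minimal sketch of the limit described above, using only the standard struct module: a signed 4-byte "<i" field cannot hold anything past 2**31 - 1, while an 8-byte "<Q" field can. The helper name fits_in_format is made up for illustration and is not part of pickle.py.

```python
import struct

def fits_in_format(fmt, n):
    """Return True if n can be packed with the given struct format."""
    try:
        struct.pack(fmt, n)
        return True
    except struct.error:
        return False

size = 2 ** 31  # length of the string in the test case above

# Largest length a signed 4-byte field can hold.
print(fits_in_format("<i", size - 1))
# One past the limit: struct.pack raises struct.error.
print(fits_in_format("<i", size))
# An unsigned 8-byte field has plenty of room.
print(fits_in_format("<Q", size))
```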

Replacing those lines in pickle.py with something like
  self.write(SOMETYPE + pack("<Q", n) + obj)
and
  len = mloads('Q' + self.read(8))

lets the above test case run to completion.
Otherwise it crashes (with SIGSEGV on Python 2.7.1). On Python 3.2, strace shows:

.......


 open("/scratch/Python-3.2/lib/python3.2/lib-dynload/_pickle.cpython-32m.so", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=412939, ...}) = 0
open("/scratch/hpl005/UIT_test/apps_exc/Python-3.2/lib/python3.2/lib-dynload/_pickle.cpython-32m.so", O_RDONLY) = 5
read(5, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300>\0\0\0\0\0\0"..., 832) = 832
fstat(5, {st_mode=S_IFREG|0755, st_size=412939, ...}) = 0
mmap(NULL, 2185384, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 5, 0) = 0x2b05b5f68000
mprotect(0x2b05b5f7b000, 2093056, PROT_NONE) = 0
mmap(0x2b05b617a000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 5, 0x12000) = 0x2b05b617a000
close(5)                                = 0
close(4)                                = 0
close(3)                                = 0
write(1, "begin\n", 6begin
)                  = 6
mmap(NULL, 4294971392, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b05b617e000
write(1, "allocated\n", 10allocated
)             = 10
mmap(NULL, 8589938688, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b06b617f000
mremap(0x2b06b617f000, 8589938688, 2147487744, MREMAP_MAYMOVE) = 0x2b06b617f000
mmap(NULL, 4294971392, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b0736180000
munmap(0x2b06b617f000, 2147487744)      = 0
munmap(0x2b0736180000, 4294971392)      = 0
write(2, "Traceback (most recent call last"..., 35Traceback (most recent call last):
) = 35
write(2, "  File \"pickle_long.py\", line 9,"..., 45  File "pickle_long.py", line 9, in <module>
) = 45
open("pickle_long.py", O_RDONLY)        = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=251, ...}) = 0
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7ffff9f7c9e0) = -1 ENOTTY (Inappropriate ioctl for device)
fstat(3, {st_mode=S_IFREG|0644, st_size=251, ...}) = 0
lseek(3, 0, SEEK_CUR)                   = 0
dup(3)                                  = 4
fcntl(4, F_GETFL)                       = 0x8000 (flags O_RDONLY|O_LARGEFILE)
fstat(4, {st_mode=S_IFREG|0644, st_size=251, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b06b617f000
lseek(4, 0, SEEK_CUR)                   = 0
read(4, "#import numpy\n\nimport pickle\npri"..., 4096) = 251
close(4)                                = 0
munmap(0x2b06b617f000, 4096)            = 0
lseek(3, 0, SEEK_SET)                   = 0
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "#import numpy\n\nimport pickle\npri"..., 4096) = 251
close(3)                                = 0
write(2, "    pickle.dumps(a, pickle.DEFAU"..., 45    pickle.dumps(a, pickle.DEFAULT_PROTOCOL)
) = 45
write(2, "SystemError: error return withou"..., 48SystemError: error return without exception set
) = 48
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x2b05b118e4c0}, {0x2b05b0e7a570, [], SA_RESTORER, 0x2b05b118e4c0}, 8) = 0
munmap(0x2b05b617e000, 4294971392)      = 0
exit_group(1)                           = ?

Why does this limitation exist?
Please advise.
msg131062 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-15 23:05
Indeed:

>>> s = b'a' * (2**31)
>>> d = pickle.dumps(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: error return without exception set

There are two aspects to this:
- (bugfix) raise a proper exception when an object too large for handling by pickle is given
- (feature) improve the pickle protocol to handle objects larger than (2**31-1) elements

The improvement to the pickle protocol should probably be considered along with other improvements, because we don't want to create a new protocol too often.

See also issue9614.
msg131064 - Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2011-03-15 23:15
We could resort to the text-based protocol which doesn't have these limitations with respect to object lengths (IIRC). Performance won't be amazing, but we won't have to modify the current pickle protocol.
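
For illustration, a small comparison of how the text-based protocol 0 and the binary protocol 3 frame the same short string. Protocol 0 ends the payload with a newline instead of prefixing it with a fixed-width length, so there is no 4-byte length field to overflow. This only shows the framing on a tiny input; it says nothing about how either implementation behaves with multi-gigabyte objects.

```python
import pickle

text = "abc"

# Protocol 0: UNICODE opcode 'V', the payload, then a newline (no length field).
p0 = pickle.dumps(text, protocol=0)
# Protocol 3: PROTO header, then BINUNICODE 'X' with a 4-byte little-endian length.
p3 = pickle.dumps(text, protocol=3)

print(p0[:5])  # opcode, payload, newline terminator
print(p3[:7])  # protocol header, opcode, 4-byte length

# Both encodings round-trip to the same string.
assert pickle.loads(p0) == pickle.loads(p3) == text
```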
msg131118 - Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-03-16 14:18
On Tue, Mar 15, 2011 at 7:05 PM, Antoine Pitrou <report@bugs.python.org> wrote:
..
> - (bugfix) raise a proper exception when an object too large for handling by pickle is given

What would be the "proper exception" here?  With _pickle acceleration
disabled, I get a struct.error:

$ cat p.py
import sys
sys.modules['_pickle'] = None
import pickle
s = b'a' * (2**31)
d = pickle.dumps(s)

$ ./python.exe p.py
Traceback (most recent call last):
  ..
  File "Lib/pickle.py", line 496, in save_bytes
    self.write(BINBYTES + pack("<i", n) + bytes(obj))
struct.error: 'i' format requires -2147483648 <= number <= 2147483647

I would say "proper exception" would be ValueError, but that means
that we should change python implementation in an incompatible way.
msg131192 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-16 22:59
> On Tue, Mar 15, 2011 at 7:05 PM, Antoine Pitrou <report@bugs.python.org> wrote:
> ..
> > - (bugfix) raise a proper exception when an object too large for handling by pickle is given
> 
> What would be the "proper exception" here?

OverflowError. This is the exception that gets raised when some
user-supplied value exceeds some internal limit.
msg131193 - Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-03-16 23:10
On Wed, Mar 16, 2011 at 6:59 PM, Antoine Pitrou <report@bugs.python.org> wrote:
..
>>
>> What would be the "proper exception" here?
>
> OverflowError. This is the exception that gets raised when some
> user-supplied value exceeds some internal limit.

I don't think so.  OverflowError is a subclass of ArithmeticError and
is raised when result of an arithmetic operation cannot be represented
by the python type.   For example,

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: (34, 'Result too large')

I don't think failing pickle dump should raise an ArithmeticError.
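
The class relationship mentioned here can be checked directly. A short sketch; math.exp is used only as a convenient way to trigger an arithmetic overflow and is not taken from the message above.

```python
import math

# OverflowError sits under ArithmeticError in the built-in exception hierarchy.
print(issubclass(OverflowError, ArithmeticError))

try:
    math.exp(1000)  # the result is too large for a C double
except ArithmeticError as exc:
    # Catching the base class still reports the concrete subclass.
    caught = type(exc).__name__
    print(caught)
```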
msg131196 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-16 23:27
> I don't think so.  OverflowError is a subclass of ArithmeticError and
> is raised when result of an arithmetic operation cannot be represented
> by the python type.   For example,

If you grep for OverflowError in the C code base, you'll see that it is
in practice used for the present kind of error. Examples:
- "signed short integer is less than minimum"
- "%c arg not in range(0x110000)"
- "size does not fit in an int"
- "module name is too long"
- "modification time overflows a 4 byte field"
- "range too large to represent as a range_iterator"
- "Python int too large to convert to C size_t"
(at this point I am bored of pasting examples but you get the point)
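
Two of these non-arithmetic uses are easy to reproduce from Python. A sketch, assuming CPython; the exact messages may differ between versions, so only the exception type is checked. The helper raises_overflow is a made-up name for illustration.

```python
def raises_overflow(func):
    """Return True if calling func() raises OverflowError."""
    try:
        func()
    except OverflowError:
        return True
    return False

# A length that does not fit in a C ssize_t.
too_long = raises_overflow(lambda: len(range(2 ** 100)))
# A %c argument outside the Unicode code point range.
bad_char = raises_overflow(lambda: "%c" % 0x110000)

print(too_long, bad_char)
```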
msg131197 - Author: Nik Galitsky (nyevik) Date: 2011-03-16 23:38
Thank you all for your responses. While getting a meaningful error message in this case is very important, the main thing for us is to somehow fix this problem to allow serialization of larger objects, which is not at all uncommon on 64-bit machines with large amounts of memory.
I believe this issue also affects cPickle, as well as cStringIO, which I believe uses pickle too.

So, what are your plans/thoughts - would there be any action on fixing this problem in the near future? I think I grasp the extent of changes that need to be made to Python code, but I think this issue will have to be addressed sooner or later anyhow.
msg141284 - Author: Jorgen Skancke (jorgsk) Date: 2011-07-28 07:47
I recently ran into this problem when I tried to multiprocess jobs with large objects (3-4 GB). I have plenty of memory for this, but multiprocessing hangs without error, presumably because pickle hangs without error when multiprocessing tries to pickle the object. I can't offer a solution, but I can verify that the limitation in pickle is affecting Python usage.
msg141981 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-12 18:03
This patch contains assorted improvements for 64-bit compatibility of the pickle module. The protocol still doesn't support >4GB bytes or str objects, but at least its behaviour shouldn't be misleading anymore.
msg142216 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-08-16 19:19
pickle64.patch applies cleanly to 3.2, but not 3.3. I've attached an
adapted version that applies cleanly to 3.3.
msg142415 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-08-19 03:39
I have tried running the tests on a machine with 12GB of RAM, but when I do so,
the new tests get skipped saying "not enough memory", even when I specify "-M 11G"
on the command-line. The problem seems to be the change to the precisionbigmemtest
decorator in test.support. I don't understand what the purpose of the "dryrun"
flag is, but the modified condition for skipping doesn't look right to me.

(Now that I think about it, I should be able to get the tests to run by undoing
that one part of the change. I'll get back to you about the results later today.)
msg142426 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-19 12:23
> I have tried running the tests on a machine with 12GB of RAM, but when I do so,
> the new tests get skipped saying "not enough memory", even when I specify "-M 11G"
> on the command-line.

How much does it say is required?
Did you remove the skips in BigmemPickleTests?

>  The problem seems to be the change to the precisionbigmemtest
> decorator in test.support. I don't understand what the purpose of the "dryrun"
> flag is, but the modified condition for skipping doesn't look right to me.

Well, perhaps I got the logic wrong. Debugging welcome :)
msg142434 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-08-19 12:35
> How much does it say is required?
> Did you remove the skips in BigmemPickleTests?

Yes, I did remove the skips. It says 2GB for some, and 4GB for others.

> Well, perhaps I got the logic wrong. Debugging welcome :)

I'd be glad to do so, but I'm not sure what the aim of the "dryrun" flag is.
Do you want to make it the default that precisionbigmem tests are skipped,
unless the decorator invocation explicitly specifies dryrun=False?
msg142436 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-19 12:38
> I'd be glad to do so, but I'm not sure what the aim of the "dryrun" flag is.
> Do you want to make it the default that precisionbigmem tests are skipped,
> unless the decorator invocation explicitly specifies dryrun=False?

No, the point is to avoid running these tests when -M is not specified.
See what happens with other bigmem tests.
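
A rough sketch of that idea, not the actual test.support code: a decorator that skips a test unless a memory limit has been configured and is large enough. MEMLIMIT, requires_memory and HugeObjectTests are hypothetical names; MEMLIMIT stands in for whatever -M would set.

```python
import functools
import unittest

MEMLIMIT = 0  # hypothetical stand-in for the limit set by -M; 0 means "not given"

def requires_memory(size, memuse):
    """Skip the test unless MEMLIMIT covers size * memuse bytes."""
    def decorator(test):
        @functools.wraps(test)
        def wrapper(self):
            needed = size * memuse
            if MEMLIMIT < needed:
                raise unittest.SkipTest(
                    "not enough memory: %d bytes needed" % needed)
            return test(self, size)
        return wrapper
    return decorator

class HugeObjectTests(unittest.TestCase):
    @requires_memory(size=2 ** 31, memuse=2)
    def test_huge_bytes(self, size):
        data = b"a" * size  # only reached when MEMLIMIT is large enough
        self.assertEqual(len(data), size)

# Run the test case in-process and count how many tests were skipped.
result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(HugeObjectTests).run(result)
print(len(result.skipped))
```

With MEMLIMIT left at 0 the big allocation is never attempted, which mirrors the intent of not running these tests when no limit is specified.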
msg142476 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-08-19 16:48
D'oh. I just realized why the -M option wasn't being recognized - I had passed it
after the actual test name, so it was being treated as another test instead of an
option. Sorry for the confusion :/

As for the actual test results, test_huge_bytes_(32|64)b both pass, but
test_huge_str fails with this traceback:

    ======================================================================
    FAIL: test_huge_str (test.test_pickle.InMemoryPickleTests)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/usr/local/google/home/nadeemvawda/code/cpython/3.2/Lib/test/support.py", line 1108, in wrapper
        return f(self, maxsize)
      File "/usr/local/google/home/nadeemvawda/code/cpython/3.2/Lib/test/pickletester.py", line 1151, in test_huge_str
        self.dumps(data, protocol=proto)
    AssertionError: (<class 'ValueError'>, <class 'OverflowError'>) not raised

The same error occurs on the default branch.
msg142477 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-19 16:58
> D'oh. I just realized why the -M option wasn't being recognized - I had passed it
> after the actual test name, so it was being treated as another test instead of an
> option. Sorry for the confusion :/
> 
> As for the actual test results, test_huge_bytes_(32|64)b both pass, but
> test_huge_str fails with this traceback:

Can you replace "_2G" with "_4G" in the decorator for that test?
msg142481 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-08-19 18:01
> Can you replace "_2G" with "_4G" in the decorator for that test?

I'm not at work any more, but I'll try that out on Monday.
msg142756 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-08-22 21:47
> Can you replace "_2G" with "_4G" in the decorator for that test?

When I do that, it pushes the memory usage for the test up to 16GB, which is
beyond what the machine can handle. When I tried with 2.5G (_2G * 5 // 4),
that was enough to make it swap heavily (and in the end the test still failed).

As an aside, it turns out the problem with -M being ignored wasn't due to me
being stupid; it seems that -j doesn't pass the memlimit on to subprocesses.
I'll open a separate issue for this.
msg142759 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-22 21:57
> > Can you replace "_2G" with "_4G" in the decorator for that test?
> 
> When I do that, it pushes the memory usage for the test up to 16GB, which is
> beyond what the machine can handle. When I tried with 2.5G (_2G * 5 // 4),
> that was enough to make it swap heavily (and in the end the test still failed).

Uh, does it? With 4G it should raise OverflowError, and not try to do
anything else.
Could I ask you to try to take a look? :S

> As an aside, it turns out the problem with -M being ignored wasn't due to me
> being stupid; it seems that -j doesn't pass the memlimit on to subprocesses.
> I'll open a separate issue for this.

Running bigmem tests in parallel doesn't make much sense IMO. You want
to run as many of them as you can, which requires that you allocate all
memory to *one* test process.
msg142760 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-08-22 22:03
> Uh, does it? With 4G it should raise OverflowError, and not try to do
> anything else.
> Could I ask you to try to take a look? :S

Sure; I'll see what I can figure out tomorrow.

> Running bigmem tests in parallel doesn't make much sense IMO. You want
> to run as many of them as you can, which requires that you allocate all
> memory to *one* test process.

Yeah, actually running them in parallel isn't a sensible use. But it bit me
because I was just using "make test EXTRATESTOPTS='-uall -M11G test_pickle'".
It would be nice to have a warning so other people don't get confused by the
same problem. I guess that shouldn't be too hard to arrange.
msg142854 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-08-23 18:35
I was playing around with pickling large Unicode strings in an interactive
interpreter, and it seems that you have to have at least 4G chars (not bytes)
to trigger the OverflowError. Consider the following snippet of code:

    out = dumps(data)
    del data
    result = loads(out)
    assert isinstance(result, str)
    assert len(result) == _1G 

With data as (b"a" * _4G) the result is as expected:

    Traceback (most recent call last):
      File "pickle-bigmem-test.py", line 5, in <module>
        out = dumps(data)
    OverflowError: cannot serialize a string larger than 4GB

But with (b"a" * _2G), I get this:

    Traceback (most recent call last):
      File "pickle-bigmem-test.py", line 7, in <module>
        result = loads(out)
    _pickle.UnpicklingError: BINUNICODE pickle has negative byte count
msg142855 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-08-23 18:41
Some more info: the first few bytes of the output for the _2G case are this:

    b'\x80\x03X\x00\x00\x00\x80aaaaaa'
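
Decoding those leading bytes by hand shows where the "negative byte count" comes from. A sketch using struct: after the 2-byte protocol header and the BINUNICODE opcode (X) come four length bytes, which read as 2**31 when treated as unsigned but as a negative number when treated as signed.

```python
import struct

prefix = b'\x80\x03X\x00\x00\x00\x80aaaaaa'

proto_header = prefix[:2]   # PROTO opcode followed by protocol number 3
opcode = prefix[2:3]        # b'X' is BINUNICODE
length_field = prefix[3:7]  # 4-byte little-endian length

# The same four bytes, interpreted as unsigned and as signed.
unsigned_len = struct.unpack("<I", length_field)[0]
signed_len = struct.unpack("<i", length_field)[0]

print(unsigned_len)  # 2147483648, i.e. 2**31
print(signed_len)    # -2147483648
```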
msg142857 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-08-23 18:58
> With data as (b"a" * _4G) the result is as expected:
>
>     Traceback (most recent call last):
>       File "pickle-bigmem-test.py", line 5, in <module>
>         out = dumps(data)
>     OverflowError: cannot serialize a string larger than 4GB
>
> But with (b"a" * _2G), I get this:
>
>     Traceback (most recent call last):
>       File "pickle-bigmem-test.py", line 7, in <module>
>         result = loads(out)
>     _pickle.UnpicklingError: BINUNICODE pickle has negative byte count

Correction: these should be ("a" * _4G) and ("a" * _2G).
msg143064 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-27 12:56
Here is a new patch against 3.2. I can't say it works for sure, but it should be much better. It also adds a couple more tests.
There seems to be a separate issue where pure-Python pickle.py considers 32-bit lengths signed where the C impl considers them unsigned...
msg143166 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-08-29 17:45
Tested the latest patch with -M11G. All tests pass.
msg143180 - Author: Roundup Robot (python-dev) Date: 2011-08-29 21:24
New changeset babc90f3cbf4 by Antoine Pitrou in branch '3.2':
Issue #11564: Avoid crashes when trying to pickle huge objects or containers
http://hg.python.org/cpython/rev/babc90f3cbf4

New changeset 56242682a931 by Antoine Pitrou in branch 'default':
Issue #11564: Avoid crashes when trying to pickle huge objects or containers
http://hg.python.org/cpython/rev/56242682a931
msg143183 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-29 21:43
Should be fixed as far as possible (OverflowErrors will be raised instead of crashing).
Making pickle actually 64-bit compliant is part of PEP 3154 (http://www.python.org/dev/peps/pep-3154/).
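
For reference, PEP 3154 later became pickle protocol 4 (Python 3.4 and newer), which adds opcodes with 8-byte length fields. A quick way to list them, assuming a Python 3.4+ interpreter, is to filter the opcode table in pickletools:

```python
import pickletools

# Opcodes from protocol 4 or later whose names end in "8" carry 8-byte lengths.
wide_opcodes = sorted(
    op.name for op in pickletools.opcodes
    if op.proto >= 4 and op.name.endswith("8")
)
print(wide_opcodes)
```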
History
Date User Action Args
2012-12-15 19:15:18 pitrou link issue10640 superseder
2011-12-11 01:22:28 jcea set nosy: + jcea
2011-08-29 21:43:22 pitrou set status: open -> closed
    resolution: fixed
    messages: + msg143183
    stage: patch review -> resolved
2011-08-29 21:24:38 python-dev set nosy: + python-dev
    messages: + msg143180
2011-08-29 17:45:13 nadeem.vawda set messages: + msg143166
2011-08-27 12:56:13 pitrou set files: + pickle64-4.patch
    messages: + msg143064
2011-08-23 18:58:46 nadeem.vawda set messages: + msg142857
2011-08-23 18:41:58 nadeem.vawda set messages: + msg142855
2011-08-23 18:35:50 nadeem.vawda set messages: + msg142854
2011-08-22 22:03:44 nadeem.vawda set messages: + msg142760
2011-08-22 21:57:58 pitrou set messages: + msg142759
2011-08-22 21:47:07 nadeem.vawda set messages: + msg142756
2011-08-19 18:01:34 nadeem.vawda set messages: + msg142481
2011-08-19 16:58:58 pitrou set messages: + msg142477
2011-08-19 16:48:14 nadeem.vawda set messages: + msg142476
2011-08-19 12:38:34 pitrou set messages: + msg142436
2011-08-19 12:35:28 nadeem.vawda set messages: + msg142434
2011-08-19 12:23:03 pitrou set messages: + msg142426
2011-08-19 03:39:36 nadeem.vawda set messages: + msg142415
2011-08-16 19:20:00 nadeem.vawda set files: + pickle64-3.3.patch
    messages: + msg142216
2011-08-16 18:50:30 nadeem.vawda set nosy: + nadeem.vawda
2011-08-12 18:03:16 pitrou set files: + pickle64.patch
    versions: - Python 3.1
    messages: + msg141981
    keywords: + patch
    stage: patch review
2011-07-28 07:47:21 jorgsk set nosy: + jorgsk
    messages: + msg141284
2011-04-26 17:39:50 santa4nt set nosy: + santa4nt
2011-03-16 23:38:51 nyevik set nosy: amaury.forgeotdarc, belopolsky, pitrou, alexandre.vassalotti, nyevik
    messages: + msg131197
2011-03-16 23:27:04 pitrou set nosy: amaury.forgeotdarc, belopolsky, pitrou, alexandre.vassalotti, nyevik
    messages: + msg131196
2011-03-16 23:10:08 belopolsky set nosy: amaury.forgeotdarc, belopolsky, pitrou, alexandre.vassalotti, nyevik
    messages: + msg131193
2011-03-16 22:59:36 pitrou set nosy: amaury.forgeotdarc, belopolsky, pitrou, alexandre.vassalotti, nyevik
    messages: + msg131192
2011-03-16 14:18:15 belopolsky set nosy: amaury.forgeotdarc, belopolsky, pitrou, alexandre.vassalotti, nyevik
    messages: + msg131118
2011-03-15 23:15:52 alexandre.vassalotti set nosy: amaury.forgeotdarc, belopolsky, pitrou, alexandre.vassalotti, nyevik
    messages: + msg131064
2011-03-15 23:05:52 pitrou set nosy: + amaury.forgeotdarc, alexandre.vassalotti, pitrou, belopolsky
    title: pickle limits most datatypes -> pickle not 64-bit ready
    messages: + msg131062
    versions: + Python 3.1, Python 3.3
2011-03-15 22:58:27 nyevik create