msg253371
Author: Herbert (prinsherbert)
Date: 2015-10-23 09:37
I very often want to use pickle to store huge objects so that I do not need to recalculate them. However, I noticed that pickle needs O(n) additional memory, where n is the size of the object in memory. That is, using Python 3:
data = {'%06d' % i: i for i in range(30 * 1000 ** 2)}
# data consumes a lot of my 8GB RAM
import pickle
with open('dict-database.p3', 'wb') as f:
    pickle.dump(data, f)
# I have to kill the process to avoid running out of memory. If I don't, the OS crashes. IMHO the OS should never crash because of Python.
I don't think pickle should require O(n) memory overhead.

msg253374
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2015-10-23 10:12
That is because a pickler keeps track of all pickled objects. This is needed to preserve identity and support recursive objects.
You can disable memoizing by setting the "fast" attribute of the Pickler object.
def fastdump(obj, file):
    p = pickle.Pickler(file)
    p.fast = True
    p.dump(obj)
But you can't pickle recursive objects in the "fast" mode.
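A small sketch of both points above (assuming CPython, whose `Pickler.memo` attribute exposes the memo table through a proxy with a `copy()` method): the memo grows with the number of pickled objects and is what preserves identity of recursive objects, and fast mode refuses them:

```python
import io
import pickle

# The memo grows by roughly one entry per memoized object (the dict plus
# each distinct key string; small ints are written inline, not memoized).
data = {'%06d' % i: i for i in range(1000)}
p = pickle.Pickler(io.BytesIO())
p.dump(data)
n_memo = len(p.memo.copy())  # .copy() because the C pickler returns a proxy
print("memo entries:", n_memo)

# The memo is what makes recursive objects round-trip in normal mode ...
lst = []
lst.append(lst)
restored = pickle.loads(pickle.dumps(lst))
assert restored[0] is restored  # identity preserved

# ... so in fast mode the same object cannot be pickled.
fast = pickle.Pickler(io.BytesIO())
fast.fast = True
try:
    fast.dump(lst)
    refused = False
except (ValueError, RecursionError):  # CPython's C pickler raises ValueError
    refused = True
print("fast mode refused the recursive object:", refused)
```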

msg253378
Author: Herbert (prinsherbert)
Date: 2015-10-23 11:22
That sounds reasonable as an explanation for the O(n) overhead, but it does not explain why Linux crashes when pickle runs out of memory (I've seen this on two Ubuntu systems).

msg253399
Author: Eric V. Smith (eric.smith) *
Date: 2015-10-24 08:25
In what way does the OS crash? Are there any kernel messages? Or is this the python executable crashing? Again, if so, what messages are printed?
In any event, if this really is an OS crash, then it's a Linux bug and should be reported to them.

msg253469
Author: Herbert (prinsherbert)
Date: 2015-10-26 12:18
Hi Eric,
I would assume that for the right range parameter (in my case 30 * 1000 ** 2), one which just barely fits in memory, your system would also crash after a pickle.dump. At least, I had this behavior on two of my machines, both running Ubuntu.
Nevertheless, if you give me some time I'm happy to check my dmesg and any log you wish. I find it strange that sometimes I get a MemoryError when I run out of memory (in particular when using numpy), and sometimes the system crashes (in particular when using other Python code). Therefore I don't think this is pickle-specific, and I'm not even sure whether this is a bug rather than a 'feature'.

msg253516
Author: Martin Panter (martin.panter) *
Date: 2015-10-27 04:28
Perhaps by OS crash you mean either the Linux out-of-memory (OOM) killer, which takes a heuristic stab at killing the right process, or Linux running almost out of memory and everything grinding to a halt, presumably because each task switch needs to re-read its program off the hard disk.
If either is the case, I understand this is part of Linux’s design, called “memory overcommit” or something. It is possible to disable it, though I haven’t tried myself, and many programs (probably including Python) are apparently not compatible.
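For reference, the overcommit behaviour Martin describes is controlled by a sysctl (a sketch; changing it needs root, and strict mode can break programs that rely on large speculative allocations):

```shell
# Show the current overcommit policy:
# 0 = heuristic (default), 1 = always overcommit, 2 = strict accounting
cat /proc/sys/vm/overcommit_memory

# Strict accounting: allocations beyond the commit limit fail with ENOMEM,
# so Python gets a MemoryError instead of the OOM killer firing later
sudo sysctl vm.overcommit_memory=2
```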

msg254142
Author: Lukas Lueg (ebfe)
Date: 2015-11-05 21:10
I very strongly doubt that it actually crashes your kernel - it basically can't. Your desktop becomes unresponsive for up to several minutes as the kernel has paged out about every single bit of memory to disk, raising access times by several orders of magnitude. Disable your swap and try again, it will just die.

msg254381
Author: Herbert (prinsherbert)
Date: 2015-11-09 11:56
It may be fair to note that one of the machines on which the 'crash' happens has no swap installed, just 16GB of RAM. Hence I'm not sure how this relates to paging; I would think there is no paging if there is no swap.
I can verify that the machine is 'stuck' for more than just several minutes (at least 30 minutes), but I cannot confirm whether this is due to the desktop environment or actually the kernel. I will verify this when I have access to the specific machines again.
Thank you for your input so far!

msg254387
Author: Stefan Krah (skrah) *
Date: 2015-11-09 13:16
It's a Linux issue. Disable overcommitting of memory (at your own peril) or set user limits (for example with djb's softlimit); then the process will be killed instead of freezing the machine.
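Such a limit can also be set from inside the process with the stdlib resource module (a sketch for Linux; the 1 GiB cap is illustrative, and the thread's own 30 * 1000 ** 2 dict is far larger than that, so building it fails cleanly):

```python
import pickle
import resource  # Unix-only

# Lower this process's address-space limit. Once the cap is hit,
# allocations fail and Python raises MemoryError instead of the
# machine swapping itself to a standstill.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (1 * 1024 ** 3, hard))

hit_limit = False
try:
    data = {'%06d' % i: i for i in range(30 * 1000 ** 2)}
    with open('dict-database.p3', 'wb') as f:
        pickle.dump(data, f)
except MemoryError:
    hit_limit = True
print("MemoryError raised:", hit_limit)
```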

msg254390
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2015-11-09 14:37
There is a workaround for the memory consumption, and the Linux freeze is not a Python issue.

msg254424
Author: Martin Panter (martin.panter) *
Date: 2015-11-10 00:33
FWIW my usual workaround is to enable Linux's SysRq handler, then press Ctrl+Alt+(SysRq, F) to manually invoke the OOM killer. It beats waiting between 30 and infinity minutes for it to kick in on its own :)
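The SysRq handler mentioned above is also a sysctl (a sketch; enabling it and poking the trigger file both need root, and the trigger invokes the OOM killer immediately):

```shell
# Check whether the magic SysRq key is enabled
# (1, or a bitmask that includes the signal/OOM functions)
cat /proc/sys/kernel/sysrq

# Enable all SysRq functions for this boot
sudo sysctl kernel.sysrq=1

# Alt+SysRq+F can also be triggered without the keyboard:
echo f | sudo tee /proc/sysrq-trigger   # invokes the OOM killer once
```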

Date | User | Action | Args
2022-04-11 14:58:23 | admin | set | github: 69651
2015-11-10 00:33:14 | martin.panter | set | messages: + msg254424
2015-11-09 14:37:24 | serhiy.storchaka | set | status: open -> closed; resolution: wont fix; messages: + msg254390; stage: resolved
2015-11-09 13:16:51 | skrah | set | nosy: + skrah; messages: + msg254387
2015-11-09 11:56:49 | prinsherbert | set | messages: + msg254381
2015-11-05 21:10:24 | ebfe | set | nosy: + ebfe; messages: + msg254142
2015-10-27 04:28:35 | martin.panter | set | nosy: + martin.panter; messages: + msg253516
2015-10-26 12:18:05 | prinsherbert | set | messages: + msg253469
2015-10-24 08:25:07 | eric.smith | set | nosy: + eric.smith; messages: + msg253399
2015-10-23 11:22:42 | prinsherbert | set | messages: + msg253378
2015-10-23 10:12:14 | serhiy.storchaka | set | nosy: + alexandre.vassalotti, serhiy.storchaka, pitrou; messages: + msg253374
2015-10-23 09:38:47 | prinsherbert | set | type: performance; versions: + Python 3.4
2015-10-23 09:37:23 | prinsherbert | create |