Message142338
On an 8GB RAM box (more than 6GB free), serializing many small objects can eat all memory, while the end result would take around 600MB on a UCS2 build:
$ LANG=C time opt/python -c "import json; l = [1] * (100*1024*1024); encoded = json.dumps(l)"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/antoine/cpython/opt/Lib/json/__init__.py", line 224, in dumps
return _default_encoder.encode(obj)
File "/home/antoine/cpython/opt/Lib/json/encoder.py", line 188, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/home/antoine/cpython/opt/Lib/json/encoder.py", line 246, in iterencode
return _iterencode(o, 0)
MemoryError
Command exited with non-zero status 1
11.25user 2.43system 0:13.72elapsed 99%CPU (0avgtext+0avgdata 27820320maxresident)k
2920inputs+0outputs (12major+1261388minor)pagefaults 0swaps
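For reference, a back-of-the-envelope estimate of the expected final size (assuming each element serializes to the three characters "1, ", ignoring the brackets and the last element):

n = 100 * 1024 * 1024        # number of list elements
chars = 3 * n                # roughly "1, " per element
print(chars * 2 // 2**20)    # ~600 MB on a UCS2 build (2 bytes per character)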
I suppose the encoder internally builds a large list of very small unicode objects and only joins them at the end. We could probably join them in chunks to avoid this behaviour.
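A minimal sketch of that chunked-join idea (the helper name and chunk size are made up for illustration; this is not the actual encoder code):

def join_in_chunks(pieces, chunk_size=4096):
    # Merge the stream of small strings in batches so that only up to
    # `chunk_size` tiny unicode objects are alive at any time, instead
    # of materializing all of them before a single final join.
    parts = []
    buf = []
    for piece in pieces:
        buf.append(piece)
        if len(buf) >= chunk_size:
            parts.append(''.join(buf))
            del buf[:]
    if buf:
        parts.append(''.join(buf))
    return ''.join(parts)

# e.g. in encode(), instead of joining the full list of chunks at once:
# return join_in_chunks(self.iterencode(o, _one_shot=True))

The final string still has to exist, of course, but the per-object overhead of hundreds of millions of tiny strings (which is what drives maxresident far above the 600MB result) would be avoided.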
Date                 User    Action  Args
2011-08-18 15:23:16  pitrou  set     recipients: + pitrou, rhettinger, ezio.melotti
2011-08-18 15:23:16  pitrou  set     messageid: <1313680996.25.0.799183861359.issue12778@psf.upfronthosting.co.za>
2011-08-18 15:23:15  pitrou  link    issue12778 messages
2011-08-18 15:23:15  pitrou  create