Author Olivier.Grisel
Recipients Olivier.Grisel, pitrou, serhiy.storchaka
Date 2017-11-09.22:15:49
Message-id <1510265749.39.0.213398074469.issue31993@psf.upfronthosting.co.za>
Content
I wrote a script to monitor peak memory usage when dumping 2 GB of data with Python master, using both the C pickler and the Python pickler:

```
(py37) ogrisel@ici:~/code/cpython$ python ~/tmp/large_pickle_dump.py
Allocating source data...
=> peak memory usage: 2.014 GB
Dumping to disk...
done in 5.141s
=> peak memory usage: 4.014 GB
(py37) ogrisel@ici:~/code/cpython$ python ~/tmp/large_pickle_dump.py --use-pypickle
Allocating source data...
=> peak memory usage: 2.014 GB
Dumping to disk...
done in 5.046s
=> peak memory usage: 5.955 GB
```

This is using protocol 4. Note that the C pickler makes only 1 useless memory copy, whereas the Python pickler makes 2 (one for the concatenation and the other because of the framing mechanism of protocol 4).
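To illustrate the concatenation copy mentioned above, here is a minimal sketch. It is not the actual `pickle.py` code: the `NullFile` class and the `write_concatenated` and `write_separately` helpers are hypothetical names invented for this example, and the payload is kept small (16 MB) so it runs quickly. It uses `tracemalloc` to compare the peak allocation when the opcode header and the payload are joined before writing with the peak when they are written in two separate calls.

```python
import pickle
import struct
import tracemalloc


class NullFile:
    """Hypothetical file-like object that discards everything written to it."""

    def write(self, data):
        return len(data)


def make_header(payload):
    # BINBYTES8 opcode followed by the 8-byte little-endian payload length.
    return pickle.BINBYTES8 + struct.pack("<Q", len(payload))


def write_concatenated(f, payload):
    # Joining header and payload builds a new bytes object: one extra copy.
    f.write(make_header(payload) + payload)


def write_separately(f, payload):
    # Two writes: the large payload is passed through without being copied.
    f.write(make_header(payload))
    f.write(payload)


def peak_allocated(func, *args):
    """Return the peak number of bytes allocated by Python while func runs."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# The payload is allocated before tracing starts, so it is not counted.
payload = b"x" * (16 * 1024 * 1024)
concat_peak = peak_allocated(write_concatenated, NullFile(), payload)
separate_peak = peak_allocated(write_separately, NullFile(), payload)
print(f"concatenated write: {concat_peak / 1e6:.1f} MB allocated at peak")
print(f"separate writes:    {separate_peak / 1e6:.1f} MB allocated at peak")
```

In this sketch, the concatenated write should allocate at least as much as the payload itself, while the separate writes allocate only a few bytes for the header.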

Here is the output with the Python pickler fixed in python/cpython#4353:

```
(py37) ogrisel@ici:~/code/cpython$ python ~/tmp/large_pickle_dump.py --use-pypickle
Allocating source data...
=> peak memory usage: 2.014 GB
Dumping to disk...
done in 6.138s
=> peak memory usage: 2.014 GB
```


Basically, the 2 spurious memory copies made by the Python pickler with protocol 4 are gone.

Here is the script: https://gist.github.com/ogrisel/0e7b3282c84ae4a581f3b9ec1d84b45a
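
For readers who cannot open the gist, a measurement of this kind could be sketched roughly as follows. This is not the linked script: the `peak_memory_gb` and `dump_and_report` helpers are hypothetical, the `resource` module is Unix-only, and `pickle._Pickler` is the private pure-Python pickler class, used here only as an assumed stand-in for the `--use-pypickle` option.

```python
import pickle
import resource
import sys
import tempfile


def peak_memory_gb():
    """Return the peak resident set size of this process in GB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    if sys.platform != "darwin":
        peak *= 1024
    return peak / 1e9


def dump_and_report(n_bytes, use_pypickle=False):
    """Pickle n_bytes of data to a temporary file with protocol 4.

    Returns (peak before dump in GB, peak after dump in GB, bytes written).
    """
    # Multiplying a non-zero byte touches every page, so the data is resident.
    data = b"x" * n_bytes
    # pickle._Pickler is the private pure-Python implementation.
    pickler_class = pickle._Pickler if use_pypickle else pickle.Pickler
    before = peak_memory_gb()
    with tempfile.TemporaryFile() as f:
        pickler_class(f, protocol=4).dump(data)
        written = f.tell()
    after = peak_memory_gb()
    return before, after, written


if __name__ == "__main__":
    before, after, written = dump_and_report(32 * 1024 * 1024)
    print(f"=> peak memory usage before dump: {before:.3f} GB")
    print(f"=> peak memory usage after dump: {after:.3f} GB")
    print(f"wrote {written} bytes")
```

Because `ru_maxrss` is a high-water mark for the whole process, each configuration should be measured in a fresh interpreter, as the two separate command invocations above do.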