Author serhiy.storchaka
Recipients kristjan.jonsson, larry, loewis, pitrou, serhiy.storchaka, vajrasky, vstinner
Date 2015-01-28.16:45:22
Message-id <1422463522.54.0.923277922118.issue20416@psf.upfronthosting.co.za>
Here is a more general solution. For simple values (ints, floats, complex numbers, short strings) it is faster to use the value itself as a key than to create a new integer object (id).
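
To illustrate the idea only (this is not the patch itself, which changes the C implementation of marshal), here is a minimal Python sketch of a reference memo that keys simple values by value and everything else by id(). The names ref_key, count_refs and MAX_STR_LEN are hypothetical, and the length limit for "short strings" is an assumed number.

```python
# Hypothetical illustration of keying the reference memo by value.
MAX_STR_LEN = 20  # assumed cutoff for "short strings"; not taken from the patch


def ref_key(obj):
    """Return the memo key: the value itself for simple types, else id(obj)."""
    t = type(obj)
    if t in (int, float, complex) or (t is str and len(obj) <= MAX_STR_LEN):
        # Include the type so that 1, 1.0 and True do not share a key.
        # Simplification: 0.0 and -0.0 compare equal and would need extra care.
        return (t, obj)
    # Other objects are keyed by identity, which allocates a new int per lookup.
    return id(obj)


def count_refs(items):
    """Count how many items would be written as back-references."""
    memo = {}
    refs = 0
    for obj in items:
        key = ref_key(obj)
        if key in memo:
            refs += 1
        else:
            memo[key] = len(memo)
    return refs
```

With this keying, two distinct float objects with the same value share one memo entry, while two distinct empty lists do not.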

Without the patch:

data              ver.  dumps(ms)  loads(ms)  size(KiB)

genData            2      103.9      186.4     4090.7
genData            3      250.3      196.8     4090.7
genData            4      223.5      182.5     3651.3

[1000]*10**6       2       98.6      134.8     4882.8
[1000]*10**6       3      491.1       62.2     4882.8
[1000]*10**6       4      494.9       62.1     4882.8

[1000.0]*10**6     2      173.5      158.4     8789.1
[1000.0]*10**6     3      494.8       62.2     4882.8
[1000.0]*10**6     4      491.4       62.8     4882.8

[1000.0j]*10**6    2      288.8      221.4    16601.6
[1000.0j]*10**6    3      493.6       62.4     4882.8
[1000.0j]*10**6    4      489.2       62.0     4882.8

20 pydecimals      2       85.0      114.7     3936.5
20 pydecimals      3       97.2       44.3     3373.4
20 pydecimals      4       86.2       40.0     3297.5


With marshal3_numbers.patch:

data              ver.  dumps(ms)  loads(ms)  size(KiB)

genData            3      108.4      187.5     4090.7
genData            4       83.0      179.3     3651.3

[1000]*10**6       3      104.2      145.8     4882.8
[1000]*10**6       4      103.9      147.0     4882.8

[1000.0]*10**6     3      177.4      154.5     8789.1
[1000.0]*10**6     4      177.1      164.2     8789.1

[1000.0j]*10**6    3      501.6       61.1     4882.8
[1000.0j]*10**6    4      501.6       62.3     4882.8

20 pydecimals      3       95.2       41.9     3373.4
20 pydecimals      4       83.5       38.5     3297.5


With marshal_refs_by_value.patch:

data              ver.  dumps(ms)  loads(ms)  size(KiB)

genData            3      150.4      197.0     4090.7
genData            4      122.1      184.8     3651.3

[1000]*10**6       3      169.1       62.3     4882.8
[1000]*10**6       4      169.2       62.2     4882.8

[1000.0]*10**6     3      313.5       62.2     4882.8
[1000.0]*10**6     4      312.8       62.3     4882.8

[1000.0j]*10**6    3      410.6       62.5     4882.8
[1000.0j]*10**6    4      410.5       62.3     4882.8

20 pydecimals      3       68.5       41.1     3373.4
20 pydecimals      4       57.5       39.8     3297.5

The marshal_refs_by_value.patch produces data as compact as the unpatched code does, but dumps faster. It usually dumps slower than marshal3_numbers.patch, but may produce smaller data that loads faster. Surprisingly, it dumps the code of the _pydecimal module faster.
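
The benchmark script is not included in this message. A rough harness along the following lines could collect the same three columns (dumps time, loads time, size) for one value and one format version. The function name bench and the repeat count are assumptions; marshal.dumps(value, version) and marshal.loads are the standard library calls.

```python
import marshal
import time


def bench(value, version, repeat=3):
    """Return (dumps_ms, loads_ms, size_kib) for one value and format version."""
    best_dump = best_load = float("inf")
    data = b""
    for _ in range(repeat):
        t0 = time.perf_counter()
        data = marshal.dumps(value, version)
        t1 = time.perf_counter()
        marshal.loads(data)
        t2 = time.perf_counter()
        # Keep the best of several runs to reduce timing noise.
        best_dump = min(best_dump, (t1 - t0) * 1000)
        best_load = min(best_load, (t2 - t1) * 1000)
    return best_dump, best_load, len(data) / 1024


if __name__ == "__main__":
    for ver in (2, 3, 4):
        d, l, s = bench([1000.0] * 10**6, ver)
        print("ver %d  dumps %.1f ms  loads %.1f ms  size %.1f KiB" % (ver, d, l, s))
```

Absolute numbers will differ by machine and interpreter build, so only the relative ordering between builds is meaningful.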

As a side effect, the patch can turn some simple equal values into identical objects.
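
A small check for that side effect might look like the sketch below. The helper name roundtrip_pair is hypothetical. Whether the second result is True depends on the interpreter build and on which patch is applied, so no expected output is shown.

```python
import marshal


def roundtrip_pair(a, b, version=marshal.version):
    """Marshal [a, b], load it back, and report (equal, identical)."""
    x, y = marshal.loads(marshal.dumps([a, b], version))
    return x == y, x is y


# Two equal floats built at run time, so they start as distinct objects.
a = float("1000.0")
b = float("1000.0")
equal, identical = roundtrip_pair(a, b)
# equal is always True; identical shows whether the values were merged.
print(equal, identical)
```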