
Author dsvensson
Recipients amaury.forgeotdarc, daniel.urban, docs@python, dsvensson, loewis, neologix, pitrou, vstinner
Date 2011-08-18.16:23:22
Message-id <1313684603.6.0.954302955841.issue12775@psf.upfronthosting.co.za>
In-reply-to
Content
Using the following script (except on Python 2.5, where simplejson is used instead, which ought to be equivalent):
import time, gc, json, sys

def read_json_blob():
	t0 = time.time()
	fd = open("datatest1.json")
	data = fd.read()
	fd.close()
	t1 = time.time()
	parsed = json.loads(data)
	t2 = time.time()
	print("read file in %.2fs, parsed json in %.2fs, total of %.2fs" % (t1-t0, t2-t1, t2-t0))

if len(sys.argv) > 1 and sys.argv[1] == "nogc":
	gc.disable()

read_json_blob()
print(gc.collect())

daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python3.2 gc.py nogc
read file in 1.34s, parsed json in 2.74s, total of 4.07s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python3.2 gc.py
read file in 1.33s, parsed json in 2.71s, total of 4.05s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py
read file in 0.89s, parsed json in 56.03s, total of 56.92s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py nogc
read file in 0.89s, parsed json in 56.38s, total of 57.27s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.7 gc.py
read file in 0.89s, parsed json in 3.87s, total of 4.75s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.7 gc.py nogc
read file in 0.89s, parsed json in 3.91s, total of 4.80s
0
daniel@aether:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.5 gc.py
read file in 0.11s, parsed json in 53.00s, total of 53.11s
0
daniel@aether:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.5 gc.py nogc
read file in 0.14s, parsed json in 53.13s, total of 53.28s
0

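With the json module, enabling or disabling the GC makes no difference on any version, so nothing weird there, except that Python 3.2 seems to take more time to read the file (about 1.3s versus 0.9s on the same machine). There is a nice performance improvement in the json module in 2.7 and 3.2 (roughly 3-4s to parse) compared to 2.6 and to simplejson on 2.5 (53-56s).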
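
The simplejson substitution on Python 2.5 could be folded into the script itself with an import fallback. This is a minimal sketch, assuming simplejson is installed wherever the standard json module is missing; the helper name timed_loads is made up for illustration and is not part of the script above:

```python
import time

try:
    import json  # standard library from Python 2.6 onwards
except ImportError:
    import simplejson as json  # assumption: installed separately on Python 2.5


def timed_loads(text):
    """Parse text with whichever json module was imported.

    Returns a (parsed_object, elapsed_seconds) tuple.
    """
    start = time.time()
    parsed = json.loads(text)
    return parsed, time.time() - start


parsed, elapsed = timed_loads('{"a": [1, 2, 3]}')
print("parsed json in %.2fs" % elapsed)
```

With this in place the same file can be run unchanged under every interpreter, so any timing difference comes from the interpreter and module rather than from differences between scripts.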


Next up: trying cjson, which decodes via a compiled extension module:

import time, gc, cjson, sys

def read_json_blob():
	t0 = time.time()
	fd = open("datatest1.json")
	data = fd.read()
	fd.close()
	t1 = time.time()
	parsed = cjson.decode(data)
	t2 = time.time()
	print("read file in %.2fs, parsed json in %.2fs, total of %.2fs" % (t1-t0, t2-t1, t2-t0))

if len(sys.argv) > 1 and sys.argv[1] == "nogc":
	gc.disable()

read_json_blob()
print(gc.collect())

daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py
read file in 0.89s, parsed json in 2.58s, total of 3.46s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py nogc
read file in 0.89s, parsed json in 1.44s, total of 2.33s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.7 gc.py nogc
read file in 0.89s, parsed json in 1.53s, total of 2.42s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.7 gc.py
read file in 0.89s, parsed json in 1.54s, total of 2.43s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py nogc
read file in 0.89s, parsed json in 1.44s, total of 2.33s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py
read file in 0.89s, parsed json in 2.58s, total of 3.47s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py
read file in 0.89s, parsed json in 2.58s, total of 3.47s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py nogc
read file in 0.89s, parsed json in 1.43s, total of 2.32s
0
daniel@aether:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.5 gc.py
read file in 0.14s, parsed json in 1.58s, total of 1.73s
0
daniel@aether:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.5 gc.py nogc
read file in 0.16s, parsed json in 1.07s, total of 1.23s
0
daniel@aether:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.5 gc.py
read file in 0.14s, parsed json in 1.58s, total of 1.72s
0
daniel@aether:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.5 gc.py nogc
read file in 0.14s, parsed json in 1.06s, total of 1.20s

The file is actually a bit too small for good measurements when using cjson, but the interesting point here is obviously the large difference between GC and no GC in Python 2.5 (1.58s vs 1.07s), and quite a big win in 2.6 too (2.58s vs 1.44s), which becomes a lot more apparent with larger files.
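
Since a single run on a small file is noisy, one option is to repeat the parse and keep the best time, once with the collector on and once with it off. This is a rough sketch, not the script used above: the helper best_time and the synthetic document are illustrative assumptions, and the standard json module stands in for cjson.

```python
import gc
import json
import time


def best_time(func, arg, repeats=5, use_gc=True):
    """Return the fastest of `repeats` calls to func(arg), in seconds.

    The collector is switched on or off for the duration of the
    measurement and restored to its previous state afterwards.
    """
    was_enabled = gc.isenabled()
    if use_gc:
        gc.enable()
    else:
        gc.disable()
    try:
        timings = []
        for _ in range(repeats):
            start = time.time()
            func(arg)
            timings.append(time.time() - start)
        return min(timings)
    finally:
        # Put the collector back the way we found it.
        if was_enabled:
            gc.enable()
        else:
            gc.disable()


# Synthetic stand-in for datatest1.json: many small container objects.
blob = json.dumps([{"id": i, "tags": ["a", "b"]} for i in range(20000)])

with_gc = best_time(json.loads, blob, use_gc=True)
without_gc = best_time(json.loads, blob, use_gc=False)
print("gc on: %.3fs, gc off: %.3fs" % (with_gc, without_gc))
```

Taking the minimum rather than the mean reduces the influence of unrelated system activity; how large the gap between the two numbers is will depend on the interpreter version and the decoder.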

Another interesting thing is that Python 2.6 with the GC disabled (about 1.44s) is consistently faster than 2.7 with the GC either enabled or disabled (about 1.53s). cjson isn't compatible with Python 3.2, so I cannot verify how things work there.
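
On Python 3, where cjson is unavailable, the collector's activity during a parse can still be observed directly. This is a hedged sketch, assuming Python 3.3 or later (where gc.callbacks exists); count_collections is a hypothetical helper and the input is synthetic rather than the file measured above:

```python
import gc
import json


def count_collections(func, arg):
    """Run func(arg) and count the collections it triggers.

    Returns (result, counts) where counts maps a generation number
    to how many collections of that generation started during the call.
    """
    counts = {}

    def on_gc(phase, info):
        # The collector invokes callbacks with phase "start" and "stop";
        # info["generation"] is the oldest generation being collected.
        if phase == "start":
            gen = info["generation"]
            counts[gen] = counts.get(gen, 0) + 1

    gc.callbacks.append(on_gc)
    try:
        result = func(arg)
    finally:
        gc.callbacks.remove(on_gc)
    return result, counts


blob = json.dumps([{"id": i, "tags": ["a", "b"]} for i in range(20000)])
parsed, counts = count_collections(json.loads, blob)
print("collections per generation:", counts)
```

The counts give a version-independent view of how much collector work a decoder provokes, which complements wall-clock timings when the same extension module cannot be run everywhere.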

So overall it looks like this is less of a problem in newer versions of Python. We are phasing out the software that is deployed on Debian Lenny, so it's a problem that will go away. I don't have any objections to closing this ticket again.
History
Date                 User       Action  Args
2011-08-18 16:23:23  dsvensson  set     recipients: + dsvensson, loewis, amaury.forgeotdarc, pitrou, vstinner, daniel.urban, neologix, docs@python
2011-08-18 16:23:23  dsvensson  set     messageid: <1313684603.6.0.954302955841.issue12775@psf.upfronthosting.co.za>
2011-08-18 16:23:23  dsvensson  link    issue12775 messages
2011-08-18 16:23:22  dsvensson  create