This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: immense performance problems related to the garbage collector
Type: performance Stage:
Components: Documentation, Interpreter Core Versions: Python 2.6
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: docs@python Nosy List: amaury.forgeotdarc, daniel.urban, docs@python, dsvensson, ezio.melotti, loewis, neologix, pitrou, terry.reedy, vstinner
Priority: normal Keywords:

Created on 2011-08-18 08:00 by dsvensson, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (15)
msg142305 - (view) Author: Daniel Svensson (dsvensson) Date: 2011-08-18 08:00
I've noticed problems with the GC in two applications. In one case the application did not respond to SOAP requests for 9 seconds every couple of minutes, when a request really shouldn't have taken more than 20 ms. In another case we had one operation, executed by 20 threads, that ought to have taken 0.5 seconds but instead took about 90 seconds.

Both cases were solved by disabling the garbage collector and hunting down possible circular references. Once this was done, the first example went down to its expected 20ms max latency, and the second one to its 0.5s processing time.
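Hunting down such cycles can be sketched with gc.DEBUG_SAVEALL (a hypothetical illustration, not the application's actual code; the Node class is made up for the demo):

```python
import gc

class Node(object):
    """Hypothetical object type that ends up in a reference cycle."""
    def __init__(self):
        self.ref = None

gc.collect()  # flush any pre-existing garbage first

# Build a two-object cycle, then drop every external reference to it.
a, b = Node(), Node()
a.ref, b.ref = b, a
del a, b

# DEBUG_SAVEALL makes the collector keep everything it finds in
# gc.garbage instead of freeing it, so the cycle can be inspected.
gc.set_debug(gc.DEBUG_SAVEALL)
found = gc.collect()
cycle_nodes = [o for o in gc.garbage if isinstance(o, Node)]
gc.set_debug(0)
del gc.garbage[:]  # clean up after inspection

print("unreachable objects:", found, "Nodes in cycle:", len(cycle_nodes))
```

Inspecting gc.garbage this way points at the objects participating in cycles, which can then be broken explicitly (e.g. with weak references).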


Here is a short Python script that demonstrates the issue; the JSON file in this case is 1.2 GB:

> import cjson, time, gc

> def read_json_blob():
>   t0 = time.time()
>   fd = file("mytestfile")
>   data = fd.read()
>   fd.close()
>   t1 = time.time()
>   parsed = cjson.decode(data)
>   t2 = time.time()
>   print "read file in %.2fs, parsed json in %.2fs, total of %.2fs" % \
>                                                   (t1-t0, t2-t1, t2-t0)

> read_json_blob()
read file in 10.57s, parsed json in 531.10s, total of 541.67s

> gc.disable()
> read_json_blob()
read file in 0.59s, parsed json in 15.13s, total of 15.72s

> gc.collect()
0

I don't understand how Python can work like this by default, at least not without a warning; to me it looks like a joke gone too far. Either all documentation ought to recommend that people disable the garbage collector at the first sign of performance problems, or the garbage collector problem should be fixed; hence the "Documentation" and "Interpreter Core" components in the ticket classification.
msg142311 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-08-18 12:11
Which version of Python did you test with?
Can you try with version 3.2?
msg142312 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-08-18 12:13
Can you try to write a short example to reproduce the problem?
msg142319 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-08-18 14:02
Some work was done on the GC a while ago to address this type of pattern (I think reducing quadratic complexity to amortized linear complexity). I'm adding Antoine and Martin, who were involved in this.

> Here is a short python script that demonstrates the issue, the JSON
> file in this case is 1.2GB large:

A couple remarks:
- please indicate the Python version you're using
- you should put your test in a script, with gc.disable() at the top, to avoid using any cache that might be used internally by cjson
- you should perform "echo 3 > /proc/sys/vm/drop_caches" before each run to make sure you start with a cold page/buffer cache (see how fast your file got read the second time you called read_json_blob()?)
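The collection pattern Charles-François mentions can be observed directly. A sketch using gc.get_stats() (available since Python 3.4; exact counts depend on the configured thresholds):

```python
import gc

gc.enable()  # make sure the collector is running for the measurement
gc.collect()
before = sum(s["collections"] for s in gc.get_stats())

# Allocating ~200,000 tracked containers crosses the generation-0
# threshold (700 by default) over and over, triggering collections;
# on old interpreters each full pass also got slower as the heap grew,
# which is where the quadratic behavior came from.
data = [[i] for i in range(200000)]

after = sum(s["collections"] for s in gc.get_stats())
print("collections triggered during allocation:", after - before)
```

Parsing a 1.2 GB JSON blob creates vastly more containers than this, so the same effect is magnified accordingly.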
msg142324 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-18 14:29
As Charles-François said: it would be nice to know which Python version you are using. Also, if you could test with 2.7 or 3.2 and the standard json module (or the latest simplejson if you prefer).

Also, while Python's performance is not always excellent, it is quite rare for it to be GC-limited, so the advice you are suggesting we give would be misleading for most use cases.

Note that instead of disabling the GC, you can tune its parameters by calling set_threshold. For example:

  u, v, w = gc.get_threshold()
  gc.set_threshold(u, v * 10, w * 50)

(this will make full collections occur 500 times less often)

Or you could disable the GC only when decoding JSON data and reenable it afterwards.
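That last suggestion can be wrapped in a small context manager (a sketch; gc_paused is a hypothetical helper name, and the switch is process-wide, so this is only safe in single-threaded code):

```python
import gc
from contextlib import contextmanager

@contextmanager
def gc_paused():
    """Disable the cyclic collector for the duration of a block and
    restore its previous state afterwards. Note: gc.disable() affects
    the whole process, not just the current thread."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()

# Hypothetical usage around a parse-heavy section:
with gc_paused():
    parsed = [{"i": i} for i in range(1000)]
```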
msg142327 - (view) Author: Daniel Svensson (dsvensson) Date: 2011-08-18 14:42
The bug note contains a test script. You just have to generate a huge JSON blob, or allocate 1 GB or so and pickle it down to a file, replacing the cjson.decode call with pickle.loads; pickle is about 20 times faster without the GC.
msg142328 - (view) Author: Daniel Svensson (dsvensson) Date: 2011-08-18 14:46
"Or you could disable the GC only when decoding JSON data and reenable it afterwards."

JSON was just an example. Also, isn't disabling/enabling the GC unsafe in a multithreaded environment?
msg142331 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-18 15:07
> The bug note contains a test script. You just have to generate a huge
> JSON-blob, or allocate 1 GB or so and pickle it down to file and
> replace the cjson.decode to pickle.loads, pickle is about 20 times
> faster without GC.

You can't say that any "huge JSON blob" will exhibit the same
performance characteristics as yours. I just tested json.loads() with a
rather trivial 300MB JSON string, and it took a couple of seconds (on
2.7). Does it reproduce your workload? I don't know.

> disabling/enabling the GC is not safe in a multithreaded environment?

You are right that disabling/enabling the GC is a global setting. But
that still leaves the option of tuning the parameters.
msg142343 - (view) Author: Daniel Svensson (dsvensson) Date: 2011-08-18 15:40
Working on getting a reasonably large file to test with on my laptop, where I have 2.6, 2.7 and 3.2. The Python version where the problem became apparent was 2.5.2, as the software is deployed on Debian Lenny.
msg142357 - (view) Author: Daniel Svensson (dsvensson) Date: 2011-08-18 16:23
Using the following script (except in the Python 2.5 case, where simplejson is used, which ought to be the same thing, right?):
import time, gc, json, sys

def read_json_blob():
	t0 = time.time()
	fd = open("datatest1.json")
	data = fd.read()
	fd.close()
	t1 = time.time()
	parsed = json.loads(data)
	t2 = time.time()
	print("read file in %.2fs, parsed json in %.2fs, total of %.2fs" % (t1-t0, t2-t1, t2-t0))

if len(sys.argv) > 1 and sys.argv[1] == "nogc":
	gc.disable()

read_json_blob()
print(gc.collect())

daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python3.2 gc.py nogc
read file in 1.34s, parsed json in 2.74s, total of 4.07s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python3.2 gc.py
read file in 1.33s, parsed json in 2.71s, total of 4.05s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py
read file in 0.89s, parsed json in 56.03s, total of 56.92s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py nogc
read file in 0.89s, parsed json in 56.38s, total of 57.27s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.7 gc.py
read file in 0.89s, parsed json in 3.87s, total of 4.75s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.7 gc.py nogc
read file in 0.89s, parsed json in 3.91s, total of 4.80s
0
daniel@aether:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.5 gc.py
read file in 0.11s, parsed json in 53.00s, total of 53.11s
0
daniel@aether:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.5 gc.py nogc
read file in 0.14s, parsed json in 53.13s, total of 53.28s
0

Each version performs about the same with and without the GC; nothing weird there, except that Python 3.2 seems to take more time to read the file. Nice performance improvement of the json module in 3.2 compared to older Python versions.


Next up: trying cjson, which decodes via a binary module:

import time, gc, cjson, sys

def read_json_blob():
	t0 = time.time()
	fd = open("datatest1.json")
	data = fd.read()
	fd.close()
	t1 = time.time()
	parsed = cjson.decode(data)
	t2 = time.time()
	print("read file in %.2fs, parsed json in %.2fs, total of %.2fs" % (t1-t0, t2-t1, t2-t0))

if len(sys.argv) > 1 and sys.argv[1] == "nogc":
	gc.disable()

read_json_blob()
print(gc.collect())

daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py
read file in 0.89s, parsed json in 2.58s, total of 3.46s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py nogc
read file in 0.89s, parsed json in 1.44s, total of 2.33s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.7 gc.py nogc
read file in 0.89s, parsed json in 1.53s, total of 2.42s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.7 gc.py
read file in 0.89s, parsed json in 1.54s, total of 2.43s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py nogc
read file in 0.89s, parsed json in 1.44s, total of 2.33s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py
read file in 0.89s, parsed json in 2.58s, total of 3.47s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py
read file in 0.89s, parsed json in 2.58s, total of 3.47s
0
daniel@neutronstar:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.6 gc.py nogc
read file in 0.89s, parsed json in 1.43s, total of 2.32s
0
daniel@aether:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.5 gc.py
read file in 0.14s, parsed json in 1.58s, total of 1.73s
0
daniel@aether:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.5 gc.py nogc
read file in 0.16s, parsed json in 1.07s, total of 1.23s
0
daniel@aether:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.5 gc.py
read file in 0.14s, parsed json in 1.58s, total of 1.72s
0
daniel@aether:~$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"; python2.5 gc.py nogc
read file in 0.14s, parsed json in 1.06s, total of 1.20s

The file is actually a bit too small for good measurement when using cjson, but the interesting point here is obviously the huge difference between GC and no GC in Python 2.5, and quite a win in 2.6 too, which becomes much more apparent with larger files.

Another interesting thing is that Python 2.6 with the GC disabled is consistently faster than 2.7, whether 2.7 has the GC enabled or disabled. cjson isn't compatible with Python 3.2, so I cannot verify how things behave there.

So overall it looks like it's less of a problem in newer versions of Python. We are phasing out the software that is deployed on Debian Lenny, so the problem will go away for us. I don't think I have any objections to closing this ticket again.
msg142359 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-08-18 16:29
IIRC the C acceleration for json was added in 2.7/3.2 and is used automatically when you import json.  In previous releases the json module was implemented in pure Python.  That explains why it's much slower on Python 2.5/2.6.
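Whether the C scanner is actually in use can be checked from Python (a sketch; json/decoder.py sets c_scanstring to None when the _json extension module is unavailable):

```python
import json.decoder

# json/decoder.py does "from _json import scanstring as c_scanstring"
# and falls back to c_scanstring = None if the C extension is missing,
# in which case the pure-Python scanner is used instead.
has_c_scanner = json.decoder.c_scanstring is not None
print("C accelerator in use:", has_c_scanner)
```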
msg142361 - (view) Author: Daniel Svensson (dsvensson) Date: 2011-08-18 16:35
Yes, but this ticket was only opened because of the time delta between no-GC and GC, not so much the absolute time. Still, that's a nice improvement.
msg142362 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-18 16:36
> So overall it looks like it's less of a problem in newer versions of
> Python. We are phasing out the software that is deployed on Debian
> Lenny so it's a problem that will go away. I don't think I have any
> objections with closing this ticket again.

Thanks for the comprehensive measurements. Indeed, it doesn't
seem to be a problem in 2.7/3.2.
msg142505 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-08-19 23:02
2.6 is in security-fix-only mode, and the OP agrees the issue is not applicable to current versions.
msg142692 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-08-22 07:58
For the record, the speed improvement in 2.7 likely results from 79276316b94b. GC used to be quadratic with the number of objects, and is now only linear.
History
Date User Action Args
2022-04-11 14:57:20 admin set github: 56984
2011-08-22 07:58:22 loewis set messages: + msg142692
2011-08-19 23:02:41 terry.reedy set status: open -> closed
    nosy: + terry.reedy
    messages: + msg142505
    resolution: out of date
2011-08-18 16:36:42 pitrou set messages: + msg142362
2011-08-18 16:35:55 dsvensson set messages: + msg142361
2011-08-18 16:33:23 dsvensson set versions: - Python 3.1, Python 2.7, Python 3.2, Python 3.3, Python 3.4
2011-08-18 16:29:15 ezio.melotti set nosy: + ezio.melotti
    messages: + msg142359
2011-08-18 16:23:23 dsvensson set messages: + msg142357
2011-08-18 15:40:28 dsvensson set messages: + msg142343
2011-08-18 15:07:39 pitrou set messages: + msg142331
2011-08-18 14:46:37 dsvensson set messages: + msg142328
2011-08-18 14:42:35 dsvensson set messages: + msg142327
2011-08-18 14:29:23 pitrou set messages: + msg142324
2011-08-18 14:02:28 neologix set nosy: + neologix, pitrou, loewis
    messages: + msg142319
2011-08-18 12:13:46 vstinner set messages: + msg142312
2011-08-18 12:12:46 vstinner set nosy: + vstinner
2011-08-18 12:11:26 amaury.forgeotdarc set nosy: + amaury.forgeotdarc
    messages: + msg142311
2011-08-18 08:05:48 daniel.urban set nosy: + daniel.urban
2011-08-18 08:00:55 dsvensson create