classification
Title: garbage collector blocks and takes worst-case linear time wrt number of objects
Type: resource usage Stage:
Components: Interpreter Core Versions: Python 3.0, Python 2.4, Python 2.6
process
Status: closed Resolution: duplicate
Dependencies: Superseder:
Assigned To: Nosy List: LambertDW, benjamin.peterson, darrenr, loewis
Priority: normal Keywords:

Created on 2008-12-31 19:34 by darrenr, last changed 2009-01-06 18:36 by darrenr. This issue is now closed.

Files
File name Uploaded Description Edit
gctimings.zip darrenr, 2008-12-31 19:34 garbage collection timings
Messages (9)
msg78646 - (view) Author: (darrenr) Date: 2008-12-31 19:34
Python's garbage collector holds GIL during collection and doesn't
provide any method of interruption or concurrency with other Python
threads within a single Python VM. This can be a problem for realtime
applications. The worst-case performance of the garbage collector takes
linear time with respect to the number of Python objects that could
potentially be involved in a garbage cycle. I've attached timings taken
on a Core 2 Quad 2.4 GHz (WinXP Pro, 3GB RAM), with ever-increasing
numbers of objects. The gc at worst takes upwards of 3 seconds before
the process runs out of memory.

If gc periodically released the GIL it would allow it to be put in a
separate thread, but as it stands it blocks the Python VM for periods of
time that are too long for realtime interactive applications.
Alternatively a gc that is incremental by default would eliminate the
need for a second thread.
msg78649 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-12-31 19:58
The garbage collector will never be able to run in a second thread
because it manipulates Python objects, which the GIL is supposed to protect!

As for non-linear complexity, see #4688 and #4074 for some attempts to
sooth this problem over.
msg78662 - (view) Author: (darrenr) Date: 2008-12-31 22:52
A 'stop-the-world' garbage collector that periodically released the GIL
could be run in a second thread, allowing the main thread to break in
and do some processing. However the nature of a stop-the-world collector
means that it probably would not easily be able to deal with changes
made by other threads in the middle of the collect.

My concern is that the Python process blocks and is unresponsive due to
garbage collection for periods of time that are not acceptable for
realtime interactive applications. Are there any plans to add an
incremental collector to Python?
msg78825 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-01-02 16:01
Hard real-time applications written in Python should not rely on the
cyclic garbage collector. They should call gc.disable at startup, and
completely rely on reference counting for releasing memory. Doing so
might require rewrites to the application, explicitly breaking cycles so
that reference counting can release them.

Even with cyclic gc disabled, applications worried about meeting hard
deadlines need to look even more into memory allocation and
deallocation; e.g. releasing a single object may cause a chained release
of many objects, which can affect worst case execution times. There are
more issues to consider (which are all out of scope of the bug tracker).
msg78926 - (view) Author: (darrenr) Date: 2009-01-03 02:44
OK cool, that's the development strategy we've already adopted. Is this
limitation of Python's garbage collector in relation to real-time
applications documented anywhere?
msg78935 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-01-03 07:36
> OK cool, that's the development strategy we've already adopted. Is this
> limitation of Python's garbage collector in relation to real-time
> applications documented anywhere?

Why do you ask? (this is OT for the bug tracker)
It's not in the Python documentation. However, any good
book on real-time systems will tell you that garbage collection
is a problem.
msg79186 - (view) Author: (darrenr) Date: 2009-01-05 19:14
I ask because in my opinion a three-second pause on a modern machine is
significant for any program with any sort of interactivity--significant
enough to warrant a warning in the documentation. Python is a great
language and I think it deserves an incremental implementation of
garbage collection.
msg79189 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-01-05 19:34
> Python is a great
> language and I think it deserves an incremental implementation of
> garbage collection.

Python's cyclic garbage collector is incremental. If you can provide
a specific patch to replace it with something "better", please submit
it as a separate issue to the tracker. If you can hire somebody to
implement such a thing, go right ahead. Otherwise, I think there is
little that can be done. Python gets all of its contributions from
volunteers.
msg79277 - (view) Author: (darrenr) Date: 2009-01-06 18:36
Regardless of the type of algorithm used by the garbage collector that
currently exists in Python, its worst-case performance is undesirable. I
have some interest in implementing a garbage collector for Python with
improved performance characteristics but don't really have the time for
it. Hopefully someone does. In any case thanks for hearing me out.
History
Date User Action Args
2009-01-06 18:36:01darrenrsetmessages: + msg79277
2009-01-05 19:34:58loewissetmessages: + msg79189
2009-01-05 19:14:44darrenrsetmessages: + msg79186
2009-01-03 07:36:12loewissetmessages: + msg78935
2009-01-03 02:44:57darrenrsetmessages: + msg78926
2009-01-02 16:01:39loewissetnosy: + loewis
messages: + msg78825
2008-12-31 22:52:01darrenrsetmessages: + msg78662
2008-12-31 19:58:42benjamin.petersonsetstatus: open -> closed
resolution: duplicate
messages: + msg78649
nosy: + benjamin.peterson
2008-12-31 19:55:16LambertDWsetnosy: + LambertDW
2008-12-31 19:35:18darrenrsettype: resource usage
components: + Interpreter Core
2008-12-31 19:34:45darrenrcreate