
classification
Title: Strange time complexity when creating nested lists
Type: performance
Components: None
Versions: Python 2.6

process
Status: closed
Resolution: works for me
Nosy List: pitrou, quakes, r.david.murray
Priority: normal

Created on 2011-11-05 23:56 by quakes, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (4)
msg147126 - Author: Jonas LM (quakes) Date: 2011-11-05 23:56
Consider the following snippets:

import time

def lists(n):
    # Build an n x n grid where every cell is a fresh, empty list.
    start_time = time.time()
    lists = [None] * n
    for i in xrange(n):
        lists[i] = [None] * n
        for j in xrange(n):
            lists[i][j] = []  # allocates a new list object per cell
    print time.time() - start_time

def simple_lists(n):
    # Same shape, but every cell holds the constant False.
    start_time = time.time()
    lists = [None] * n
    for i in xrange(n):
        lists[i] = [None] * n
        for j in xrange(n):
            lists[i][j] = False  # no per-cell allocation
    print time.time() - start_time

Both of these snippets seem like they should run in O(n^2), right?

Observe the following test runs:

>>> for i in [4000, 8000, 16000]: simple_lists(i)
2.11170578003
8.67467808723
34.0958559513
>>> for i in [1000, 2000, 4000]: lists(i)
1.13742399216
7.39806008339
78.0808939934

While simple_lists() seems to run roughly in O(n^2), lists() appears to run in O(n^3) or worse! Something funky is going on, and I have no idea what.
msg147127 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2011-11-06 00:12
In the case of lists(), a new object is allocated each time through the inner loop; in the case of simple_lists(), no such allocation is done. Your timing issues are probably related to the memory allocation behavior of your system.
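For illustration (this interpreter session is not from the original thread): each [] expression creates a distinct new object, whereas False is a single shared object, so simple_lists() performs no per-cell allocation:

>>> a, b = [], []
>>> a is b
False
>>> x, y = False, False
>>> x is y
True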
msg147130 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2011-11-06 01:09
It's because of the cyclic garbage collector. If you call gc.disable() at the beginning of your benchmark, you'll see that the runtimes become much more similar in the two cases.
You can also use tuples instead of lists as much as possible; this will reduce pressure on the GC.
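A minimal sketch of that suggestion applied to lists() (not from the original thread; lists_no_gc is a hypothetical name, and the gc.disable()/gc.enable() calls are the only change from the snippet above):

import gc
import time

def lists_no_gc(n):
    # lists() from above, with the cyclic garbage collector
    # switched off for the duration of the benchmark.
    gc.disable()
    try:
        start_time = time.time()
        lists = [None] * n
        for i in xrange(n):
            lists[i] = [None] * n
            for j in xrange(n):
                lists[i][j] = []
        print time.time() - start_time
    finally:
        gc.enable()  # restore normal collection afterwards

With the collector disabled, both functions should scale close to O(n^2): each collection pass traverses every tracked container, so with millions of live lists the collection work itself grows as the grid fills.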
msg147133 - Author: Jonas LM (quakes) Date: 2011-11-06 02:05
Confirmed that this was caused by the garbage collector, as pitrou suspected. Thanks!
History
Date User Action Args
2022-04-11 14:57:23 admin set github: 57560
2011-11-06 02:05:39 quakes set status: open -> closed; resolution: works for me; messages: + msg147133
2011-11-06 01:09:54 pitrou set nosy: + pitrou; messages: + msg147130
2011-11-06 00:12:10 r.david.murray set nosy: + r.david.murray; messages: + msg147127
2011-11-05 23:56:41 quakes create