Issue2246
Created on 2008-03-06 19:39 by asmodai, last changed 2008-03-06 22:53 by rhettinger.
| File name | Uploaded | Description |
| testcase.py | asmodai, 2008-03-06 19:39 | Testcase code |
| groupby-leak.diff | belopolsky, 2008-03-06 21:21 | |
| msg63332 (view) |
Author: Jeroen Ruigrok van der Werven (asmodai) |
Date: 2008-03-06 19:39 |
|
Quoting from my email to Raymond:
In the Trac/Genshi community we've been tracking a somewhat obscure memory
leak that causes us a lot of problems.
Please see http://trac.edgewall.org/ticket/6614 and then
http://genshi.edgewall.org/ticket/190 for background.
We reduced the case to the following Python-only code and believe it is
a bug in itertools' groupby. As Armin Ronacher explains in Genshi
ticket 190:
"Looks like genshi is not to blame. itertools.groupby has a grouper
with a reference to the groupby type but no traverse func. As soon as a
circular reference ends up in the groupby (which happens thanks to the
func_globals in our lambda) genshi leaks."
This can be demonstrated with the following code (also attached to this
issue as testcase.py):
import gc
from itertools import groupby

def run():
    keyfunc = lambda x: x
    for i, j in groupby(range(100), key=keyfunc):
        keyfunc.x = j

for x in xrange(20):
    gc.collect()
    run()
    print len(gc.get_objects())
Executing this prints the number of objects tracked by the garbage
collector after each iteration, and every count is 4 higher than the
previous one. As Armin specifies, the leaked objects are:
"a frame, a grouper, a keyfunc and a groupby object"
We have been unable to come up with a decent patch and thus I am
logging this issue now.
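
For readers on Python 3, the test case above can be adapted into a minimal sketch that observes the cycle directly instead of counting objects. It assumes a current CPython, where the grouper type is GC-aware; the helper names `make_cycle` and `cycle_survives_until_collect` are hypothetical, and the weak reference exists only to watch whether `keyfunc` is still alive.

```python
import gc
import weakref
from itertools import groupby


def make_cycle():
    """Build the keyfunc -> grouper -> groupby -> keyfunc cycle from the report.

    Returns a weak reference to keyfunc so the caller can observe its lifetime.
    """
    keyfunc = lambda x: x
    for _, group in groupby(range(100), key=keyfunc):
        # The grouper refers back to its groupby parent, which holds keyfunc.
        keyfunc.x = group
    return weakref.ref(keyfunc)


def cycle_survives_until_collect():
    """Return (alive_before_collect, alive_after_collect) for keyfunc."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections from interfering with the check
    try:
        ref = make_cycle()
        # Reference counting alone cannot free objects that form a cycle.
        alive_before = ref() is not None
        gc.collect()
        # With a GC-aware grouper, the collector can now reclaim the cycle.
        alive_after = ref() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(cycle_survives_until_collect())
```

On an interpreter affected by this issue, the second value would presumably stay true, because the collector cannot see through the grouper to the rest of the cycle.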
|
| msg63335 (view) |
Author: Alexander Belopolsky (belopolsky) |
Date: 2008-03-06 20:48 |
|
With the following patch:
===================================================================
--- Lib/test/test_itertools.py (revision 61284)
+++ Lib/test/test_itertools.py (working copy)
@@ -707,6 +707,12 @@
         a = []
         self.makecycle(takewhile(bool, [1, 0, a, a]), a)
+    def test_issue2246(self):
+        n = 10
+        keyfunc = lambda x: x
+        for i, j in groupby(xrange(n), key=keyfunc):
+            keyfunc.__dict__.setdefault('x',[]).append(j)
+
 def R(seqn):
     'Regular generator'
     for i in seqn:
$ ./python Lib/test/regrtest.py -R :: test_itertools
reports n*3 + 13 reference leaks. This should give a clue ...
|
| msg63336 (view) |
Author: Alexander Belopolsky (belopolsky) |
Date: 2008-03-06 21:05 |
|
It looks like the problem is that the internal grouper object becomes a
part of a cycle: keyfunc -> grouper(x) -> keyfunc(tgtkey), but its type
does not support GC. I will try to come up with a patch.
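
The links in this cycle can be inspected from Python on an interpreter where the grouper type does support GC. The following is only a sketch, assuming a current CPython; `inspect_grouper_cycle` is a hypothetical helper, and `gc.is_tracked` and `gc.get_referents` report only what each type's traverse function exposes to the collector.

```python
import gc
from itertools import groupby


def inspect_grouper_cycle():
    """Return (grouper_tracked, grouper_sees_parent, groupby_sees_keyfunc)."""
    keyfunc = lambda x: x
    groups = groupby(range(3), key=keyfunc)
    _, grouper = next(groups)
    keyfunc.x = grouper  # closes the cycle: keyfunc -> grouper -> groupby -> keyfunc

    # A type without GC support is never tracked by the collector.
    grouper_tracked = gc.is_tracked(grouper)
    # get_referents relies on the traverse function of the object's type.
    grouper_sees_parent = any(obj is groups for obj in gc.get_referents(grouper))
    groupby_sees_keyfunc = any(obj is keyfunc for obj in gc.get_referents(groups))

    keyfunc.__dict__.clear()  # break the cycle explicitly before returning
    return grouper_tracked, grouper_sees_parent, groupby_sees_keyfunc


if __name__ == "__main__":
    print(inspect_grouper_cycle())
```

If the grouper were not GC-aware, the first two values would be expected to be false, which is the situation that leaves the cycle uncollectable.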
|
| msg63337 (view) |
Author: Raymond Hettinger (rhettinger) |
Date: 2008-03-06 21:07 |
|
No need. I'm already working on adding GC to the grouper.
|
| msg63338 (view) |
Author: Alexander Belopolsky (belopolsky) |
Date: 2008-03-06 21:21 |
|
Oops. Here is my patch anyways.
|
| msg63339 (view) |
Author: Paul Pogonyshev (_doublep) |
Date: 2008-03-06 21:32 |
|
Damn, I wrote a patch too ;)
|
| msg63340 (view) |
Author: Raymond Hettinger (rhettinger) |
Date: 2008-03-06 22:53 |
|
r61286. Applied a patch substantially similar to Alexander's. Thanks
for the test case and the report.
|
| Date | User | Action | Args |
| 2008-03-06 22:53:11 | rhettinger | set | status: open -> closed; resolution: fixed; messages: + msg63340 |
| 2008-03-06 21:32:20 | _doublep | set | nosy: + _doublep; messages: + msg63339 |
| 2008-03-06 21:21:47 | belopolsky | set | files: + groupby-leak.diff; keywords: + patch; messages: + msg63338 |
| 2008-03-06 21:07:18 | rhettinger | set | assignee: rhettinger; messages: + msg63337 |
| 2008-03-06 21:05:44 | belopolsky | set | messages: + msg63336 |
| 2008-03-06 20:53:49 | aronacher | set | nosy: + aronacher |
| 2008-03-06 20:48:09 | belopolsky | set | nosy: + belopolsky; messages: + msg63335 |
| 2008-03-06 19:39:28 | asmodai | create | |