Author amaury.forgeotdarc
Recipients amaury.forgeotdarc, belopolsky, eisele, loewis, rhettinger
Date 2008-04-11.08:44:35
The slowdown is caused by the garbage collector, which has more and
more objects (the tuples) to traverse.
If I add "import gc; gc.disable()" at the beginning of your script, it
runs much faster, and the timings look linear.
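
To make this easy to check, here is a minimal sketch of the comparison. The original script is not reproduced here, so build_dict is a hypothetical stand-in that just fills a dict with tuple keys, and timed_build is a helper name I made up. It assumes a Python with time.perf_counter; the sizes are arbitrary, and the gap may be much smaller on newer interpreters.

```python
import gc
import time


def build_dict(n):
    """Fill a dict with tuple keys (hypothetical stand-in for the reported script)."""
    d = {}
    for i in range(n):
        d[(i, i + 1)] = i  # every key is a freshly allocated tuple
    return d


def timed_build(n, disable_gc):
    """Return (seconds, dict) for one build, optionally with the cyclic gc off."""
    was_enabled = gc.isenabled()
    if disable_gc:
        gc.disable()
    try:
        start = time.perf_counter()
        d = build_dict(n)
        elapsed = time.perf_counter() - start
    finally:
        if was_enabled:
            gc.enable()  # put the collector back the way we found it
    return elapsed, d


if __name__ == "__main__":
    # Doubling n lets you see whether the timings grow linearly.
    for n in (100_000, 200_000, 400_000):
        with_gc, _ = timed_build(n, disable_gc=False)
        without_gc, _ = timed_build(n, disable_gc=True)
        print(f"n={n}: gc on {with_gc:.3f}s, gc off {without_gc:.3f}s")
```

If the "gc on" column grows faster than n while the "gc off" column roughly doubles at each step, that matches the explanation above.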

Martin's sample is not affected, because there are very few
deallocations, and the gc collection is not triggered.

Disabling the gc may not be a good idea in a real application; I suggest
you experiment with the gc.set_threshold function and set larger values, at
least while building the dictionary. (700, 1000, 10) seems to yield good
results.
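
One possible way to apply larger thresholds only while the dictionary is being built is sketched below. The gc_thresholds context manager and build_dict are hypothetical names for illustration, not part of the gc module; only gc.get_threshold and gc.set_threshold are real calls. The values are the ones mentioned above and should be treated as a starting point to tune, not a recommendation.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_thresholds(gen0, gen1, gen2):
    """Temporarily replace the gc thresholds, restoring the old ones on exit."""
    old = gc.get_threshold()
    gc.set_threshold(gen0, gen1, gen2)
    try:
        yield
    finally:
        gc.set_threshold(*old)


def build_dict(n):
    """Fill a dict with tuple keys (hypothetical stand-in for the real workload)."""
    d = {}
    for i in range(n):
        d[(i, i + 1)] = i
    return d


if __name__ == "__main__":
    # Collections of the older generations happen far less often
    # inside this block; the previous thresholds apply again afterwards.
    with gc_thresholds(700, 1000, 10):
        d = build_dict(500_000)
    print(len(d), gc.get_threshold())
```

Unlike gc.disable(), this keeps the collector running, so reference cycles created during the build are still reclaimed eventually.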

Date                 User                Action  Args
2008-04-11 08:44:36  amaury.forgeotdarc  link    issue2607 messages
2008-04-11 08:44:35  amaury.forgeotdarc  create