Message 65293 - Python tracker

Author	eisele
Recipients	eisele
Date	2008-04-10.16:21:39
Message-id	<1207844501.86.0.649423716831.issue2607@psf.upfronthosting.co.za>
In-reply-to

Content

I need to count pairs of strings, and I use 
a defaultdict in a construct like

count[a,b] += 1

I am able to count 50K items per second on a very fast machine,
which is way too slow for my application.

If I count complete strings like

count[ab] += 1

it can count 500K items/second, which is more reasonable.

I don't see why there is a performance penalty of a factor
of 10 for such a simple construct.

Do I have to switch to Perl or C to get this done???

Thanks a lot for any insight on this.

Best regards,
Andreas

PS: The problem seems to exist for ordinary dicts as well,
so it is not related to the fact that I use a defaultdict.

PPS: I also tried nested defaultdicts:
count[a][b] += 1
and got the same slow speed (and 50% more memory consumption).
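
For anyone who wants to reproduce the comparison, here is a minimal sketch of the three counting variants described above. It is not the code from the original report: the function names, the synthetic test data, and the tab separator used to build the combined string key are all assumptions made for illustration, and the timings it prints will depend on the machine and the Python version.

```python
from collections import defaultdict
import timeit


def count_tuple_keys(pairs):
    """Count pairs using a tuple key, as in count[a,b] += 1."""
    count = defaultdict(int)
    for a, b in pairs:
        count[a, b] += 1
    return count


def count_joined_keys(pairs, sep="\t"):
    """Count pairs using one combined string key, as in count[ab] += 1.

    The separator is an assumption; it keeps ("ab", "c") and ("a", "bc")
    from collapsing into the same key.
    """
    count = defaultdict(int)
    for a, b in pairs:
        count[a + sep + b] += 1
    return count


def count_nested(pairs):
    """Count pairs using nested defaultdicts, as in count[a][b] += 1."""
    count = defaultdict(lambda: defaultdict(int))
    for a, b in pairs:
        count[a][b] += 1
    return count


def make_pairs(n=100_000, vocab=1_000):
    """Build hypothetical test data: n pairs drawn from a small vocabulary."""
    words = ["w%d" % i for i in range(vocab)]
    return [(words[i % vocab], words[(i * 7) % vocab]) for i in range(n)]


if __name__ == "__main__":
    pairs = make_pairs()
    for fn in (count_tuple_keys, count_joined_keys, count_nested):
        # Average over a few runs; absolute numbers vary by machine.
        seconds = timeit.timeit(lambda: fn(pairs), number=3) / 3
        print("%-18s %10.0f items/second" % (fn.__name__, len(pairs) / seconds))
```

Running the script prints an items/second figure for each variant, which makes it easier to check whether the slowdown comes from the key type itself or from something else in the surrounding program, such as how the keys are produced or how much memory the table needs.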
