Message 193914 - Python tracker



Author	scoder
Recipients	scoder, serhiy.storchaka
Date	2013-07-30.06:49:19
Message-id	<1375166960.42.0.521563124847.issue18594@psf.upfronthosting.co.za>

Content
The C accelerator for the collections.Counter class (_count_elements() in _collectionsmodule.c) is slower than the pure Python version for data that has many unique entries. This is because the fast path for dicts is not taken (it requires an exact dict, and Counter is a subtype of dict), and the slower fallback path raises an exception for each value that wasn't previously seen. This can apparently make it slower than calling get() on the Python side.
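
To make the cost difference concrete, here is a minimal pure Python sketch of the two counting strategies. It is an illustration only, not the C code from the accelerator: the function names count_with_get() and count_with_exceptions() are made up for this example, and a plain dict is used as the mapping so that a missing key really does raise KeyError.

```python
import timeit


def count_with_get(mapping, iterable):
    """Count elements using get() with a default, so no exception is raised."""
    mapping_get = mapping.get
    for elem in iterable:
        mapping[elem] = mapping_get(elem, 0) + 1
    return mapping


def count_with_exceptions(mapping, iterable):
    """Count elements by looking up first and handling KeyError on a miss.

    Every previously unseen element raises and catches one exception,
    which is the pattern described for the fallback path.
    """
    for elem in iterable:
        try:
            mapping[elem] += 1
        except KeyError:
            mapping[elem] = 1
    return mapping


if __name__ == "__main__":
    # Worst case for the exception-based variant: every entry is unique.
    unique_data = list(range(50000))
    for func in (count_with_get, count_with_exceptions):
        seconds = timeit.timeit(lambda: func({}, unique_data), number=5)
        print(func.__name__, round(seconds, 4))
```

Both functions produce the same counts; only the handling of unseen keys differs. The timings depend on the interpreter version and the machine, so they are best measured locally rather than assumed.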

My suggestion is to drop the fallback path from the accelerator completely and to call the C function only when it is safe to do so, e.g. when "type(self) is Counter" holds, and not for subclasses.
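
The suggested guard could look roughly like the sketch below. This is an assumption about the shape of such a change, not the code that was committed: count_into() and _count_elements_py() are hypothetical names, and the import of _count_elements from the private _collections module is wrapped in try/except because that helper is a CPython implementation detail.

```python
from collections import Counter


def _count_elements_py(mapping, iterable):
    """Pure Python counting loop based on get(); works for any mapping."""
    mapping_get = mapping.get
    for elem in iterable:
        mapping[elem] = mapping_get(elem, 0) + 1


try:
    # CPython's C helper; a private implementation detail that may be absent.
    from _collections import _count_elements as _count_elements_c
except ImportError:
    _count_elements_c = _count_elements_py


def count_into(counter, iterable):
    """Use the C helper only for exact Counter instances (hypothetical wrapper)."""
    if type(counter) is Counter:
        # Exact type: no subclass overrides can be bypassed or slowed down.
        _count_elements_c(counter, iterable)
    else:
        # Subclasses take the Python loop, which respects their overrides.
        _count_elements_py(counter, iterable)
    return counter
```

With this split the C function never needs a generic fallback, because it is only ever handed an exact Counter; subclasses keep the behaviour of their own get() and __setitem__().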

History
Date	User	Action	Args
2013-07-30 06:49:20	scoder	set	recipients: + scoder, serhiy.storchaka
2013-07-30 06:49:20	scoder	set	messageid: <1375166960.42.0.521563124847.issue18594@psf.upfronthosting.co.za>
2013-07-30 06:49:20	scoder	link	issue18594 messages
2013-07-30 06:49:19	scoder	create