Message 146508 - Python tracker

Author	RichIsMyName
Recipients	RichIsMyName, asmodai, loewis, pitrou, scoder
Date	2011-10-27.17:52:40
Message-id	<1319737961.76.0.990787408068.issue13279@psf.upfronthosting.co.za>

Content
In discussions of memcmp performance (http://www.picklingtools.com/study.pdf), it was noted how well Python 2.7 can take advantage of faster memcmps (indeed, the rich comparisons are all memcmp calls). There has been some discussion on python-dev@python.org as well.

With unicode and Python 3.3 (and any Python 3.x), there are a few places where we could call memcmp to make string comparisons faster, but they are not completely trivial.

Basically, if the unicode strings are "1 byte kind", then memcmp can be used almost as is.  If the unicode strings are the same kind, they can at least use memcmp to compare for equality or inequality.
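To make the two cases above concrete, here is a minimal sketch in C. It is not the CPython implementation: the `sketch_str` struct and both function names are hypothetical stand-ins invented for this illustration. It assumes that "almost as is" refers to handling strings of different lengths after the common prefix has been compared. Wider kinds get only an equality test here because the byte order that memcmp sees does not match code point order on little-endian machines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a unicode string; NOT CPython's real layout. */
typedef struct {
    int kind;          /* bytes per code point: 1, 2, or 4 */
    size_t length;     /* number of code points */
    const void *data;  /* length * kind bytes of code points */
} sketch_str;

/* Three-way comparison of two 1-byte-kind strings: returns -1, 0, or 1.
   memcmp compares bytes as unsigned char, which matches code point
   order when every code point fits in one byte. */
static int sketch_compare_1byte(const sketch_str *a, const sketch_str *b)
{
    assert(a->kind == 1 && b->kind == 1);
    size_t n = a->length < b->length ? a->length : b->length;
    int c = memcmp(a->data, b->data, n);
    if (c != 0)
        return c < 0 ? -1 : 1;
    /* The common prefix is equal, so the shorter string sorts first. */
    if (a->length == b->length)
        return 0;
    return a->length < b->length ? -1 : 1;
}

/* Equality test for two strings of the same kind: returns 1 or 0.
   Only "same bytes or not" is asked, so byte order does not matter. */
static int sketch_equal_same_kind(const sketch_str *a, const sketch_str *b)
{
    assert(a->kind == b->kind);
    if (a->length != b->length)
        return 0;
    return memcmp(a->data, b->data, a->length * (size_t)a->kind) == 0;
}
```

Both helpers assert their preconditions rather than handling mixed kinds; a real implementation would need a fallback path for that case.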

There is also a minor optimization lying in unicode_compare: if you are comparing two strings for equality/inequality, there is no reason to look at the entire string if the lengths are different.
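The length shortcut could look roughly like the following sketch. All names here (`sketch_op`, `sketch_str`, `sketch_full_compare`, `sketch_richcompare`) are hypothetical and invented for illustration, and the string is simplified to an array of 4-byte code points; this is not how unicode_compare is actually written. The point is only that an equality or inequality request can be answered from the lengths alone when they differ, before any string data is read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical comparison operators, loosely modelled on rich comparison. */
typedef enum {
    SKETCH_LT, SKETCH_LE, SKETCH_EQ, SKETCH_NE, SKETCH_GT, SKETCH_GE
} sketch_op;

/* Hypothetical simplified string: 4-byte code points only. */
typedef struct {
    size_t length;
    const uint32_t *data;
} sketch_str;

/* Slow path: walk both strings code point by code point.
   Returns -1, 0, or 1. */
static int sketch_full_compare(const sketch_str *a, const sketch_str *b)
{
    size_t n = a->length < b->length ? a->length : b->length;
    for (size_t i = 0; i < n; i++) {
        if (a->data[i] != b->data[i])
            return a->data[i] < b->data[i] ? -1 : 1;
    }
    if (a->length == b->length)
        return 0;
    return a->length < b->length ? -1 : 1;
}

/* Returns 1 if "a op b" holds, 0 otherwise. */
static int sketch_richcompare(const sketch_str *a, const sketch_str *b,
                              sketch_op op)
{
    assert(a != NULL && b != NULL);
    /* Shortcut: strings of different lengths can never be equal,
       so == and != are decided without touching the data. */
    if ((op == SKETCH_EQ || op == SKETCH_NE) && a->length != b->length)
        return op == SKETCH_NE;

    int c = sketch_full_compare(a, b);
    switch (op) {
    case SKETCH_LT: return c < 0;
    case SKETCH_LE: return c <= 0;
    case SKETCH_EQ: return c == 0;
    case SKETCH_NE: return c != 0;
    case SKETCH_GT: return c > 0;
    default:        return c >= 0;  /* SKETCH_GE */
    }
}
```

The shortcut applies only to equality and inequality; ordering comparisons still have to inspect the data, because a shorter string can sort either before or after a longer one.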

These three minor optimizations can make unicode_compare faster.

History
Date	User	Action	Args
2011-10-27 17:52:41	RichIsMyName	set	recipients: + RichIsMyName, loewis, pitrou, scoder, asmodai
2011-10-27 17:52:41	RichIsMyName	set	messageid: <1319737961.76.0.990787408068.issue13279@psf.upfronthosting.co.za>
2011-10-27 17:52:41	RichIsMyName	link	issue13279 messages
2011-10-27 17:52:40	RichIsMyName	create