
Author vstinner
Recipients pitrou, serhiy.storchaka, vstinner
Date 2013-04-03.21:29:12
Message-id <1365024552.84.0.0461498193244.issue17628@psf.upfronthosting.co.za>
In-reply-to
Content
In Python 3.4, str==str is implemented by calling memcmp().

The unicode_eq() function, used by the dict and set types, checks the first byte before calling memcmp(). bytes==bytes uses the same check.
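A minimal sketch of the first-byte check described above, assuming plain byte buffers with explicit lengths. The name bytes_equal_first_check() is hypothetical and this is not the CPython source; it only illustrates the idea of rejecting a mismatch before paying for the memcmp() call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not CPython code): equality test on two byte
   buffers that compares the first byte before calling memcmp(). */
static int
bytes_equal_first_check(const unsigned char *a, size_t a_len,
                        const unsigned char *b, size_t b_len)
{
    if (a_len != b_len)
        return 0;               /* different lengths: never equal */
    if (a_len == 0)
        return 1;               /* two empty buffers are equal */
    if (a[0] != b[0])
        return 0;               /* cheap rejection, no function call */
    return memcmp(a, b, a_len) == 0;
}
```

This check only helps when the buffers differ in their first byte; two buffers with a common prefix still go through the full memcmp().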

The Py_UNICODE_MATCH macro has checked the first *and* last character before calling memcmp() since this commit:
---
changeset:   38242:0de9a789de39
branch:      legacy-trunk
user:        Fredrik Lundh <fredrik@pythonware.com>
date:        Tue May 23 10:10:57 2006 +0000
files:       Include/unicodeobject.h
description:
needforspeed: check first *and* last character before doing a full memcmp
---

The attached patch changes str==str to check the first and last characters before calling memcmp(). This might save the overhead of a C function call, but more importantly it is much faster when comparing two different strings of the same length that share a common prefix but have a different suffix.
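The idea can be sketched as follows, assuming strings stored as arrays of 4-byte code points. The function name ucs4_equal_first_last() is hypothetical and the code is not taken from the attached patch; it only shows where the two extra comparisons sit relative to memcmp().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the attached patch): equality test that
   compares the first and last characters before calling memcmp(). */
static int
ucs4_equal_first_last(const uint32_t *a, size_t a_len,
                      const uint32_t *b, size_t b_len)
{
    if (a_len != b_len)
        return 0;               /* different lengths: never equal */
    if (a_len == 0)
        return 1;               /* two empty strings are equal */
    if (a[0] != b[0] || a[a_len - 1] != b[a_len - 1])
        return 0;               /* rejected without scanning the prefix */
    return memcmp(a, b, a_len * sizeof(uint32_t)) == 0;
}
```

With a long common prefix and a different final character, the last-character test returns immediately, whereas memcmp() alone would have to scan the whole prefix before finding the difference.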

The patch also merges unicode_compare_eq() and unicode_eq() so that str, dict and set use the same code.

We may use the same optimization on byte strings.

See also #16321.
History
Date                 User      Action  Args
2013-04-03 21:29:12  vstinner  set     recipients: + vstinner, pitrou, serhiy.storchaka
2013-04-03 21:29:12  vstinner  set     messageid: <1365024552.84.0.0461498193244.issue17628@psf.upfronthosting.co.za>
2013-04-03 21:29:12  vstinner  link    issue17628 messages
2013-04-03 21:29:12  vstinner  create