This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Optimize finding the max character width
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: pitrou, python-dev, vstinner
Priority: normal Keywords: patch

Created on 2011-10-11 23:14 by pitrou, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
find_max_char.patch pitrou, 2011-10-11 23:14 review
find_max_char2.patch pitrou, 2011-10-11 23:22 review
find_max_char3.patch pitrou, 2011-10-11 23:32 review
find_max_char4.patch pitrou, 2011-10-11 23:50 review
find_max_char5.patch vstinner, 2011-10-12 21:10
Messages (12)
msg145372 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-11 23:14
This patch optimizes scanning for the max character width in an unicode buffer.

Micro-benchmarking some worst case situations:

$ ./python -m timeit -s "x='é'+'x'*100000" "x[1:]"
-> before:
10000 loops, best of 3: 74.9 usec per loop
-> after:
100000 loops, best of 3: 15.5 usec per loop

$ ./python -m timeit -s "x='\U00010000'+'x'*100000" "x[1:]"
-> before:
10000 loops, best of 3: 138 usec per loop
-> after:
10000 loops, best of 3: 71.3 usec per loop
msg145373 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-11 23:22
I hadn't noticed the STRINGLIB_SIZEOF_CHAR constant. Reuse it instead of adding STRINGLIB_CHAR_SIZE.
msg145374 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-11 23:32
Slightly cleaned up patch after Victor's comments in private.
msg145375 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-10-11 23:42
find_max_char() returns 0x10000 instead of 0x10FFFF, which may be wrong (or at least, surprising). You may add a max_char variable using other macros like MAX_CHAR_ASCII, MAX_CHAR_UCS1, ..., which will be set at the same time than mask. Or restore your if (ret >= 0x10000) return 0x10ffff; else return ret;.

Constants look inconsistent:

+    const STRINGLIB_CHAR *unrolled_end = begin + (n & ~ (Py_ssize_t) 7);
+        p += 4;

You may use STRINGLIB_CHAR in:

+        Py_UCS4 bits = p[0] | p[1] | p[2] | p[3];
msg145376 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-11 23:50
Ok, updated patch.
msg145377 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-10-12 00:11
> Ok, updated patch.

"ret = ~mask + 1;" looks wrong: (~0xFFFFFF80+1) gives 128, not 127. I don't see why you need:

+    if (ret < 128)
+        return 127;
+    if (ret < 256)
+        return 255;

#undef ASCII_CHAR_MASK should be #undef UCS1_ASCII_CHAR_MASK


#error Invalid STRINGLIB_SIZEOF_CHAR (must be 1, 2 or 4)
should be
#error Invalid STRINGLIB_SIZEOF_CHAR (must be 2 or 4)

Why do you need these forward declarations? It's maybe related to another patch?

 static PyObject *
+unicode_fromascii(const unsigned char *s, Py_ssize_t size);
+static PyObject *
+_PyUnicode_FromUCS1(const unsigned char *s, Py_ssize_t size);
+static PyObject *
+_PyUnicode_FromUCS2(const Py_UCS2 *s, Py_ssize_t size);
+static PyObject *
+_PyUnicode_FromUCS4(const Py_UCS4 *s, Py_ssize_t size);

(You kept Py_UCS4 for "Py_UCS4 bits", but Py_UCS4 is maybe just fine.)

By the way, your function rocks :-) Use bit masks is a great idea, especially your "bits = p[0] | p[1] | p[2] | p[3]" "hack".
msg145378 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-12 00:24
> > Ok, updated patch.
> 
> "ret = ~mask + 1;" looks wrong: (~0xFFFFFF80+1) gives 128, not 127.

That's on purpose, since the mask has just matched. If 0xFFFFFF80
matches, then the max char can't be 127, it has to be at least 128.

>  I don't see why you need:
> 
> +    if (ret < 128)
> +        return 127;
> +    if (ret < 256)
> +        return 255;

Because otherwise you're returning 0xFFFF in the following line:

    if (ret < 0x10000)
        return 0xFFFF;

> #undef ASCII_CHAR_MASK should be #undef UCS1_ASCII_CHAR_MASK

Ha, funny that gcc doesn't say anything.
I also forgot to #under MASK_*.

> #error Invalid STRINGLIB_SIZEOF_CHAR (must be 1, 2 or 4)
> should be
> #error Invalid STRINGLIB_SIZEOF_CHAR (must be 2 or 4)

Well, no. 1 is a legal value, even though it's handled elsewhere, so the
error message is right.

> Why do you need these forward declarations? It's maybe related to another patch?

It's because I'm moving the stringlib #includes up in unicodeobject.c,
and stringlib calls these functions.
msg145427 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-10-12 20:21
Without the patch:

python3.2 -m timeit 'x="é"+"x"*10000' 'x[1:]'
100000 loops, best of 3: 2.18 usec per loop
                                                                                     
python3.3 -m timeit 'x="é"+"x"*10000' 'x[1:]'
100000 loops, best of 3: 6.68 usec per loop
msg145428 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-10-12 20:22
With find_max_char4.patch:

python3.3 -m timeit 'x="é"+"x"*10000' 'x[1:]'
100000 loops, best of 3: 1.96 usec per loop
msg145432 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-10-12 21:10
find_max_char5.patch:
 - don't use adjusted ~mask+1: "precompute" the right max_char
 - rename findwidth.h to find_max_char.h
 - add some #undef
msg145437 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-10-12 22:08
New changeset 9c7d3207fc15 by Antoine Pitrou in branch 'default':
Issue #13155: Optimize finding the optimal character width of an unicode string
http://hg.python.org/cpython/rev/9c7d3207fc15
msg145438 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-12 22:08
> find_max_char5.patch:
> - don't use adjusted ~mask+1: "precompute" the right max_char
> - rename findwidth.h to find_max_char.h
> - add some #undef

Thank you, I've committed this version.
History
Date User Action Args
2022-04-11 14:57:22adminsetgithub: 57364
2011-10-12 22:08:58pitrousetstatus: open -> closed
resolution: fixed
messages: + msg145438

stage: resolved
2011-10-12 22:08:07python-devsetnosy: + python-dev
messages: + msg145437
2011-10-12 21:10:12vstinnersetfiles: + find_max_char5.patch

messages: + msg145432
2011-10-12 20:22:36vstinnersetmessages: + msg145428
2011-10-12 20:21:30vstinnersetmessages: + msg145427
2011-10-12 00:24:31pitrousetmessages: + msg145378
2011-10-12 00:11:44vstinnersetmessages: + msg145377
2011-10-11 23:50:38pitrousetfiles: + find_max_char4.patch

messages: + msg145376
2011-10-11 23:42:17vstinnersetnosy: + vstinner
messages: + msg145375
2011-10-11 23:32:19pitrousetfiles: + find_max_char3.patch

messages: + msg145374
2011-10-11 23:22:36pitrousetfiles: + find_max_char2.patch

messages: + msg145373
2011-10-11 23:14:18pitroucreate