
Use _PyUnicodeWriter API in text decoders #60515

Closed
vstinner opened this issue Oct 24, 2012 · 9 comments
Labels
performance Performance or resource usage

Comments

@vstinner
Member

BPO 16311
Nosy @loewis, @vstinner, @serhiy-storchaka
Files
  • codecs_writer.patch
  • codecs_writer_2.patch
  • decodebench.res: Benchmark results
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2012-11-06.23:41:02.393>
    created_at = <Date 2012-10-24.18:38:21.359>
    labels = ['performance']
    title = 'Use _PyUnicodeWriter API in text decoders'
    updated_at = <Date 2012-11-07.22:53:54.106>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2012-11-07.22:53:54.106>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2012-11-06.23:41:02.393>
    closer = 'python-dev'
    components = []
    creation = <Date 2012-10-24.18:38:21.359>
    creator = 'vstinner'
    dependencies = []
    files = ['27697', '27807', '27808']
    hgrepos = []
    issue_num = 16311
    keywords = ['patch']
    message_count = 9.0
    messages = ['173695', '173697', '174171', '174238', '174273', '174275', '174293', '175034', '175129']
    nosy_count = 4.0
    nosy_names = ['loewis', 'vstinner', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue16311'
    versions = ['Python 3.4']

    @vstinner
    Member Author

    The attached patch modifies the text decoders to use the _PyUnicodeWriter API to factorize the code. It removes the unicode_widen() and unicode_putchar() functions.

    • Don't overallocate by default (except for the "raw-unicode-escape"
      codec); enable overallocation on the first decode error (as is done
      currently)
    • _PyUnicodeWriter_Prepare() only overallocates by 25%, instead of 100%
      for unicode_decode_call_errorhandler()
    • Use _PyUnicodeWriter_Prepare() + PyUnicode_WRITE() (two macros)
      instead of unicode_putchar() (a function)
    • The _PyUnicodeWriter structure stores many useful fields, so we don't
      have to pass multiple parameters to functions, only the writer

    I wrote the patch to factorize the code, but it might also be faster.
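
    The growth policy described in the bullet points can be modeled in a few lines of Python. This is only a toy sketch for readers unfamiliar with the writer: the real code is C inside CPython's Objects/unicodeobject.c, and the class and attribute names below are illustrative, not the actual API.

    ```python
    class WriterModel:
        """Toy model of the _PyUnicodeWriter allocation policy (illustrative only)."""

        def __init__(self, min_length):
            # Decoders know an upper bound on the output length in advance,
            # so the buffer starts exact-sized: no overallocation by default.
            self.allocated = min_length
            self.length = 0
            self.overallocate = False  # flipped on the first decode error

        def prepare(self, extra):
            """Model _PyUnicodeWriter_Prepare(): ensure room for `extra` chars."""
            needed = self.length + extra
            if needed <= self.allocated:
                return self.allocated  # enough room, no reallocation
            if self.overallocate:
                # Grow by 25% (the patch lowers this from the previous 100%).
                needed += needed // 4
            self.allocated = needed
            return self.allocated

    w = WriterModel(10)
    w.length = 10
    w.overallocate = True   # simulate having hit the first decode error
    print(w.prepare(2))     # needs 12, overallocated by 25% -> 15
    ```

    On error-free input the model never reallocates, which is the point of the "don't overallocate by default" rule.
    
    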

    @vstinner vstinner added the performance Performance or resource usage label Oct 24, 2012
    @serhiy-storchaka
    Member

    Soon I'll post a patch which speeds up the unicode-escape and raw-unicode-escape decoders by 1.5-3x. There are also not-yet-reviewed patches for the UTF-32 (bpo-14625) and charmap (bpo-14850) decoders, so there will be merge conflicts.

    But I will review the patch.

    @vstinner
    Member Author

    "Soon I'll post a patch which speeds up the unicode-escape and raw-unicode-escape decoders by 1.5-3x. There are also not-yet-reviewed patches for the UTF-32 (bpo-14625) and charmap (bpo-14850) decoders, so there will be merge conflicts."

    codecs_writer.patch doesn't change the core of the decoders much; it mostly changes the code before and after the loop, and the error handling. You can still use PyUnicode_WRITE, PyUnicode_READ, memcpy(), etc.

    "But I will review the patch."

    If you review the patch, please check how the buffer is allocated. It should not be overallocated by default, only on the first error. Overallocation can kill performance when it is not necessary (especially on Windows).

    @serhiy-storchaka
    Member

    I will do some experiments and review tomorrow.

    @serhiy-storchaka
    Member

    I updated the patch to resolve the conflict with bpo-14625.

    @serhiy-storchaka
    Member

    With the patch, the UTF-8 decoder is 20% slower for some data. The UTF-16 decoder is 20% faster for some data and 20% slower for other data. The UTF-32 decoder is slower for much data (even after some optimization; the naive code was up to 50% slower). The standard charmap decoder is 10% slower. Only UTF-7, unicode-escape and raw-unicode-escape have become much faster (unicode-escape and raw-unicode-escape as much as with the bpo-16334 patch).

    Well-optimized decoders do not benefit from _PyUnicodeWriter; they only get a slight slowdown. The patch requires some optimization (as for the UTF-32 decoder) to reduce the negative effect. Non-optimized decoders receive the greatest benefit.

    @vstinner
    Member Author

    I ran the decodebench.py and bench-diff.py scripts from bpo-14624; I
    just replaced repeat=10 with repeat=100 to get more reliable numbers. I
    only see some performance regressions between -5% and -1%, but there
    are some speedups on UTF-8 and UTF-32 (between +11% and +14%). On a
    microbenchmark, numbers in the -10..+10% range just mean "no change".

    Using _PyUnicodeWriter should not change anything in the performance on
    valid data, only the performance of handling decoding errors: the
    overallocation factor is different, as are the code to widen the buffer
    and the code to write replacement characters.
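
    For readers without the bpo-14624 scripts at hand, a decodebench-style measurement can be approximated with the standard timeit module. This is a generic sketch, not the actual decodebench.py, and the reported unit is arbitrary (bigger is faster):

    ```python
    import timeit

    def bench_decode(encoding, text, repeat=10, number=100):
        """Best decode throughput over `repeat` runs of `number` decodes."""
        data = text.encode(encoding)
        timer = timeit.Timer(lambda: data.decode(encoding))
        best = min(timer.repeat(repeat=repeat, number=number))
        # Characters decoded per second of the best run (arbitrary unit).
        return len(text) * number / best

    for encoding in ("ascii", "latin-1", "utf-8", "utf-16-le", "utf-32-le"):
        print(f"{encoding:10} 'A'*10000  {bench_decode(encoding, 'A' * 10000):.3e}")
    ```

    Taking the minimum of several repeats is the usual way to suppress scheduler noise on microbenchmarks of this kind.
    
    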

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 6, 2012

    New changeset 7ed9993d53b4 by Victor Stinner in branch 'default':
    Close bpo-16311: Use the _PyUnicodeWriter API in text decoders
    http://hg.python.org/cpython/rev/7ed9993d53b4

    @python-dev python-dev mannequin closed this as completed Nov 6, 2012
    @vstinner
    Member Author

    vstinner commented Nov 7, 2012

    Oh, I forgot my benchmark results.

    decodebench.py results on Linux 32 bits:
    (Linux-3.2.0-32-generic-pae-i686-with-debian-wheezy-sid)

    $ ./python bench-diff.py original writer
    ascii     'A'*10000               4109 (-3%)    3974

    latin1    'A'*10000               3851 (-5%)    3644
    latin1    '\x80'*10000           14832 (-3%)   14430

    utf-8     'A'*10000               3747 (-4%)    3608
    utf-8     '\x80'*10000             976 (-2%)     961
    utf-8     '\u0100'*10000           974 (-2%)     959
    utf-8     '\u8000'*10000           804 (-14%)    694
    utf-8     '\U00010000'*10000       666 (-5%)     635

    utf-16le  'A'*10000               4154 (-1%)    4117
    utf-16le  '\x80'*10000            4055 (-2%)    3988
    utf-16le  '\u0100'*10000          4047 (-2%)    3974
    utf-16le  '\u8000'*10000           917 (-1%)     912
    utf-16le  '\U00010000'*10000       872 (-0%)     870

    utf-16be  'A'*10000               3218 (-1%)    3185
    utf-16be  '\x80'*10000            3163 (-2%)    3114
    utf-16be  '\u0100'*10000          2591 (-1%)    2556
    utf-16be  '\u8000'*10000           979 (-1%)     974
    utf-16be  '\U00010000'*10000       928 (-0%)     925

    utf-32le  'A'*10000               1681 (+12%)   1885
    utf-32le  '\x80'*10000            1697 (+10%)   1865
    utf-32le  '\u0100'*10000          2224 (+1%)    2254
    utf-32le  '\u8000'*10000          2224 (+2%)    2269
    utf-32le  '\U00010000'*10000      2234 (+1%)    2260

    utf-32be  'A'*10000               1685 (+11%)   1868
    utf-32be  '\x80'*10000            1684 (+10%)   1860
    utf-32be  '\u0100'*10000          2223 (+1%)    2253
    utf-32be  '\u8000'*10000          2222 (+1%)    2255
    utf-32be  '\U00010000'*10000      2243 (+1%)    2257

    decodebench.py results on Linux 64 bits:
    (Linux-3.4.9-2.fc16.x86_64-x86_64-with-fedora-16-Verne)

    ascii     'A'*10000              10043 (+1%)   10144

    latin1    'A'*10000               8351 (-1%)    8258
    latin1    '\x80'*10000           19184 (+2%)   19560

    utf-8     'A'*10000               8083 (+5%)    8461
    utf-8     '\x80'*10000             982 (+1%)     993
    utf-8     '\u0100'*10000           984 (+1%)     992
    utf-8     '\u8000'*10000           806 (+31%)   1053
    utf-8     '\U00010000'*10000       639 (+12%)    718

    utf-16le  'A'*10000               5547 (-2%)    5422
    utf-16le  '\x80'*10000            5205 (+1%)    5271
    utf-16le  '\u0100'*10000          4900 (-4%)    4695
    utf-16le  '\u8000'*10000          1062 (+9%)    1154
    utf-16le  '\U00010000'*10000      1040 (+4%)    1078

    utf-16be  'A'*10000               5416 (-5%)    5157
    utf-16be  '\x80'*10000            5077 (-1%)    5011
    utf-16be  '\u0100'*10000          4261 (-1%)    4218
    utf-16be  '\u8000'*10000          1146 (+0%)    1147
    utf-16be  '\U00010000'*10000      1125 (-1%)    1119

    utf-32le  'A'*10000               1743 (+8%)    1880
    utf-32le  '\x80'*10000            1751 (+5%)    1842
    utf-32le  '\u0100'*10000          2114 (+29%)   2721
    utf-32le  '\u8000'*10000          2120 (+28%)   2718
    utf-32le  '\U00010000'*10000      2065 (+30%)   2690

    utf-32be  'A'*10000               1761 (+6%)    1860
    utf-32be  '\x80'*10000            1749 (+6%)    1856
    utf-32be  '\u0100'*10000          2101 (+29%)   2715
    utf-32be  '\u8000'*10000          2083 (+30%)   2715
    utf-32be  '\U00010000'*10000      2058 (+31%)   2689

    Most significant changes:

    • -14% to decode '\u8000'*10000 from UTF-8 on Linux 32 bits
    • +31% to decode '\u8000'*10000 from UTF-8 on Linux 64 bits
    • +28% to +31% to decode UCS-2 and UCS-4 characters from UTF-32 on Linux 64 bits
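
    The (%) column in the tables above is the relative change from the original build to the writer build; a minimal sketch of that computation (bench-diff.py's exact rounding and formatting may differ):

    ```python
    def percent_change(original, writer):
        """Relative speed change between two benchmark results, in percent."""
        return round((writer - original) / original * 100)

    # Two rows from the tables above:
    print(percent_change(806, 1053))   # utf-8 '\u8000'*10000, 64 bits -> 31
    print(percent_change(4109, 3974))  # ascii 'A'*10000, 32 bits -> -3
    ```
    
    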

    @serhiy-storchaka: If you feel able to tune _PyUnicodeWriter to
    improve its performance, please open a new issue.

    I consider the performance changes acceptable and I don't plan to work
    on this topic.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022