classification
Title: faster utf-8 decoding
Type: performance
Stage: resolved
Components: Interpreter Core
Versions: Python 3.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: gregory.p.smith, pitrou, python-dev, vstinner
Priority: normal
Keywords: patch

Created on 2011-11-16 22:49 by pitrou, last changed 2011-11-21 19:48 by pitrou. This issue is now closed.

Files
File name: utf8lib2.patch
Uploaded: pitrou, 2011-11-16 22:49
Messages (4)
msg147778 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2011-11-16 22:49
PEP 393 and the need for a two-pass decoding process have made utf-8 decoding much slower, especially with the current generic implementation. The attached patch makes utf-8 decoding more than twice as fast, which leaves us around 10-20% slower than 3.2 on non-trivial cases.
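For readers unfamiliar with the two-pass process mentioned above, here is a minimal Python sketch of the idea. It is not the code from utf8lib2.patch: the function names `_code_points` and `two_pass_decode` are hypothetical, and validation is deliberately simplified (overlong forms and surrogates are not rejected). The first pass finds the character count and the largest code point, which PEP 393 needs in order to pick a 1-, 2-, or 4-byte storage width; the second pass decodes into storage of that size.

```python
def _code_points(data: bytes):
    """Yield code points from UTF-8 bytes (simplified validation only)."""
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            cp, extra = b, 0            # 1-byte (ASCII) sequence
        elif 0xC2 <= b <= 0xDF:
            cp, extra = b & 0x1F, 1     # 2-byte sequence
        elif 0xE0 <= b <= 0xEF:
            cp, extra = b & 0x0F, 2     # 3-byte sequence
        elif 0xF0 <= b <= 0xF4:
            cp, extra = b & 0x07, 3     # 4-byte sequence
        else:
            raise ValueError(f"invalid start byte at offset {i}")
        if i + extra >= n:
            raise ValueError(f"truncated sequence at offset {i}")
        for k in range(1, extra + 1):
            c = data[i + k]
            if c & 0xC0 != 0x80:
                raise ValueError(f"invalid continuation byte at offset {i + k}")
            cp = (cp << 6) | (c & 0x3F)
        yield cp
        i += extra + 1


def two_pass_decode(data: bytes):
    """Return (bytes per character, decoded text) using two passes."""
    # Pass 1: count the characters and find the largest code point.
    length, max_cp = 0, 0
    for cp in _code_points(data):
        length += 1
        max_cp = max(max_cp, cp)

    # Pick the storage width the way PEP 393 does: 1, 2, or 4 bytes.
    if max_cp < 0x100:
        width = 1
    elif max_cp < 0x10000:
        width = 2
    else:
        width = 4

    # Pass 2: decode again, writing into preallocated storage.
    # A plain list stands in for the fixed-width buffer here.
    out = [0] * length
    for idx, cp in enumerate(_code_points(data)):
        out[idx] = cp
    return width, "".join(map(chr, out))
```

The input is walked twice, which is the extra cost the message refers to; an optimized decoder would try to make both walks as cheap as possible, for example by handling runs of ASCII bytes in bulk.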
msg147926 - Author: Gregory P. Smith (gregory.p.smith) (Python committer) Date: 2011-11-19 02:35
+1, nice! I left a couple of minor comments in the code review.
msg148076 - Author: Roundup Robot (python-dev) (Python triager) Date: 2011-11-21 19:44
New changeset 8e6c4acaf530 by Antoine Pitrou in branch 'default':
Issue #13417: speed up utf-8 decoding by around 2x for the non-fully-ASCII case.
http://hg.python.org/cpython/rev/8e6c4acaf530
msg148078 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2011-11-21 19:48
Thanks for the review; the patch is now committed (with the bogus comments removed).
History
Date User Action Args
2011-11-21 19:48:56 pitrou set status: open -> closed; resolution: fixed; messages: + msg148078; stage: patch review -> resolved
2011-11-21 19:44:19 python-dev set nosy: + python-dev; messages: + msg148076
2011-11-19 02:35:52 gregory.p.smith set nosy: + gregory.p.smith; messages: + msg147926
2011-11-16 22:49:41 pitrou create