Author vstinner
Recipients ben.knight, eryksun, serhiy.storchaka, vstinner
Date 2016-03-01.19:54:11
Message-id <1456862052.29.0.954936715313.issue26464@psf.upfronthosting.co.za>
In-reply-to
Content
Oh... I see. It's a bug introduced by unicode_fast_translate(), the optimization for ASCII strings that handles replacing one ASCII character with another ASCII character or deleting a character. See change cca6b056236a of issue #21118.

The code confuses the input position with the output position. "i = writer.pos;" is used in the caller to resume when unicode_fast_translate() is interrupted (because a translation uses a non-ASCII character or a string longer than 1 character), but writer.pos is the position in the *output* string, not in the *input* string :-/ As soon as the fast path has deleted a character, the two positions differ and the caller resumes at the wrong input index.
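To make the position mix-up concrete, here is a simplified pure-Python model of the two-phase translation. It is only a sketch and not the CPython C code: the names fast_translate and translate_model, the "buggy" flag, and the restriction of table values to strings or None are all assumptions made for illustration. The length of the output list plays the role of writer.pos.

```python
def fast_translate(text, table, out):
    """Model of the fast path: only 1:1 ASCII replacements and deletions.

    Appends translated characters to `out` and returns the *input* index
    at which it stopped (len(text) if it handled everything).
    """
    for i, ch in enumerate(text):
        repl = table.get(ord(ch), ch)
        if repl is None:
            continue              # deletion: input advances, output does not
        if len(repl) != 1 or ord(repl) > 127:
            return i              # needs the general path; stop here
        out.append(repl)
    return len(text)


def translate_model(text, table, buggy):
    """Run the fast path, then finish with a general (slow) loop."""
    out = []
    stopped_at = fast_translate(text, table, out)
    # Buggy variant: resume from the output length (analogue of writer.pos).
    # Fixed variant: resume from the input index where the fast path stopped.
    i = len(out) if buggy else stopped_at
    while i < len(text):
        repl = table.get(ord(text[i]), text[i])
        if repl is not None:
            out.append(repl)
        i += 1
    return "".join(out)


# 'a' is deleted (fast path), 'b' maps to a longer string (general path).
table = {ord("a"): None, ord("b"): "xyz"}
print(translate_model("acab", table, buggy=True))   # ccxyz
print(translate_model("acab", table, buggy=False))  # cxyz
```

In the buggy run the fast path stops at input index 3 with one character written, so the general loop resumes at input index 1 and emits "c" a second time.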

I see that I added unit tests for translate, but they lack a test that exercises the fast translation, starting with an ignored (deleted) character and then switching to regular translation.
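A test of the missing kind could look roughly like the following. This is a hypothetical sketch, not the test from the attached patch: the class name, method names, and input strings are made up for illustration. Each case deletes characters first, so the fast path advances the input without advancing the output, and then hits a mapping that forces the switch to regular translation.

```python
import unittest


class FastTranslateSwitchTest(unittest.TestCase):
    """Hypothetical regression tests for the fast-path/regular-path handoff."""

    def test_delete_then_longer_string(self):
        # 'a' is deleted, then 'b' maps to a string longer than 1 character.
        table = {ord("a"): None, ord("b"): "xyz"}
        self.assertEqual("acab".translate(table), "cxyz")

    def test_delete_then_non_ascii(self):
        # 'a' is deleted, then 'b' maps to a non-ASCII character.
        table = {ord("a"): None, ord("b"): "\xe9"}
        self.assertEqual("acab".translate(table), "c\xe9")
```

Saved in a module, these can be run with "python -m unittest"; on an interpreter that still has the bug, the first case would be expected to show a duplicated character in the result.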

The attached patch should fix the issue. It also adds unit tests.