classification
Title: punycode codec ignores the error handler argument
Type: Stage:
Components: Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: loewis, r.david.murray, vstinner
Priority: normal Keywords: patch

Created on 2011-06-04 23:16 by vstinner, last changed 2013-12-17 01:52 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
punycode_errors.patch vstinner, 2011-06-15 22:10 review
Messages (8)
msg137666 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-06-04 23:16
b'abc\xff'.decode('punycode', 'ignore') raises the same error than b'abc\xff'.decode('punycode', 'replace') or b'abc\xff'.decode('punycode'): it uses the strict error handler, and simply ignores the error handler.

punycodec, as idna, should raise an exception if the error handler is different than strict.
msg137667 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-06-04 23:17
base64_codec.py uses an assertion to check the error handler: it should use an if+raise (assertions are ignored in optimized mode, python -O).
msg138401 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-06-15 22:10
punycode_errors.patch:
 - raise an error if errors="replace": only accept strict and ignore
 - use errors to encode/decode to/from ASCII
 - add unit tests

I don't think that the change should be documented, because punycode has no section in Doc/library/codecs.rst, and I hope that nobody uses something different than strict :-)
msg138498 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-06-17 10:55
@Martin: Can you review my patch?
msg138771 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-06-21 07:55
What's the point of disallowing the replace error handler? That's a slightly incompatible change, isn't it?
msg138773 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-06-21 09:38
Oh, I forgot to give a little bit more details.

b'abc\xff-'.decode('punycode', 'ignore') and b'abc\xff-'.decode('punycode', 'replace') raise a UnicodeDecodeError: the error handler is just useless (ignored) here.

With my patch, b'abc\xff-'.decode('punycode', 'ignore') gives 'abc'. (If I change the code to accept replace) b'abc\xff-'.decode('punycode', 'replace') gives also 'abc', but 'replace' doesn't work correctly in the part after "-" contain illegal byte sequences.

For example, b'a\xff-\xffb\xffga\xff'.decode("punycode", "replace") gives 'a�', whereas I would expect 'a�é' or 'aé�'. b'a-bga\xff'.decode("punycode", "replace") gives 'aé' as b'a-bga'.decode("punycode", "replace"), whereas I would expect 'aé�' or something like that.

> What's the point of disallowing the replace error handler?

It's just that I'm unable to patch punycode decoder to support the replace handler. Do you want to "implement" it?

> That's a slightly incompatible change, isn't it?

I don't think so because I consider that the punycode decoder never supported error handlers (other than strict) in Python 3.

What do you think?
msg138801 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-06-21 20:15
> Oh, I forgot to give a little bit more details.
> 
> b'abc\xff-'.decode('punycode', 'ignore') and
> b'abc\xff-'.decode('punycode', 'replace') raise a UnicodeDecodeError:
> the error handler is just useless (ignored) here.

That's not my point:

b"foo".decode("punycode","replace")

currently succeeds, but raises an UnicodeError under the patch.
msg206392 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-12-17 01:52
I prefer to not touch punycode right now, it works, there is no need to modify it.
History
Date User Action Args
2013-12-17 01:52:17vstinnersetstatus: open -> closed
resolution: wont fix
messages: + msg206392
2011-06-21 20:15:35loewissetmessages: + msg138801
2011-06-21 09:38:42vstinnersetmessages: + msg138773
2011-06-21 07:55:13loewissetmessages: + msg138771
2011-06-17 10:55:15vstinnersetmessages: + msg138498
2011-06-15 22:10:23vstinnersetfiles: + punycode_errors.patch

nosy: + loewis
messages: + msg138401

keywords: + patch
2011-06-05 13:28:52r.david.murraysetnosy: + r.david.murray
2011-06-04 23:17:40vstinnersetmessages: + msg137667
2011-06-04 23:16:17vstinnercreate