classification
Title: bug in idna-encoding-module
Type: Stage:
Components: Library (Lib) Versions: Python 2.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: loewis Nosy List: loewis, nnorwitz, rumpeltux
Priority: normal Keywords:

Created on 2004-03-03 18:13 by rumpeltux, last changed 2004-03-24 17:00 by loewis. This issue is now closed.

Messages (6)
msg20166 - (view) Author: Rumpeltux (rumpeltux) Date: 2004-03-03 18:13
In /usr/lib/python2.3/encodings/idna.py, line 175 reads:
labels = input.split('.')
which causes the interpreter to stop executing the
program, but by changing it to
labels = dots.split(input)
everything's fine ;)
msg20167 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2004-03-05 19:52
Martin, it looks like line 174: unicode(input, "ascii")
should be input = unicode(input, "ascii").
I'm not sure what's supposed to be happening here, but it
looks like the if/else code block may be able to be
rewritten as:

if not isinstance(input, unicode):
    input = unicode(input, "ascii")
labels = dots.split(input)
msg20168 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-03-22 22:26
I can't see any problem in the code. The invocation of unicode() is 
correct - we just look for the exception that call may raise.

Rumpeltux, can you please report the exact input and exception you get?
msg20169 - (view) Author: Rumpeltux (rumpeltux) Date: 2004-03-23 16:17
>>> unicode('xn--mller-kva.de', 'idna')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.3/encodings/idna.py", line 175, in decode
    labels = input.split(".")
AttributeError: 'buffer' object has no attribute 'split'
msg20170 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-03-24 16:40
This is now fixed in idna.py 1.4 and 1.2.12.2, by converting input to a 
string object. I leave this open to find out why there is a buffer object in 
the first place.
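
The fix described above amounts to normalizing the input before calling string methods on it. A minimal sketch of that idea in modern Python 3 terms follows; it is not the actual idna.py change. Python 3 has no `buffer` type, so `memoryview` stands in for it here as an assumption, and `split_labels` is a hypothetical helper name.

```python
def split_labels(data):
    """Split a dotted host name into labels, accepting any bytes-like input.

    Hypothetical helper: illustrates the "convert to a string object first"
    approach, not the real encodings/idna.py code.
    """
    # A memoryview (like the Python 2 buffer object) has no split() method,
    # so copy it into a real bytes object before splitting.
    if not isinstance(data, bytes):
        data = bytes(data)
    return data.split(b".")


if __name__ == "__main__":
    wrapped = memoryview(b"xn--mller-kva.de")
    print(hasattr(wrapped, "split"))
    print(split_labels(wrapped))
```

Calling `.split` directly on the wrapped object would fail with an AttributeError, much like the traceback in the report; converting first avoids that regardless of how the caller passed the data in.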
msg20171 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-03-24 17:00
Found it. PyUnicode_FromEncodedObject converts the string object to 
char*/len, then calls PyUnicode_Decode. This special-cases UTF-8, Latin-1
and ASCII, then creates a buffer object and passes it to 
PyCodec_Decode.

Even if it might be possible to pass the string directly to the codec, the 
codec still has to deal with buffer objects, for direct callers of 
PyUnicode_Decode. So I am leaving the fix as-is; I have added a test case 
(test_codecs.py 1.10) and am closing this as fixed.
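
A regression check for the reported input could look roughly like this. It is a sketch written for Python 3, where `str(data, "idna")` is the counterpart of the `unicode(..., 'idna')` call from the report. It is not the test that was added to test_codecs.py, and the function name `decode_hostname` is made up for illustration.

```python
def decode_hostname(raw):
    """Decode an ASCII-compatible (IDNA) host name into text.

    Hypothetical wrapper around the standard "idna" codec.
    """
    return str(raw, "idna")


if __name__ == "__main__":
    # The input from the original report; with the fix in place this
    # should decode instead of raising AttributeError.
    print(decode_hostname(b"xn--mller-kva.de"))
    print(decode_hostname(b"python.org"))
```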
History
Date                 User       Action  Args
2004-03-03 18:13:32  rumpeltux  create