Title: bug in idna-encoding-module
Type: Stage:
Components: Library (Lib) Versions: Python 2.3
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: loewis Nosy List: loewis, nnorwitz, rumpeltux
Priority: normal Keywords:

Created on 2004-03-03 18:13 by rumpeltux, last changed 2004-03-24 17:00 by loewis. This issue is now closed.

Messages (6)
msg20166 - (view) Author: Rumpeltux (rumpeltux) Date: 2004-03-03 18:13
in /usr/lib/python2.3/encodings/, line 175 it goes:
labels = input.split('.')
which causes the interpreter to stop executing the
program, but by changing it to
labels = dots.split(input)
everything's fine ;)
msg20167 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2004-03-05 19:52
Logged In: YES 

Martin, it looks like line 174: unicode(input, "ascii")
should be input = unicode(input, "ascii").
I'm not sure what's supposed to be happening here, but it
looks like the if/else code block may be able to be
rewritten as:

if not isinstance(input, unicode):
    input = unicode(input, "ascii")
labels = dots.split(input)
msg20168 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-03-22 22:26
I can't see any problem in the code. The invocation of unicode() is 
correct - we just look for the exception that call may raise.

Rumpeltux, can you please report the exact input and exception you get?
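
The idiom described above, calling a decoder only for the exception it may raise and discarding the result, can be sketched roughly as follows. This is a minimal illustration in Python 3 syntax, not the module's actual code: `split_labels` is a hypothetical helper, `bytes` stands in for the Python 2.3 byte string, and the `dots` pattern is an assumed stand-in for the module-level pattern of the same name.

```python
import re

# Assumption: a simplified stand-in for the module's `dots` pattern
# (ASCII full stop plus the ideographic/fullwidth/halfwidth variants).
dots = re.compile("[.\u3002\uff0e\uff61]")


def split_labels(data):
    """Hypothetical helper: split a domain name into its labels."""
    if isinstance(data, str):
        # Text input may use any of the dot variants.
        return dots.split(data)
    # Called only for its side effect: raises UnicodeDecodeError if
    # `data` contains non-ASCII bytes. The decoded result is discarded.
    data.decode("ascii")
    return data.split(b".")
```

Assigning the decoded value back to the variable, as suggested earlier in the thread, would also work, but it changes the type of the labels that the rest of the function sees.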
msg20169 - (view) Author: Rumpeltux (rumpeltux) Date: 2004-03-23 16:17
>>> unicode('', 'idna')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.3/encodings/", line 175, in
    labels = input.split(".")
AttributeError: 'buffer' object has no attribute 'split'
msg20170 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-03-24 16:40
This is now fixed in 1.4 and, by converting input to a 
string object. I leave this open to find out why there is a buffer object in 
the first place.
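
A rough sketch of that kind of fix, assuming Python 3 syntax: the input is normalised to a real byte string before any string methods are used on it. Here `memoryview` plays the role of the Python 2 `buffer` object, and `split_ascii_labels` is a hypothetical name, not a function from the module.

```python
def split_ascii_labels(data):
    """Hypothetical helper: accept any bytes-like object and split it."""
    # Normalise bytes, bytearray, memoryview, etc. to a real bytes object.
    # Buffer-like views do not offer string methods such as split().
    data = bytes(data)
    # Validation only: raises UnicodeDecodeError on non-ASCII input.
    data.decode("ascii")
    return data.split(b".")


# A buffer-like view has no split() of its own, but the helper copes:
view = memoryview(b"www.python.org")
labels = split_ascii_labels(view)
```

The conversion costs one copy of the input, which seems acceptable for domain names since they are short.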
msg20171 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-03-24 17:00
Found it. PyUnicode_FromEncodedObject converts the string object to 
char*/len, then calls PyUnicode_Decode. This special-cases UTF-8, Latin-1
and ASCII, then creates a buffer object and passes it to 

Even if it might be possible to pass the string directly to the codec, the 
codec still has to deal with buffer objects, for direct callers of 
PyUnicode_Decode. So I leave the fix as-is, added a test-case 
( 1.10), and close this as fixed.
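
A regression test along these lines might look like the sketch below. It is written for Python 3's `unittest`, so `str(b"...", "idna")` replaces the `unicode('...', 'idna')` call from the report. The class and method names are hypothetical and are not taken from the committed test.

```python
import unittest


class IDNABuiltinTest(unittest.TestCase):  # hypothetical test class
    def test_empty_input(self):
        # The input from the original report: decoding an empty string.
        self.assertEqual(str(b"", "idna"), "")

    def test_plain_ascii_name(self):
        # Decoding through the builtin constructor exercises the generic
        # decode path rather than calling the codec function directly.
        self.assertEqual(str(b"python.org", "idna"), "python.org")
```

Going through the builtin constructor matters here, because the failure in the report only appeared on that path, not when the codec was handed a plain string directly.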
Date                 User       Action  Args
2004-03-03 18:13:32  rumpeltux  create