
Author laukpe
Date 2007-08-12.22:54:08
Content
A membership test of the form "chr(x) in <string>" raises a TypeError if "x" is in the range 128-255 (i.e. non-ASCII) and the string is unicode. This happens even if the unicode string contains only ASCII data, as the example below demonstrates.

Python 2.5.1 (r251:54863, May  2 2007, 16:56:35) 
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> chr(127) in 'hello'
False
>>> chr(128) in 'hello'
False
>>> chr(127) in u'hi'
False
>>> chr(128) in u'hi'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'in <string>' requires string as left operand

This can cause nasty, hard-to-debug bugs in code that uses "in <string>" tests if, for example, user-provided data is converted to unicode internally. Most other string operations work fine between normal and unicode strings, and I'd say simply returning False in this situation would be OK too. Issuing a warning similar to the one below might also be a good idea.

>>> chr(128) == u''
__main__:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
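
Until the behaviour changes, calling code can get the "simply return False" behaviour with a small wrapper. This is only a minimal sketch: the helper name safe_contains and its encoding parameter are hypothetical, not part of any library, and ASCII is assumed as the implicit codec. It relies on the bytes name, so it assumes Python 2.6 or later (where bytes is an alias for str); it also runs on Python 3, where the same mismatch appears between bytes and str.

```python
def safe_contains(needle, haystack, encoding='ascii'):
    """Return True if needle occurs in haystack.

    Hypothetical helper: if needle is a byte string and haystack is not,
    decode needle first and treat an undecodable needle as "not found"
    instead of letting the membership test raise.
    """
    if isinstance(needle, bytes) and not isinstance(haystack, bytes):
        try:
            needle = needle.decode(encoding)
        except UnicodeDecodeError:
            # Bytes outside the codec cannot occur in the unicode text.
            return False
    return needle in haystack


print(safe_contains(b'\x80', u'hi'))  # False, no TypeError
print(safe_contains(b'h', u'hi'))     # True
```

Whether an undecodable operand should silently count as "not found" or produce a warning is the same design question raised above; the sketch picks the silent option.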

Finally, the error message is somewhat misleading since the left operand is definitely a string.

>>> type(chr(128))
<type 'str'>

A real-life example of code where this problem exists is telnetlib. I'll submit a separate bug about it, as that problem can obviously be fixed in the library itself.
History
Date User Action Args
2007-08-23 14:59:11 admin link issue1772788 messages
2007-08-23 14:59:11 admin create