Title: readline interferes with characters beginning with byte \xe9
Messages (6)
msg132192 - (view) Author: Thomas Kluyver (takluyver) * Date: 2011-03-26 00:28
To replicate, in Python 3.1 on Linux (utf-8 console):

>>> print(chr(0x9000))

Copy and paste this character into the prompt. It appears correctly (as a Chinese character). Then:

>>> import readline
>>> readline.parse_and_bind('"\M-i":"    "')

Now try to paste the character again: it appears as "    ��" (four spaces, two unknown character symbols), and if you press return, you get a SyntaxError.

This happens with all characters whose encoding begins with the byte \xe9: in UTF-8, that's U+9000 to U+9FFF. If the terminal encoding is changed to cp1252, I'm told that the same thing can be achieved with é, which is \xe9 there.
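
The byte-level claim above can be checked with a short sketch. The helper name `leading_byte` is made up for illustration; the rest uses only the standard `str.encode`.

```python
# Check which code points encode to UTF-8 with a leading byte of 0xE9.

def leading_byte(cp):
    """Return the first byte of the UTF-8 encoding of code point cp."""
    return chr(cp).encode("utf-8")[0]

# The character from the report, and the top of the affected range.
first = chr(0x9000).encode("utf-8")
last = chr(0x9FFF).encode("utf-8")

# Scan the three-byte UTF-8 range (U+0800..U+FFFF), skipping surrogates,
# which cannot be encoded on their own.
hits = [
    cp
    for cp in range(0x800, 0x10000)
    if not 0xD800 <= cp <= 0xDFFF and leading_byte(cp) == 0xE9
]

print(first, last)                     # b'\xe9\x80\x80' b'\xe9\xbf\xbf'
print(hex(min(hits)), hex(max(hits)))  # 0x9000 0x9fff
print("é".encode("cp1252"))            # b'\xe9'
```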
msg141435 - (view) Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2011-07-30 10:41
You're binding the M-i keyboard sequence. Could it be that the \xe9 byte is translated by the terminal to M-i, and that causes the interference? In this case, it's not really a bug.
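
A rough sketch of why \xe9 and M-i could collide, assuming "meta" means setting the eighth bit of the key's ASCII code. The functions `meta` and `simulate_paste` are hypothetical names for a toy model; this is not readline's actual code path, only an illustration of the hypothesis.

```python
META_BIT = 0x80  # assumption: "meta" sets the eighth bit of a 7-bit key


def meta(key):
    """Return the single byte produced by Meta+key under that assumption."""
    return bytes([ord(key) | META_BIT])


def simulate_paste(text, bindings):
    """Toy model of a byte-oriented line editor.

    Each input byte that has a binding is replaced by its macro text;
    the remaining bytes are decoded as UTF-8, with undecodable bytes
    shown as U+FFFD replacement characters.
    """
    out = []
    pending = bytearray()
    for byte in text.encode("utf-8"):
        if byte in bindings:
            out.append(pending.decode("utf-8", errors="replace"))
            pending.clear()
            out.append(bindings[byte])
        else:
            pending.append(byte)
    out.append(pending.decode("utf-8", errors="replace"))
    return "".join(out)


print(meta("i"))  # b'\xe9'

# Bind M-i to four spaces, as in the report, then "paste" U+9000.
bindings = {meta("i")[0]: "    "}
print(repr(simulate_paste("\u9000", bindings)))  # '    \ufffd\ufffd'
print(repr(simulate_paste("\u9000", {})))        # '\u9000'
```

In this model the leading \xe9 is consumed by the binding, and the two orphaned continuation bytes \x80\x80 are no longer valid UTF-8, which matches the reported four spaces followed by two unknown character symbols.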
msg175836 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-11-18 00:15
Original bug report:
msg175837 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-11-18 00:26
I confirm that the issue exists, but I don't think that it comes from Python. I bet that the readline library uses *byte* strings, not *character* strings, and so is unable to correctly handle multibyte characters like the Chinese character U+9000.
msg175854 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-18 10:06
Yes, this is a readline issue.

Add the line '"\M-i":"    "' to ~/.inputrc, run the 'rlwrap cat' command, paste this multibyte character, and you get the same result.

This is not a Python bug.
msg175954 - (view) Author: Thomas Kluyver (takluyver) * Date: 2012-11-19 10:45
OK, thanks, and sorry for the noise. I've closed this issue.

Looking at the readline manual, it looks like this is tied up with the options input-meta, output-meta and convert-meta. Fiddling around with .inputrc hasn't clarified exactly what they do, but it seems that the terminal can either handle Unicode, or shortcuts involving meta (alt), but not both.
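
For experimenting with those three options from Python rather than .inputrc, something like the sketch below could be used. The values shown are the commonly suggested combination for eight-bit input and are an assumption here; nothing in this thread establishes that they resolve the conflict with meta bindings. `SETTINGS`, `inputrc_lines` and `apply_settings` are illustrative names; `readline.parse_and_bind` is the same call used in the original report.

```python
# Assumed values: the usual "eight-bit clean" combination, not a confirmed fix.
SETTINGS = {
    "input-meta": "on",     # do not strip the eighth bit from input bytes
    "output-meta": "on",    # display eight-bit characters directly
    "convert-meta": "off",  # do not rewrite eighth-bit bytes as ESC + key
}


def inputrc_lines(settings):
    """Format settings as 'set NAME VALUE' lines, as used in .inputrc."""
    return ["set {} {}".format(name, value) for name, value in settings.items()]


def apply_settings(bind, settings=SETTINGS):
    """Pass each 'set' line to bind, e.g. readline.parse_and_bind."""
    for line in inputrc_lines(settings):
        bind(line)


# Usage in an interactive session where the readline module is available:
#     import readline
#     apply_settings(readline.parse_and_bind)
print("\n".join(inputrc_lines(SETTINGS)))
```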
