Author martin.panter
Recipients martin.panter, xdegaye
Date 2017-01-16.02:07:46
Message-id <1484532468.66.0.943713194332.issue28997@psf.upfronthosting.co.za>
Content
So the problem seems to be that Python assumes Readline’s encoding is UTF-8, but Readline actually uses ASCII (depending on the locale variables). The code at the start of the test is supposed to catch the case where add_history() calls PyUnicode_EncodeLocale() and the encoding fails.
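
To illustrate the mismatch, here is a rough sketch that uses plain Python codecs as stand-ins for the two encodings. It does not call Readline or PyUnicode_EncodeLocale(); the "ascii" codec is only an assumed approximation of what a strict encoder does under the C locale.

```python
# The non-ASCII string that the test passes to add_history().
text = "\xEB\xEF"

# What an encoder would produce if the locale encoding were UTF-8.
utf8_bytes = text.encode("utf-8")

# Stand-in for a strict ASCII (C locale) encoder: it cannot represent
# these characters, so the encode step raises an error.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(utf8_bytes, ascii_ok)
```

The same string encodes cleanly under one assumption and fails under the other, which is the kind of disagreement described above.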

I don’t understand the details of UTF-8 vs locale on Android, but maybe we could adjust the encode() and decode() implementations in Modules/readline.c to account for the Readline library’s idea of the locale encoding. Or maybe we could adjust the temporary setlocale() calls in Modules/readline.c.

If you are happy to declare that the Readline library is broken on Android, I now think I would prefer to skip the test based on support.is_android, rather than use the previous patches. Otherwise, we risk masking genuine test failures on other platforms. Something like:

@unittest.skipIf(is_android,
                 "GNU Readline disagrees about the locale encoding on Android")
def test_nonascii(self):
    try:
        readline.add_history("\xEB\xEF")
    ...
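
For anyone unfamiliar with how unittest.skipIf behaves, a self-contained sketch follows. The is_android flag is hard-coded here as a hypothetical stand-in for support.is_android, and the class name and test body are placeholders, not the real test.

```python
import unittest

# Hypothetical stand-in for test.support.is_android, forced to True
# so that the skip path can be observed on any platform.
is_android = True


class ReadlineSketch(unittest.TestCase):
    @unittest.skipIf(is_android,
                     "GNU Readline disagrees about the locale encoding on Android")
    def test_nonascii(self):
        # Placeholder body; never reached while the skip condition holds.
        self.fail("should have been skipped")


# Run the test case and collect the outcome without exiting the interpreter.
result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(ReadlineSketch).run(result)
print(len(result.skipped), result.wasSuccessful())
```

The test is reported as skipped rather than failed, so genuine failures on other platforms remain visible.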

When you run “LANG= bash”, it is only Bash and Readline that get the C locale; the terminal is unchanged. I presume the terminal inputs é as two UTF-8 bytes, but Readline with the C locale is not aware of UTF-8, and assumes the two bytes are two separate characters.
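
A small sketch of that presumption, in plain Python and without involving Readline or a terminal: it shows the single character é becoming two bytes under UTF-8, and how a byte-at-a-time view (assumed here as a model of the C locale behaviour) sees two separate units.

```python
# One character as typed by the user.
typed = "é"

# A UTF-8 terminal sends it as two bytes.
raw = typed.encode("utf-8")

# A reader that is unaware of UTF-8 treats every byte as its own character.
units = [raw[i:i + 1] for i in range(len(raw))]

print(len(typed), raw, units)
```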