Message 285533 - Python tracker



Author	martin.panter
Recipients	martin.panter, xdegaye
Date	2017-01-16.02:07:46
Message-id	<1484532468.66.0.943713194332.issue28997@psf.upfronthosting.co.za>

Content

So the problem seems to be that Python assumes Readline’s encoding is UTF-8, but Readline actually uses ASCII (depending on locale variables). The code at the start of the test is supposed to catch when add_history() calls PyUnicode_EncodeLocale() and fails.
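
As a rough illustration of that failure mode, the test string encodes fine as UTF-8 but not as ASCII. This is only a sketch: str.encode() stands in for PyUnicode_EncodeLocale(), and the helper name can_encode is made up for this example.

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

# The same non-ASCII string the test passes to add_history()
sample = "\xEB\xEF"

utf8_ok = can_encode(sample, "utf-8")   # what Python assumes
ascii_ok = can_encode(sample, "ascii")  # what Readline may actually use
print(utf8_ok, ascii_ok)
```

If the two results differ, Python and Readline would disagree about whether the history entry is representable at all.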

I don’t understand the details of UTF-8 vs locale on Android, but maybe we could adjust the encode() and decode() implementations in Modules/readline.c, to account for the Readline library’s idea of the locale encoding. Or maybe we could adjust the temporary setlocale() calls in Modules/readline.c.
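
To make the first idea a bit more concrete, here is a Python-level sketch of encode() and decode() helpers that take the target encoding as a parameter instead of assuming UTF-8. This is not the code in Modules/readline.c. The function names and the idea of passing in an encoding are assumptions for illustration, and the surrogateescape error handler is used so that undecodable bytes survive a round trip.

```python
import locale

def encode_for_readline(text, encoding=None):
    """Encode text for Readline (hypothetical helper)."""
    if encoding is None:
        # Assumption: the current locale encoding matches Readline's view.
        encoding = locale.getpreferredencoding(False)
    return text.encode(encoding, "surrogateescape")

def decode_from_readline(data, encoding=None):
    """Decode bytes coming back from Readline (hypothetical helper)."""
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    return data.decode(encoding, "surrogateescape")

# Bytes that are not valid ASCII still round-trip unchanged
raw = b"\xc3\xa9"
text = decode_from_readline(raw, "ascii")
assert encode_for_readline(text, "ascii") == raw
```

The open question from the paragraph above remains: which encoding to pass in on Android, where Python and Readline apparently disagree.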

If you are happy to declare that the Readline library is broken on Android, I now think I would prefer to skip the test based on support.is_android, rather than use the previous patches. Otherwise, we risk masking genuine test failures on other platforms. Something like:

@unittest.skipIf(is_android,
    "GNU Readline disagrees about the locale encoding on Android")
def test_nonascii(self):
    try:
        readline.add_history("\xEB\xEF")
    ...

When you run “LANG= bash”, only Bash and Readline get the C locale; the terminal is unchanged. I presume the terminal inputs é as two UTF-8 bytes, but Readline with the C locale is not aware of UTF-8, and assumes the two bytes are two separate characters.
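
The byte arithmetic behind that presumption can be checked in Python. This is a sketch of the effect only, not of what Readline does internally: Latin-1 is used here merely as a stand-in for "treat every byte as one character".

```python
typed = "\u00e9"            # é, as the terminal user sees it
raw = typed.encode("utf-8")  # what a UTF-8 terminal sends

# One character becomes two bytes on the wire
byte_count = len(raw)

# A reader that is not aware of UTF-8 treats each byte as its own character.
# Latin-1 maps every byte to exactly one code point, so it models that view.
misread = raw.decode("latin-1")
char_count = len(misread)

print(raw, byte_count, char_count)
```

So a single keypress ends up counted as two characters, which would explain cursor movement and deletion going wrong on that input.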

History
Date	User	Action	Args
2017-01-16 02:07:48	martin.panter	set	recipients: + martin.panter, xdegaye
2017-01-16 02:07:48	martin.panter	set	messageid: <1484532468.66.0.943713194332.issue28997@psf.upfronthosting.co.za>
2017-01-16 02:07:48	martin.panter	link	issue28997 messages
2017-01-16 02:07:46	martin.panter	create