classification
Title: readline-related test_builtin failure
Type: behavior Stage: patch review
Components: Tests Versions: Python 3.7, Python 3.6, Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: nadeem.vawda Nosy List: Arfrever, brett.cannon, doko, ezio.melotti, haypo, martin.panter, nadeem.vawda, petri.lehtinen, python-dev, serhiy.storchaka, xdegaye
Priority: normal Keywords: patch

Created on 2012-01-27 10:23 by nadeem.vawda, last changed 2017-10-05 14:35 by serhiy.storchaka.

Files
File name Uploaded Description Edit
rl-locale.diff nadeem.vawda, 2012-01-27 10:23 Fix readline to not discard chars that can't be decoded with sys.stdin.encoding review
rl-test.diff nadeem.vawda, 2012-01-27 10:34 Ensure that input() tty tests always use readline review
input-readline.patch martin.panter, 2016-01-19 01:09 review
input-readline.v2.patch martin.panter, 2016-12-06 03:30 stderr=DEVNULL review
input-readline.v3.patch martin.panter, 2017-01-16 06:13 review
input-readline.v4.patch xdegaye, 2017-01-19 16:54 review
Messages (26)
msg152080 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-01-27 10:23
I've recently come across a strange failure in the tests for the input()
built-in function:

    $ ./python -E -m test -v test_readline test_builtin

    [... snip ...]

    ======================================================================
    FAIL: test_input_tty_non_ascii (test.test_builtin.BuiltinTest)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/nadeem/src/cpython/def/Lib/test/test_builtin.py", line 1079, in test_input_tty_non_ascii
        self.check_input_tty("prompté", b"quux\xe9", "utf-8")
      File "/home/nadeem/src/cpython/def/Lib/test/test_builtin.py", line 1070, in check_input_tty
        self.assertEqual(input_result, expected)
    AssertionError: 'quux' != 'quux\udce9'
    - quux
    + quux\udce9
    ?     +


    ======================================================================
    FAIL: test_input_tty_non_ascii_unicode_errors (test.test_builtin.BuiltinTest)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/nadeem/src/cpython/def/Lib/test/test_builtin.py", line 1083, in test_input_tty_non_ascii_unicode_errors
        self.check_input_tty("prompté", b"quux\xe9", "ascii")
      File "/home/nadeem/src/cpython/def/Lib/test/test_builtin.py", line 1070, in check_input_tty
        self.assertEqual(input_result, expected)
    AssertionError: 'quux' != 'quux\udce9'
    - quux
    + quux\udce9
    ?     +

The failure only manifests itself if the readline module is loaded before
test_builtin runs (hence the presence of test_readline above). It will
not occur if regrtest is run with either of the -j or -W flags (which is
why it hasn't been seen on the buildbots).

The problem seems to be that readline assumes that its input should use
the locale encoding, and silently strips out any undecodable chars. This
breaks the tests mentioned above, since they set up sys.stdin to use the
surrogateescape error handler, expecting invalid characters to be escaped
rather than discarded.

This problem doesn't crop up if readline is *not* loaded, because in that
case PyOS_Readline() falls back to a stdio-based implementation
(PyOS_StdioReadline()) that preserves invalid characters, allowing them
to be handled properly by sys.stdin's encoding and error handler.

I have been able to fix the test failures with the attached patch, which
stops readline from eating invalid characters, making it consistent with
the stdio-based fallback. Can someone with more knowledge of readline
and/or locale issues advise whether the change is a good idea?
msg152082 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-01-27 10:34
Here's another patch that ensures the test always exercises the GNU
readline code path (rather than the stdio fallback). This will cause the
failure to occur when running just test_builtin (no need to also run
test_readline before it).

Ideally we'd want to test both code paths, but I'm not sure how to
accomplish that reliably, short of running the test in a subprocess.
msg164253 - (view) Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2012-06-28 13:03
I had exactly the same error on 3.3b1 when running the test suite with randomized order.
msg180732 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-01-27 01:42
This problem still exists on 3.4.
The attached patch seems to solve it.
msg180766 - (view) Author: Roundup Robot (python-dev) Date: 2013-01-27 13:05
New changeset cf0f450b3299 by Nadeem Vawda in branch '3.2':
Issue #13886: Fix input() to not strip out supposedly-invalid input bytes.
http://hg.python.org/cpython/rev/cf0f450b3299
msg180767 - (view) Author: Roundup Robot (python-dev) Date: 2013-01-27 13:13
New changeset 5c8732049dd5 by Nadeem Vawda in branch '3.3':
Issue #13886: Fix input() to not strip out supposedly-invalid input bytes.
http://hg.python.org/cpython/rev/5c8732049dd5
msg180769 - (view) Author: Roundup Robot (python-dev) Date: 2013-01-27 13:17
New changeset 9774721bfc32 by Nadeem Vawda in branch 'default':
Issue #13886: Fix input() to not strip out supposedly-invalid input bytes.
http://hg.python.org/cpython/rev/9774721bfc32
msg180770 - (view) Author: Roundup Robot (python-dev) Date: 2013-01-27 13:21
New changeset 12223782031f by Nadeem Vawda in branch '2.7':
Issue #13886: Fix input() to not strip out supposedly-invalid input bytes.
http://hg.python.org/cpython/rev/12223782031f
msg180837 - (view) Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2013-01-28 07:45
-    setlocale(LC_CTYPE, "");
+    setlocale(LC_CTYPE, "C");

This looks dangerous to me. Are you sure readline's behavior doesn't change because of this?
msg181200 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2013-02-02 18:58
You're right; it breaks backspacing over multibyte characters. I should
have tested it more carefully before committing. I'll revert the changes.
msg181201 - (view) Author: Roundup Robot (python-dev) Date: 2013-02-02 19:05
New changeset e6952acd5a55 by Nadeem Vawda in branch '3.2':
Back out fix for issue #13886; it introduced a new bug in interactive readline use.
http://hg.python.org/cpython/rev/e6952acd5a55
msg181205 - (view) Author: Roundup Robot (python-dev) Date: 2013-02-02 19:25
New changeset 5c7e884b205a by Nadeem Vawda in branch '3.3':
Back out fix for issue #13886; it introduced a new bug in interactive readline use.
http://hg.python.org/cpython/rev/5c7e884b205a
msg181207 - (view) Author: Roundup Robot (python-dev) Date: 2013-02-02 19:29
New changeset 8b8c6abda7e8 by Nadeem Vawda in branch 'default':
Back out fix for issue #13886; it introduced a new bug in interactive readline use.
http://hg.python.org/cpython/rev/8b8c6abda7e8
msg181208 - (view) Author: Roundup Robot (python-dev) Date: 2013-02-02 19:53
New changeset 5bf91dfb1e34 by Nadeem Vawda in branch '2.7':
Back out fix for issue #13886; it introduced a new bug in interactive readline use.
http://hg.python.org/cpython/rev/5bf91dfb1e34
msg191272 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-06-16 15:18
Issue #18230 is another test_builtin failure related to tty tests.
msg211077 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-02-12 10:30
This happens even when run test_builtin twice.

    ./python -E -m test -v test_builtin test_builtin

or

    ./python -E -m test -Fv test_builtin
msg258517 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-01-18 11:33
I don’t think this affects Python 2. The failing tests were added in revision 421c8e291221, Issue 13342, for 3.2+ only. They invole input() doing text decoding. AFAIK Python 2’s equivalent, raw_input(), does not do text decoding.

I suspect we can’t really change how Readline handles text encoding errors, which seems to be what Nadeem was trying to do. I suggest to just fix the tests without changing Readline.

As far as I know there is no way to un-register the Readline module once it has been loaded. A quick and dirty workaround might be to skip the test(s) if the Readline has been loaded ("readline" in sys.modules). But a better fix would probably be to run the test in a subprocess, where we can start a new interpreter from scratch and control whether Readline is loaded.

Looking closer at the tests, they mention invoking Gnu Readline. But the associated bug fix is in the wrapper code around PyOS_Readline(), which may call Gnu Readline if it is loaded, or may call a simpler internal routine otherwise. So ideally the tests should be repeated with Readline unloaded and loaded.

Also, the comment for test_input_tty_non_ascii() implies it is testing UTF-8 encoding. But the data is not valid UTF-8 so it ends up testing the error handling. I think that test should use valid UTF-8 input.
msg258564 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-01-19 00:50
Here is a possible patch which:

* Runs the three test_input_tty() tests in a fresh child Python interpreter, for control over whether Readline is loaded. A pseudoterminal is used for stdin and stdout, and stderr is left untouched to easily see error messages.
* Repeats each test with and without Readline enabled
* Only tests the error handling without Readline
* Fixes the non-error test to use proper UTF-8 input

With my patch applied, there are a couple of prompts mixed up in the test log output via stderr, due to Issue 1927. It is not a major problem, but perhaps we should work on fixing that bug first.
msg282399 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-12-05 08:31
Since people keep coming upon this bug, perhaps we should inhibit push my fix without fixing that other prompt bug (now a feature change I think). Probably have to capture stderr to avoid it coming out in the test output.
msg282496 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-12-06 03:30
input-readline.v2.patch sets stderr=DEVNULL so that the prompt does not come out in the test log.

A disadvantage of this is that if there is a failure, any error messages, stack trace, etc is also lost. To fix this properly, we would probably have to capture stderr using a generalized communicate() loop, like I have done at the bottom of <https://hg.python.org/cpython/annotate/v3.6.0b4/Lib/test/test_readline.py#l231> (see also Issue 1260171).
msg283932 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-12-24 09:48
Hi Xavier, I was about to push input-readline.v2.patch, but I thought it might be good to check with you first if this causes problems with Android, based on my experience with Issue 28997. With the patch applied, test.test_builtin.PtyTests.test_input_tty_non_ascii() will call the input() function twice, with and without Readline enabled. It inputs non-ASCII data in into a pseudoterminal, and expects to see it in the function return value. Perhaps we may need to skip the Readline half of the test on Android.
msg283961 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-24 16:59
Thanks for waiting for a run of the patch on Android.
Indeed, test_input_tty_non_ascii fails with this patch on Android when LANG is not set and is ok when LANG=en_US.UTF-8. As you are suggesting, the test is ok when skipping the Readline half of the test. This may be done by adding 'is_android' in the 'from test.support import ...' statement and updating the test with:

    if readline and not is_android:
        tests.append(True)
msg285539 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-01-16 06:13
V3 of my patch skips the Readline tests in cases involving non-ASCII bytes when the locale seen by Readline would be ASCII. Readline may translate the non-ASCII bytes to escape sequences, and treat them as special Meta (Alt) key combinations. This behaviour depends on Readline configuration (“set convert-meta off” in /etc/inputrc in my case).

It also includes a potential workaround for Android, depending on the resolution of Issue 28997.
msg285810 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2017-01-19 16:54
With input-readline.v3.patch, test_builtin runs with success on Android api 21.

With pep538_coerce_legacy_c_locale_v4.diff that implements PEP 538 in issue 28180, and with input-readline.v3.patch modified to have 'readline_encoding = locale.getpreferredencoding()' even when 'is_android' is True, test_builtin runs with success. This means that no specific Android handling would be needed if PEP 538 were to be adopted.

The new input-readline.v4.patch is a slight improvement over the previous patch and sets readline_encoding to 'UTF-8' on Android when test_builtin is run with the environment variable LANG set to 'en_US.UTF-8' and in that case the test exercises all the code paths including those with the readline module.  This is because locale.getdefaultlocale() returns ('en_US', 'UTF-8') on Android in that case and because both getdefaultlocale() and readline scan the environment to check for the locale.  This new patch is only useful if tests on Android are expected to be run also with one of the locale environment variables set to UTF-8 (and PEP 538 is rejected).

I have left some comments on Rietveld.
msg303770 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-10-05 14:10
bpo-31703 has been marked as a duplicate of this bug.
msg303772 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-05 14:35
As well as other 5 bugs.
History
Date User Action Args
2017-10-05 14:35:14serhiy.storchakasetmessages: + msg303772
2017-10-05 14:10:39hayposetmessages: + msg303770
2017-10-05 13:50:59serhiy.storchakalinkissue31703 superseder
2017-01-19 16:54:54xdegayesetfiles: + input-readline.v4.patch

messages: + msg285810
2017-01-16 06:13:25martin.pantersetfiles: + input-readline.v3.patch

messages: + msg285539
2016-12-24 16:59:08xdegayesetmessages: + msg283961
2016-12-24 09:48:35martin.pantersetnosy: + xdegaye
messages: + msg283932
2016-12-06 03:31:00martin.pantersetfiles: + input-readline.v2.patch

dependencies: - Change input() to always prompt to stderr
messages: + msg282496
2016-12-05 08:31:43martin.pantersetmessages: + msg282399
versions: + Python 3.7
2016-12-05 07:35:23serhiy.storchakalinkissue28872 superseder
2016-06-15 09:39:00berker.peksaglinkissue27325 superseder
2016-01-19 01:09:03martin.pantersetfiles: + input-readline.patch
2016-01-19 00:50:27martin.pantersetdependencies: + Change input() to always prompt to stderr
messages: + msg258564
stage: needs patch -> patch review
2016-01-18 11:33:44martin.pantersetversions: + Python 3.5, Python 3.6, - Python 2.7, Python 3.2, Python 3.3, Python 3.4
nosy: + martin.panter

messages: + msg258517

components: + Tests, - Extension Modules
2014-07-05 22:57:18ned.deilysetnosy: + doko
2014-07-05 22:56:55ned.deilylinkissue17755 superseder
2014-02-12 10:30:57serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg211077
2013-09-16 16:10:30serhiy.storchakalinkissue17734 superseder
2013-09-16 16:08:07serhiy.storchakalinkissue18230 superseder
2013-06-16 15:18:16brett.cannonsetnosy: + brett.cannon
messages: + msg191272
2013-02-02 19:53:16python-devsetmessages: + msg181208
2013-02-02 19:29:47python-devsetmessages: + msg181207
2013-02-02 19:25:27python-devsetmessages: + msg181205
2013-02-02 19:05:19python-devsetmessages: + msg181201
2013-02-02 18:58:07nadeem.vawdasetstatus: closed -> open
resolution: fixed ->
messages: + msg181200

stage: resolved -> needs patch
2013-01-28 07:45:38petri.lehtinensetmessages: + msg180837
2013-01-27 13:24:30nadeem.vawdasetstatus: open -> closed
assignee: nadeem.vawda
stage: patch review -> resolved
resolution: fixed
versions: + Python 2.7
2013-01-27 13:21:36python-devsetmessages: + msg180770
2013-01-27 13:17:30python-devsetmessages: + msg180769
2013-01-27 13:13:35python-devsetmessages: + msg180767
2013-01-27 13:05:00python-devsetnosy: + python-dev
messages: + msg180766
2013-01-27 01:42:19ezio.melottisetnosy: + ezio.melotti

messages: + msg180732
versions: + Python 3.4
2012-09-28 19:45:44petri.lehtinensetnosy: + haypo
2012-09-16 08:29:46Arfreversetnosy: + Arfrever
2012-06-28 13:03:56petri.lehtinensetnosy: + petri.lehtinen
messages: + msg164253
2012-01-27 10:34:45nadeem.vawdasetfiles: + rl-test.diff

messages: + msg152082
2012-01-27 10:23:39nadeem.vawdacreate