classification
Title: Unexpected call to readline's add_history in call_readline
Type: behavior Stage: needs patch
Components: Extension Modules Versions: Python 3.6, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: martin.panter, meador.inge, python-dev, twouters, tylercrompton
Priority: normal Keywords: patch

Created on 2016-04-27 07:52 by tylercrompton, last changed 2016-05-15 16:01 by martin.panter.

Files
File name Uploaded Description
readline_history.py tylercrompton, 2016-04-27 07:52 script that demonstrates the described behavior
set_auto_history.patch tylercrompton, 2016-05-06 09:41 Patch to implement new function in readline module
test_set_auto_history.py tylercrompton, 2016-05-06 09:45 Test script for new function in readline module
set_auto_history.v2.patch martin.panter, 2016-05-09 05:29
Messages (18)
msg264360 - (view) Author: Tyler Crompton (tylercrompton) * Date: 2016-04-27 07:52
I was implementing a REPL using the readline module and noticed that there are extraneous calls to readline's add_history function in call_readline[1]. This was a problem because there were some lines that, depending on their content, I might not want in the history. Figuring out why I was getting two entries for every 

The function call has been around ever since Python started supporting GNU Readline (first appeared in Python 1.4 or so, I believe)[2]. This behavior doesn't seem to be documented anywhere.

I can't seem to find any code that depends on a line that is read in by call_readline to be added to the history. I guess the user might rely on the interactive interpreter to use the history feature. Beyond that, I can't think of any critical purpose for it.

There are four potential workarounds:

1. Don't use the input function. Unfortunately, this is a non-solution as it prevents one from using Readline/libedit for input operations.
2. Don't use Readline/libedit. For the same reasons, this isn't a good solution.
3. Evaluate get_current_history_length() and store its result. Evaluate input(). Evaluate get_current_history_length() again. If the length changed, execute readline.remove_history_item(readline.get_current_history_length() - 1). Note that one can't assume that the length will change after a call to input, because blank lines aren't added to the history. This isn't an ideal solution for obvious reasons. It's a bit convoluted.
4. Use some clever combination of readline.get_line_buffer, tty.setcbreak, termios.tcgetattr, termios.tcsetattr, msvcrt.getwche, and try-except-finally blocks. Besides the obvious complexities in this solution, this isn't particularly platform-independent.
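
For reference, workaround 3 can be sketched roughly as follows. This is only an illustration of the steps described above, not a recommendation. The names input_without_history, read and history are hypothetical; read and history are parameters purely so the logic can be exercised without a terminal.

```python
def input_without_history(prompt="", read=input, history=None):
    """Read a line, then undo the history entry that call_readline added.

    `read` and `history` are hypothetical hooks for illustration; by default
    the built-in input() and the readline module are used.
    """
    if history is None:
        import readline as history  # assumes Readline/libedit is available

    before = history.get_current_history_length()
    line = read(prompt)
    after = history.get_current_history_length()

    # Blank lines are not added to the history, so the length may not change.
    if after > before:
        # remove_history_item() takes a zero-based position.
        history.remove_history_item(after - 1)
    return line
```

The length comparison is what makes this convoluted: without it, a blank line would cause an older, unrelated entry to be removed.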

I think that it's fair to say that none of the above options are desirable. So let's discuss potential solutions.

1. Remove this feature from call_readline. Not only will this cause a regression in the interactive interpreter, many people rely on this behavior when using the readline module.
2. Dynamically swap histories (or readline configurations in general) between readline-capable calls to input and prompts in the interactive interpreter. This would surely be too fragile and add unnecessary overhead.
3. Document this behavior and leave the code alone. I wouldn't say that this is a solution, but it would at least help other developers that would fall in the same trap that I did.
4. Add a keyword argument to input to instruct call_readline to not add the line to the history. Personally, this seems a bit dirty.
5. Add a readline function in the readline module that doesn't rely on call_readline. Admittedly, the implementation would have to look eerily similar to call_readline, so perhaps there could be a flag on call_readline. However, that would require touching a few files that don't seem to be particularly related. But a new function might be confusing since call_readline sounds like a name that you'd give such a function.

I think that the last option would be a pretty clean change that would cause the least number of issues (if any) for existing code bases. Regardless of the implementation details, I think that this would be the best route—to add a Python function called readline to the readline module. I would imagine that this would be an easy change/addition.

I'm attaching a sample script that demonstrates the described issue.

[1]: https://github.com/python/cpython/blob/fa3fc6d78ee0ce899c9c828e796b66114400fbf0/Modules/readline.c#L1283
[2]: https://github.com/python/cpython/commit/e59c3ba80888034ef0e4341567702cd91e7ef70d
msg264370 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-27 10:02
Solution 3 (documentation) can easily be done for existing versions of Python. Any other fix I think would have to be done in the next Python version (3.6).

By solution 2 (dynamically swap histories), do you mean add an API to save and load the entire history list somewhere, like read/write_history_file(), but maybe to a Python list instead?

A variation on solution 4 came to my mind: Add a global variable or function to the readline module to enable or disable automatic history addition. Or maybe a callback that is run after a line is entered, with it set to the current implementation by default.

If I understand solution 5 (new version of call_readline), that also seems reasonable. I guess you mean to add a new function to replace the Python call to input()? A minor downside would be that the caller has to do extra work if it wants to use input() when the Readline library is unavailable.
msg264378 - (view) Author: Tyler Crompton (tylercrompton) * Date: 2016-04-27 13:34
I agree about the documentation. It would only take a few minutes to do.

In regard to your inquiry about the second solution, more or less, yes. I left that one a bit ambiguous since there are many ways to skin that cat. But in retrospect, I probably shouldn't have included that potential solution since it'd be a bit goofy and wouldn't necessarily be much different (in terms of code conciseness) than the third workaround that I mentioned.

As for your suggestion about the fourth solution, I like that idea. I feel that it's actually more similar to the fifth solution than the fourth, but I feel that we're getting close to coming up with something that should be easy and effective.

Lastly, in regard to the fifth solution, yes. But I see what you're saying about the downside. Yeah, that would be rather annoying. For a moment, I thought that it could fall back to standard IO in the absence of Readline/libedit, but that would be a bit misleading for a function in the readline module. Besides, input already does that anyway.

I would imagine that in the vast majority of cases of using such a new function, the developer would fall back to input, because they'd likely prioritize getting the content from the user over ensuring Readline/libedit functionality. Since we already have a function that automatically does that, I think your suggestion to add a function to the readline module to enable/disable automatic history addition would be ideal. I'd be reluctant to use a global Python variable since it would be inconsistent with the rest of the members (i.e., functions) of the module.

I like the idea of adding a set_auto_history(flag=True|False) or something to that effect.
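
To illustrate how a REPL might use such a switch, here is a minimal sketch. It assumes the function is exposed as readline.set_auto_history(enabled), which is the name used in the changeset later recorded on this issue. The helper read_line and its keep and read parameters are hypothetical names introduced only for this example.

```python
import readline


def read_line(prompt="", keep=lambda line: True, read=input):
    """Read one line and add it to the history only if keep(line) is true.

    `keep` and `read` are hypothetical parameters for illustration.
    """
    readline.set_auto_history(False)  # stop input() from adding the line itself
    try:
        line = read(prompt)
    finally:
        readline.set_auto_history(True)  # restore the default behaviour
    if line and keep(line):
        readline.add_history(line)
    return line
```

A caller could then write something like read_line(">>> ", keep=lambda line: not line.startswith(" ")) to keep lines with a leading space out of the history.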
msg264955 - (view) Author: Tyler Crompton (tylercrompton) * Date: 2016-05-06 09:45
I couldn't think of a way to test input() with the unittest module without bypassing readline or requesting the input from the user. With that said, this informal script verifies proper behavior.
msg265119 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-08 08:56
I left a few minor comments in the code review.

I agree automated testing would be awkward for Readline. It should be possible using a pseudoterminal (pty). In fact there is already very basic testing that does this in /Lib/test/test_builtin.py, class PtyTests. It only tests the input() prompt.

I could have a go at writing a test. I guess pseudocode for a test would look a bit like:

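import os
import pty
import subprocess
import sys

def run_pty(script, input=b"dummy input\r"):
    [master, slave] = pty.openpty()
    args = (sys.executable, "-c", script)
    with subprocess.Popen(args, stdin=slave, stdout=slave, stderr=slave) as proc:
        os.close(slave)  # the child keeps its own copy
        # A real test should read and write concurrently like proc.communicate()
        os.write(master, input)
        output = bytearray()
        while True:
            try:
                chunk = os.read(master, 0x10000)
            except OSError:  # Linux raises EIO once the slave is closed
                break
            if not chunk:
                break
            output.extend(chunk)
    os.close(master)
    return bytes(output)

template = """\
import readline
readline.set_auto_history({})
input()
print("History length:", readline.get_current_history_length())
"""

# The following would be methods of a unittest.TestCase subclass.
# The terminal translates "\n" in the child's output to "\r\n".
def test_auto_history_enabled(self):
    output = run_pty(template.format(True))
    self.assertIn(b"History length: 1\r\n", output)

def test_auto_history_disabled(self):
    output = run_pty(template.format(False))
    self.assertIn(b"History length: 0\r\n", output)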
msg265184 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-09 05:29
Here is a patch including a unit test. I have only tested it on Linux. It would be awesome if other people could test it on other Unix platforms, in case the pseudoterminal stuff needs fixing.
msg265192 - (view) Author: Meador Inge (meador.inge) * (Python committer) Date: 2016-05-09 13:49
Works fine on Darwin (OS X Yosemite 10.10.5):

drago:cpython meadori$ uname -a
Darwin drago 14.5.0 Darwin Kernel Version 14.5.0: Mon Jan 11 18:48:35 PST 2016; root:xnu-2782.50.2~1/RELEASE_X86_64 x86_64
msg265497 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-13 23:40
Thank you for testing this Meador, it increases my confidence in my code. I’ll try to commit this soon if there are no objections.
msg265567 - (view) Author: Roundup Robot (python-dev) Date: 2016-05-15 02:07
New changeset 4195fa81b188 by Martin Panter in branch 'default':
Issue #26870: Add readline.set_auto_history(), originally by Tyler Crompton
https://hg.python.org/cpython/rev/4195fa81b188
msg265570 - (view) Author: Roundup Robot (python-dev) Date: 2016-05-15 03:07
New changeset 27a49daf7925 by Martin Panter in branch 'default':
Issue #26870: Close pty master in case of exception
https://hg.python.org/cpython/rev/27a49daf7925
msg265571 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-15 03:12
My test locked up on an OS X buildbot <http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/4623/steps/test/logs/stdio>:

Timeout (0:15:00)!
Thread 0x00007fff71296cc0 (most recent call first):
  File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/subprocess.py", line 1612 in _try_wait
  File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/subprocess.py", line 1662 in wait
  File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/subprocess.py", line 1003 in __exit__
  File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/test_readline.py", line 146 in run_pty
  File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/test_readline.py", line 115 in test_auto_history_disabled

The parent was hanging as it exits the child proc context manager, waiting for the child to exit. Perhaps some exception happened in the parent before it could write to the child, so the child is still waiting on input. So I added code to close the master in case of exception, which may at least help understand the situation.
msg265572 - (view) Author: Tyler Crompton (tylercrompton) * Date: 2016-05-15 03:36
I suppose the only thing left is adding remarks to the documentation for previous versions. If I understand correctly, this would only be added to the documentation for Python 2.7 and 3.5. Is this correct?

Since this is the first issue in which I've submitted a patch, I'm still quite new to the CPython development workflow; is there a special way to indicate that a patch should be applied to a branch other than default? Or is that done simply by informally indicating so in a message?
msg265574 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-15 03:57
Yes 3.5 and 2.7 are open for documentation fixes. As long as there are no major differences, it is usually easier to make one patch against 3.5 or 3.6, and then I can fix any minor differences when applying it to 2.7.
msg265575 - (view) Author: Roundup Robot (python-dev) Date: 2016-05-15 04:07
New changeset b8b2c5cc7e9d by Martin Panter in branch 'default':
Issue #26870: Temporary debugging for OS X Snow Leopard lockup
https://hg.python.org/cpython/rev/b8b2c5cc7e9d
msg265604 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-15 09:28
Unfortunately, I think my debugging messages needed a flush() call to be effective when the deadlock happens.

So the test is failing on three buildbots:

* AMD64 Snow Leop 3.x: stuck at Lib/subprocess.py:1612, os.waitpid() waiting for child to exit
* x86 OpenBSD 3.x: stuck at Lib/selectors.py:569, sel.select() -> self._kqueue.control() waiting for pty event
* x86 Tiger 3.x: sel.register() -> self._kqueue.control() raises ENOTSUP

On the other hand, the test is passing on FreeBSD, OpenIndiana, AIX, and Linux (+ Meador’s OS X Yosemite). I think I might just skip the test on the problematic platforms, unless anyone has any suggestions for these other platforms:

@unittest.skipIf(sys.platform.startswith(("darwin", "openbsd")), "pty test hangs or fails on this platform")
msg265614 - (view) Author: Roundup Robot (python-dev) Date: 2016-05-15 13:23
New changeset 816e1fe72c1e by Martin Panter in branch 'default':
Issue #26870: Avoid using kqueue() with pseudo-terminals
https://hg.python.org/cpython/rev/816e1fe72c1e
msg265618 - (view) Author: Roundup Robot (python-dev) Date: 2016-05-15 15:05
New changeset daaead1dc3e0 by Martin Panter in branch 'default':
Issue #26870: Poll() also fails on OS X; try select()
https://hg.python.org/cpython/rev/daaead1dc3e0
msg265626 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-15 16:01
Looks like I have ironed all the bugs out. These OS bugs/quirks have already been documented in the Python bug tracker:

1. kqueue() incompatible with PTYs on OS X < 10.9 and OpenBSD (Issue 20365, Issue 20667 respectively)
2. poll() also incompatible with PTYs on OS X (10.4, 10.6), probably the same as Issue 20472
3. kill() raises ESRCH if the pid is a zombie (Issue 16762)
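
As a rough illustration of working around the first two quirks, a pty helper could look something like the sketch below. It is not the committed test code: it assumes a POSIX system, and run_pty and its data parameter are named here only for the example. It uses selectors.SelectSelector explicitly, so that neither kqueue() nor poll() is chosen, and closes the master in a finally block so that the child is not left waiting for input if the parent fails.

```python
import os
import pty
import selectors
import subprocess
import sys
from errno import EIO


def run_pty(script, data=b"dummy input\r"):
    """Run `script` in a child Python attached to a pty and return its output."""
    master, slave = pty.openpty()
    args = (sys.executable, "-c", script)
    proc = subprocess.Popen(args, stdin=slave, stdout=slave, stderr=slave)
    os.close(slave)  # the child holds its own copy of the slave end
    output = bytearray()
    try:
        # select() is requested explicitly; DefaultSelector might pick
        # kqueue() or poll(), which are reported above as failing on ptys.
        with selectors.SelectSelector() as sel:
            events = selectors.EVENT_READ
            if data:
                events |= selectors.EVENT_WRITE
            sel.register(master, events)
            os.set_blocking(master, False)
            done = False
            while not done:
                for _, ready in sel.select():
                    if ready & selectors.EVENT_READ:
                        try:
                            chunk = os.read(master, 0x10000)
                        except OSError as err:
                            # Linux raises EIO once the slave is closed
                            if err.errno != EIO:
                                raise
                            chunk = b""
                        if not chunk:
                            done = True
                            break
                        output.extend(chunk)
                    if ready & selectors.EVENT_WRITE and data:
                        data = data[os.write(master, data):]
                        if not data:
                            # Nothing left to send; only wait for output now
                            sel.modify(master, selectors.EVENT_READ)
    finally:
        os.close(master)  # lets a child that is still waiting on input exit
        proc.wait()
    return bytes(output)
```

The returned bytes include the terminal's echo of the input as well as the child's own output, so a test would normally check for a substring rather than compare the whole result.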

I will leave this open for a documentation patch.
History
Date User Action Args
2016-05-15 16:01:48 martin.panter set messages: + msg265626; stage: commit review -> needs patch
2016-05-15 15:05:47 python-dev set messages: + msg265618
2016-05-15 13:23:00 python-dev set messages: + msg265614
2016-05-15 09:28:20 martin.panter set messages: + msg265604
2016-05-15 04:07:11 python-dev set messages: + msg265575
2016-05-15 03:57:03 martin.panter set messages: + msg265574; versions: + Python 2.7, Python 3.5
2016-05-15 03:36:47 tylercrompton set messages: + msg265572
2016-05-15 03:12:14 martin.panter set messages: + msg265571
2016-05-15 03:07:54 python-dev set messages: + msg265570
2016-05-15 02:07:45 python-dev set nosy: + python-dev; messages: + msg265567
2016-05-13 23:40:38 martin.panter set messages: + msg265497; stage: patch review -> commit review
2016-05-09 13:49:24 meador.inge set nosy: + meador.inge; messages: + msg265192
2016-05-09 05:29:10 martin.panter set files: + set_auto_history.v2.patch; messages: + msg265184
2016-05-08 08:56:34 martin.panter set stage: patch review; messages: + msg265119; versions: - Python 2.7, Python 3.5
2016-05-06 09:46:46 tylercrompton set status: pending -> open
2016-05-06 09:45:48 tylercrompton set status: open -> pending; files: + test_set_auto_history.py; messages: + msg264955
2016-05-06 09:41:54 tylercrompton set files: + set_auto_history.patch; keywords: + patch
2016-04-27 13:34:58 tylercrompton set messages: + msg264378
2016-04-27 10:02:19 martin.panter set versions: - Python 3.2, Python 3.3, Python 3.4; nosy: + martin.panter; messages: + msg264370; components: + Extension Modules, - Library (Lib)
2016-04-27 08:14:06 tylercrompton set nosy: + twouters
2016-04-27 07:52:14 tylercrompton create