Unexpected call to readline's add_history in call_readline #71057

Open
tylercrompton mannequin opened this issue Apr 27, 2016 · 18 comments
Labels
extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error

Comments

@tylercrompton
Mannequin

tylercrompton mannequin commented Apr 27, 2016

BPO 26870
Nosy @Yhg1s, @meadori, @vadmium, @tylercrompton
Files
  • readline_history.py: script that demonstrates the described behavior
  • set_auto_history.patch: Patch to implement new function in readline module
  • test_set_auto_history.py: Test script for new function in readline module
  • set_auto_history.v2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2016-04-27.07:52:14.793>
    labels = ['extension-modules', 'type-bug']
    title = "Unexpected call to readline's add_history in call_readline"
    updated_at = <Date 2016-05-15.16:01:48.548>
    user = 'https://github.com/tylercrompton'

    bugs.python.org fields:

    activity = <Date 2016-05-15.16:01:48.548>
    actor = 'martin.panter'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Extension Modules']
    creation = <Date 2016-04-27.07:52:14.793>
    creator = 'tylercrompton'
    dependencies = []
    files = ['42626', '42748', '42749', '42788']
    hgrepos = []
    issue_num = 26870
    keywords = ['patch']
    message_count = 18.0
    messages = ['264360', '264370', '264378', '264955', '265119', '265184', '265192', '265497', '265567', '265570', '265571', '265572', '265574', '265575', '265604', '265614', '265618', '265626']
    nosy_count = 5.0
    nosy_names = ['twouters', 'meador.inge', 'python-dev', 'martin.panter', 'tylercrompton']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'needs patch'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue26870'
    versions = ['Python 2.7', 'Python 3.5', 'Python 3.6']

    @tylercrompton
    Mannequin Author

    tylercrompton mannequin commented Apr 27, 2016

    I was implementing a REPL using the readline module and noticed that there are extraneous calls to readline's add_history function in call_readline [1]. This was a problem because there were some lines that, based on their composition, I might not want in the history. Figuring out why I was getting two entries for every

    The function call has been around ever since Python started supporting GNU Readline (first appeared in Python 1.4 or so, I believe) [2]. This behavior doesn't seem to be documented anywhere.

    I can't seem to find any code that depends on a line that is read in by call_readline to be added to the history. I guess the user might rely on the interactive interpreter to use the history feature. Beyond that, I can't think of any critical purpose for it.

    There are four potential workarounds:

    1. Don't use the input function. Unfortunately, this is a non-solution as it prevents one from using Readline/libedit for input operations.
    2. Don't use Readline/libedit. For the same reasons, this isn't a good solution.
    3. Evaluate get_current_history_length() and store its result. Evaluate input(). Evaluate get_current_history_length() again. If the length changed, execute readline.remove_history_item(readline.get_current_history_length() - 1). Note that one can't assume that the length will change after a call to input, because blank lines aren't added to the history. This isn't an ideal solution for obvious reasons. It's a bit convoluted.
    4. Use some clever combination of readline.get_line_buffer, tty.setcbreak, termios.tcgetattr, termios.tcsetattr, msvcrt.getwche, and try-except-finally blocks. Besides the obvious complexities in this solution, this isn't particularly platform-independent.
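
    For reference, workaround 3 could be sketched roughly as follows. This is only an illustration: `input_without_history` is a hypothetical helper name, and the `reader` and `history` parameters are assumptions added so the logic can be exercised without a real terminal.

```python
def input_without_history(prompt="", reader=input, history=None):
    """Read a line, then undo the history entry that input() may have added."""
    if history is None:
        import readline as history  # real module; needs GNU Readline or libedit

    before = history.get_current_history_length()
    line = reader(prompt)
    after = history.get_current_history_length()
    if after > before:
        # Blank lines are not added, so only remove an entry if one appeared.
        # remove_history_item() takes a zero-based position.
        history.remove_history_item(after - 1)
    return line
```

    The length comparison is what makes this convoluted: without it, a blank line would cause an unrelated, older entry to be removed.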

    I think that it's fair to say that none of the above options are desirable. So let's discuss potential solutions.

    1. Remove this feature from call_readline. Not only would this cause a regression in the interactive interpreter, but many people also rely on this behavior when using the readline module.
    2. Dynamically swap histories (or readline configurations in general) between readline-capable calls to input and prompts in the interactive interpreter. This would surely be too fragile and add unnecessary overhead.
    3. Document this behavior and leave the code alone. I wouldn't say that this is a solution, but it would at least help other developers that would fall in the same trap that I did.
    4. Add a keyword argument to input to instruct call_readline to not add the line to the history. Personally, this seems a bit dirty.
    5. Add a readline function in the readline module that doesn't rely on call_readline. Admittedly, the implementation would have to look eerily similar to call_readline, so perhaps there could be a flag on call_readline. However, that would require touching a few files that don't seem to be particularly related. But a new function might be confusing since call_readline sounds like a name that you'd give such a function.

    I think that the last option would be a pretty clean change that would cause the least number of issues (if any) for existing code bases. Regardless of the implementation details, I think that this would be the best route—to add a Python function called readline to the readline module. I would imagine that this would be an easy change/addition.

    I'm attaching a sample script that demonstrates the described issue.

    @tylercrompton tylercrompton mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Apr 27, 2016
    @vadmium
    Member

    vadmium commented Apr 27, 2016

    Solution 3 (documentation) can easily be done for existing versions of Python. Any other fix I think would have to be done in the next Python version (3.6).

    By solution 2 (dynamically swap histories), do you mean add an API to save and load the entire history list somewhere, like read/write_history_file(), but maybe to a Python list instead?
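
    To illustrate what saving and loading the history as a Python list might amount to, here is a rough sketch built only from existing readline functions. The helper names `snapshot_history` and `restore_history` are hypothetical, and `rl` is assumed to be the readline module (or an object with the same functions). Note that clear_history() is only available when the underlying library supports it.

```python
def snapshot_history(rl):
    """Return the current history as a list of strings, oldest first."""
    # get_history_item() uses one-based indices.
    return [rl.get_history_item(i)
            for i in range(1, rl.get_current_history_length() + 1)]


def restore_history(rl, items):
    """Replace the current history with the given list of strings."""
    rl.clear_history()
    for item in items:
        rl.add_history(item)
```

    A caller would take a snapshot before its own prompts, and restore it afterwards so the interactive interpreter's history is left untouched.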

    A variation on solution 4 came to my mind: Add a global variable or function to the readline module to enable or disable automatic history addition. Or maybe a callback that is run after a line is entered, with it set to the current implementation by default.

    If I understand solution 5 (new version of call_readline), that also seems reasonable. I guess you mean to add a new function to replace the Python call to input()? A minor downside would be that the caller has to do extra work if it wants to use input() when the Readline library is unavailable.

    @vadmium vadmium added extension-modules C modules in the Modules dir and removed stdlib Python modules in the Lib dir labels Apr 27, 2016
    @tylercrompton
    Mannequin Author

    tylercrompton mannequin commented Apr 27, 2016

    I agree about the documentation. It would only take a few minutes to do.

    In regard to your inquiry about the second solution, more or less, yes. I left that one a bit ambiguous since there are many ways to skin that cat. But in retrospect, I probably shouldn't have included that potential solution since it'd be a bit goofy and wouldn't necessarily be much different (in terms of code conciseness) than the third workaround that I mentioned.

    As for your suggestion about the fourth solution, I like that idea. I feel that it's actually more similar to the fifth solution than the fourth, but I feel that we're getting close to coming up with something that should be easy and effective.

    Lastly, in regard to the fifth solution, yes. But I see what you're saying about the downside. Yeah, that would be rather annoying. For a moment, I thought that it could fall back to standard IO in the absence of Readline/libedit, but that would be a bit misleading for a function in the readline module. Besides, input already does that anyway.

    I would imagine that in the vast majority of cases of using such a new function, the developer would fall back to input, because they'd likely prioritize getting the content from the user over ensuring Readline/libedit functionality. Since we already have a function that automatically does that, I think your suggestion to add a function to the readline module to enable/disable automatic history addition would be ideal. I'd be reluctant to use a global Python variable since it would be inconsistent with the rest of the members (i.e., functions) of the module.

    I like the idea of adding a set_auto_history(flag=True|False) or something to that effect.
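
    A sketch of how a REPL might use such a function, assuming it takes the form `readline.set_auto_history(enabled)` (the name under which it was later committed in this thread). The helper `read_filtered` and its `keep`, `reader`, and `rl` parameters are hypothetical and exist only for illustration.

```python
def read_filtered(prompt, keep, reader=input, rl=None):
    """Read a line and add it to the history only if keep(line) is true."""
    if rl is None:
        import readline as rl  # set_auto_history() needs Python 3.6+

    rl.set_auto_history(False)  # stop call_readline from calling add_history
    try:
        line = reader(prompt)
    finally:
        rl.set_auto_history(True)  # restore the default behavior
    if line and keep(line):
        rl.add_history(line)
    return line
```

    For example, `read_filtered("> ", lambda s: not s.startswith(" "))` would keep lines with a leading space out of the history while still offering line editing.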

    @tylercrompton
    Mannequin Author

    tylercrompton mannequin commented May 6, 2016

    I couldn't think of a way to test input() with the unittest module without bypassing readline or requesting the input from the user. With that said, this informal script verifies proper behavior.

    @vadmium
    Member

    vadmium commented May 8, 2016

    I left a few minor comments in the code review.

    I agree automated testing would be awkward for Readline. It should be possible using a pseudoterminal (pty). In fact there is already very basic testing that does this in /Lib/test/test_builtin.py, class PtyTests. It only tests the input() prompt.

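    I could have a go at writing a test. I guess a rough sketch of a test would look a bit like:

    import os
    import pty
    import subprocess
    import sys
    import unittest

    def run_pty(script, input=b"dummy input\r"):
        # Run the script in a child process attached to a pseudoterminal
        master, slave = pty.openpty()
        with subprocess.Popen([sys.executable, "-c", script],
                              stdin=slave, stdout=slave, stderr=slave):
            os.close(slave)
            # A real test should read and write concurrently like proc.communicate()
            os.write(master, input)
            output = bytearray()
            while True:
                try:
                    chunk = os.read(master, 1024)
                except OSError:  # EIO on Linux once the child has exited
                    break
                if not chunk:
                    break
                output.extend(chunk)
        os.close(master)
        return bytes(output)

    template = """\
    import readline
    readline.set_auto_history({})
    input()
    print("History length:", readline.get_current_history_length())
    """

    class AutoHistoryTest(unittest.TestCase):
        def test_auto_history_enabled(self):
            output = run_pty(template.format(True))
            self.assertIn(b"History length: 1\r\n", output)

        def test_auto_history_disabled(self):
            output = run_pty(template.format(False))
            self.assertIn(b"History length: 0\r\n", output)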

    @vadmium
    Member

    vadmium commented May 9, 2016

    Here is a patch including a unit test. I have only tested it on Linux. It would be awesome if other people could test it on other Unix platforms, in case the pseudoterminal stuff needs fixing.

    @meadori
    Member

    meadori commented May 9, 2016

    Works fine on Darwin (OS X Yosemite 10.10.5):

    drago:cpython meadori$ uname -a
    Darwin drago 14.5.0 Darwin Kernel Version 14.5.0: Mon Jan 11 18:48:35 PST 2016; root:xnu-2782.50.2~1/RELEASE_X86_64 x86_64

    @vadmium
    Member

    vadmium commented May 13, 2016

    Thank you for testing this, Meador; it increases my confidence in my code. I’ll try to commit this soon if there are no objections.

    @python-dev
    Mannequin

    python-dev mannequin commented May 15, 2016

    New changeset 4195fa81b188 by Martin Panter in branch 'default':
    Issue bpo-26870: Add readline.set_auto_history(), originally by Tyler Crompton
    https://hg.python.org/cpython/rev/4195fa81b188

    @python-dev
    Mannequin

    python-dev mannequin commented May 15, 2016

    New changeset 27a49daf7925 by Martin Panter in branch 'default':
    Issue bpo-26870: Close pty master in case of exception
    https://hg.python.org/cpython/rev/27a49daf7925

    @vadmium
    Member

    vadmium commented May 15, 2016

    My test locked up on an OS X buildbot <http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/4623/steps/test/logs/stdio>:

    Timeout (0:15:00)!
    Thread 0x00007fff71296cc0 (most recent call first):
    File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/subprocess.py", line 1612 in _try_wait
    File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/subprocess.py", line 1662 in wait
    File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/subprocess.py", line 1003 in __exit__
    File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/test_readline.py", line 146 in run_pty
    File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/test_readline.py", line 115 in test_auto_history_disabled

    The parent was hanging as it exits the child proc context manager, waiting for the child to exit. Perhaps some exception happened in the parent before it could write to the child, so the child is still waiting on input. So I added code to close the master in case of exception, which may at least help understand the situation.

    @tylercrompton
    Mannequin Author

    tylercrompton mannequin commented May 15, 2016

    I suppose the only thing that could be left is adding remarks in the documentation for previous versions. If I understand correctly, this would only be added to the documentation for Python 2.7 and 3.5. Is this correct?

    Since this is the first issue in which I've submitted a patch, I'm still quite new to the CPython development workflow; is there a special way to indicate that a patch should be applied to a branch other than default? Or is that done simply by informally indicating so in a message?

    @vadmium
    Member

    vadmium commented May 15, 2016

    Yes, 3.5 and 2.7 are open for documentation fixes. As long as there are no major differences, it is usually easier to make one patch against 3.5 or 3.6, and then I can fix any minor differences when applying it to 2.7.

    @python-dev
    Mannequin

    python-dev mannequin commented May 15, 2016

    New changeset b8b2c5cc7e9d by Martin Panter in branch 'default':
    Issue bpo-26870: Temporary debugging for OS X Snow Leopard lockup
    https://hg.python.org/cpython/rev/b8b2c5cc7e9d

    @vadmium
    Member

    vadmium commented May 15, 2016

    Unfortunately, I think my debugging messages needed a flush() call to be effective when the deadlock happens.

    So the test is failing on three buildbots:

    • AMD64 Snow Leop 3.x: stuck at Lib/subprocess.py:1612, os.waitpid() waiting for child to exit
    • x86 OpenBSD 3.x: stuck at Lib/selectors.py:569, sel.select() -> self._kqueue.control() waiting for pty event
    • x86 Tiger 3.x: sel.register() -> self._kqueue.control() raises ENOTSUP

    On the other hand, the test is passing on FreeBSD, OpenIndiana, AIX, and Linux (+ Meador’s OS X Yosemite). I think I might just skip the test on the problematic platforms, unless anyone has any suggestions for these other platforms:

    @unittest.skipIf(sys.platform.startswith(("darwin", "openbsd")), "pty test locks up on this platform")

    @python-dev
    Mannequin

    python-dev mannequin commented May 15, 2016

    New changeset 816e1fe72c1e by Martin Panter in branch 'default':
    Issue bpo-26870: Avoid using kqueue() with pseudo-terminals
    https://hg.python.org/cpython/rev/816e1fe72c1e

    @python-dev
    Mannequin

    python-dev mannequin commented May 15, 2016

    New changeset daaead1dc3e0 by Martin Panter in branch 'default':
    Issue bpo-26870: Poll() also fails on OS X; try select()
    https://hg.python.org/cpython/rev/daaead1dc3e0

    @vadmium
    Member

    vadmium commented May 15, 2016

    Looks like I have ironed all the bugs out. These OS bugs/quirks have already been documented in the Python bug tracker:

    1. kqueue() incompatible with PTYs on OS X < 10.9 and OpenBSD (Issue bpo-20365, bpo-20667 respectively)
    2. poll() also incompatible with PTYs on OS X (10.4, 10.6), probably the same as bpo-20472
    3. kill() raises ESRCH if the pid is a zombie (bpo-16762)
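
    As an illustration of the first two points, reading from a file descriptor with an explicit select()-based selector, rather than the platform default (which may be kqueue() or poll()), might look roughly like this. It is a minimal sketch and not the code from the changesets above; `read_all` and its `timeout` parameter are hypothetical.

```python
import os
import selectors


def read_all(fd, timeout=5.0):
    """Collect bytes from fd until EOF, waiting with select() only."""
    chunks = []
    # SelectSelector always uses select(), unlike DefaultSelector.
    with selectors.SelectSelector() as sel:
        sel.register(fd, selectors.EVENT_READ)
        while True:
            if not sel.select(timeout):
                raise TimeoutError("no data within timeout")
            try:
                chunk = os.read(fd, 4096)
            except OSError:
                # Linux reports EIO on a pty master once the slave is closed.
                break
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

    The timeout keeps a stuck child from hanging the caller indefinitely, which is the failure mode seen on the buildbots.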

    I will leave this open for a documentation patch.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022