PyRun_InteractiveLoop fails to run interactively when using a Linux pty that's not tied to stdin/stdout #59121

Closed
KevinBarry mannequin opened this issue May 25, 2012 · 20 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

KevinBarry mannequin commented May 25, 2012

BPO 14916
Nosy @pmp-p, @miss-islington
PRs
  • bpo-14916: use specified tokenizer fd for file input #22190
  • bpo-14916: use specified tokenizer fd for file input #31006
  • [3.10] bpo-14916: use specified tokenizer fd for file input (GH-31006) #31065
Files
  • working.c: A test program that hopefully causes the behavior described. You need xterm, and libreadline and libncurses (with the respective headers.)
  • Python-2.6.8-Run_Interactive-fix.patch
  • working2.c: A test program to be used with a version of Python built with the patch (Python-2.6.8-Run_Interactive-fix.patch) above.
  • Python-2.6.6-Run_Interactive-fix.patch: The second iteration of a patch to fix interactivity from the C API.
  • working3.c: A simplified version of the previous example that demonstrates the problem (before patch) and proper functionality (after patch.)
  • bug.sh
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2012-05-25.15:42:37.991>
    labels = ['interpreter-core', 'type-bug', '3.8', '3.9', '3.7', 'library']
    title = "PyRun_InteractiveLoop fails to run interactively when using a Linux pty that's not tied to stdin/stdout"
    updated_at = <Date 2022-02-03.23:32:27.732>
    user = 'https://bugs.python.org/KevinBarry'

    bugs.python.org fields:

    activity = <Date 2022-02-03.23:32:27.732>
    actor = 'miss-islington'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core', 'Library (Lib)']
    creation = <Date 2012-05-25.15:42:37.991>
    creator = 'Kevin.Barry'
    dependencies = []
    files = ['25706', '26491', '26492', '26501', '26503', '29432']
    hgrepos = []
    issue_num = 14916
    keywords = ['patch']
    message_count = 19.0
    messages = ['161586', '166258', '166259', '166301', '166303', '172411', '184395', '184399', '184403', '184410', '184537', '184812', '185144', '185229', '185250', '252224', '376640', '412317', '412486']
    nosy_count = 5.0
    nosy_names = ['pmpp', 'Kevin.Barry', 'emmanuel', 'Yauheni Kaliuta', 'miss-islington']
    pr_nums = ['22190', '31006', '31065']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue14916'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9']


    I have been trying to get PyRun_InteractiveLoop to run on a pty (Linux) without replacing stdin and stdout with that pty; however, it seems like Python (2.6.6) is hard-coded to only run interactively on stdin and stdout.

    Compile the attached program with:

    gcc `python-config --cflags` working.c -o working `python-config --ldflags`

    and run it with:

    ./working xterm -S/0

    and you should see that there is no interactivity in the xterm that's opened.

    Compile the attached file with:

    gcc -DREADLINE_HACK `python-config --cflags` working.c -o working `python-config --ldflags` -lreadline -lcurses

    and run it with:

    ./working xterm -S/0

    to see how it runs with my best attempt to get it to function properly with a readline hack. Additionally, try running:

    ./working xterm -S/0 > /dev/null
    ./working xterm -S/0 < /dev/null

    both of which should cause interactivity in the xterm to fail, indicating that Python is checking stdin/stdout for tty status when determining if it should run interactively (i.e. it's not checking the tty status of the file passed to PyRun_InteractiveLoop.)
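The distinction drawn above, between checking the tty status of stdin/stdout and checking that of the file actually passed in, can be illustrated without Python at all. The sketch below is not CPython code; `stream_is_interactive` is a hypothetical helper that asks the question of the caller's `FILE*` instead of the standard streams.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not a CPython function: report whether the stream
   the caller supplied is a terminal, regardless of what stdin/stdout are. */
int stream_is_interactive(FILE *fp)
{
    assert(fp != NULL);           /* precondition: an open stream */
    int fd = fileno(fp);          /* -1 if the stream has no descriptor */
    return fd >= 0 && isatty(fd) == 1;
}
```

For a `FILE*` wrapping the slave side of a pty this returns 1 even when the program is started with `< /dev/null > /dev/null`; for a regular file or a pipe it returns 0.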

    Am I somehow using this function wrong? I've been trying to work around this problem for a while, and I don't think I should be using readline hacks (especially since they don't port to other OSes with ptys, e.g. OS X.) I even tried to patch the call to PyOS_Readline in tok_nextc (Parser/tokenizer.c) to use tok->fp instead of stdin/stdout, which caused I/O to use the pty but it still failed to make interactivity work.

    Thanks!

    Kevin Barry

    @KevinBarry KevinBarry mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels May 25, 2012

    KevinBarry mannequin commented Jul 24, 2012

    Here is a patch that corrects the problem (quoted below and attached.) This only corrects the problem when 'PyOS_ReadlineFunctionPointer' is set, e.g. you must 'import readline', otherwise Python will defer to stdin/stdout with 'PyOS_StdioReadline'.

    The patch:

    --- Python-2.6.8/Parser/tokenizer.c     2012-04-10 11:32:11.000000000 -0400
    +++ Python-2.6.8-patched/Parser/tokenizer.c     2012-07-23 19:56:39.645992101 -0400
    @@ -805,7 +805,7 @@
                 return Py_CHARMASK(*tok->cur++);
             }
             if (tok->prompt != NULL) {
    -            char *newtok = PyOS_Readline(stdin, stdout, tok->prompt);
    +            char *newtok = PyOS_Readline(tok->fp? tok->fp : stdin, tok->fp? tok->fp : stdout, tok->prompt);
                 if (tok->nextprompt != NULL)
                     tok->prompt = tok->nextprompt;
                 if (newtok == NULL)

    Kevin Barry


    KevinBarry mannequin commented Jul 24, 2012

    I've attached a new example source file to demonstrate the fix.

    Compile the attached program with (after patching and installing Python):

    gcc `python-config --cflags` working2.c -o working2 `python-config --ldflags`

    and run it with:

    ./working2 xterm -S/0 < /dev/null > /dev/null

    (The redirection shows that it works when stdin/stdout aren't a tty.)

    I looked at the most-recent revision of tokenizer.c (http://hg.python.org/cpython/file/52032b13243e/Parser/tokenizer.c) and see that the change in my patch above hasn't been made already.

    Kevin Barry


    KevinBarry mannequin commented Jul 24, 2012

    The patch from before needed a slight modification for when Python actually defaults to an interactive session on stdin. Since I rebuilt this for my current distro (Slackware64 13.37), I switched to the Python 2.6.6 source. This might not be the proper way to handle the default case (e.g. 'Py_Main'), but it's a start.

    The patch (also attached):

    --- ./Parser/tokenizer.c.orig   2012-07-23 22:24:56.513992301 -0400
    +++ ./Parser/tokenizer.c        2012-07-23 22:23:24.329992167 -0400
    @@ -805,7 +805,7 @@
                 return Py_CHARMASK(*tok->cur++);
             }
             if (tok->prompt != NULL) {
    -            char *newtok = PyOS_Readline(stdin, stdout, tok->prompt);
    +            char *newtok = PyOS_Readline(tok->fp? tok->fp : stdin, (tok->fp && tok->fp != stdin)? tok->fp : stdout, tok->prompt);
                 if (tok->nextprompt != NULL)
                     tok->prompt = tok->nextprompt;
                 if (newtok == NULL)
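
The stream choice made by the two ternaries in the patch above can be read in isolation. This is only a sketch of that selection logic, lifted out of the tokenizer; `select_streams` is a hypothetical name and not a CPython function.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper mirroring the patched PyOS_Readline() arguments. */
void select_streams(FILE *fp, FILE **in, FILE **out)
{
    assert(in != NULL && out != NULL);
    /* Input: the caller's stream when one was supplied, else stdin. */
    *in = fp ? fp : stdin;
    /* Output: the caller's stream too, unless that stream is stdin itself,
       in which case stdout is kept (the default interactive session). */
    *out = (fp && fp != stdin) ? fp : stdout;
}
```

With `fp == stdin` or `fp == NULL` the pair is (stdin, stdout), exactly as before the patch; with any other stream both input and output go to that stream.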

    Kevin Barry


    KevinBarry mannequin commented Jul 24, 2012

    I've attached a simplified example program (working3.c) that demonstrates both the original problem and that the patch (Python-2.6.6-Run_Interactive-fix.patch) works. It eliminates the need for a pty, 'xterm', and redirection.

    Compile the attached program with:

    gcc `python-config --cflags` working3.c -o working3 `python-config --ldflags`

    and run it with (before and after patching):

    ./working3

    Also, for verification, run 'python' with no arguments to show that default interactivity is preserved.

    Kevin Barry


    KevinBarry mannequin commented Oct 8, 2012

    I still see the potential cause addressed by my patch in the 2.7, 3.3, and "default" branches, so I'm assuming that all versions from 2.6 on have this problem.

    I also see that I can elect to change the "Status" and "Resolution" of this report. Does that mean I need to do something besides wait for someone involved in the project to look at my patch?

    Kevin Barry


    emmanuel mannequin commented Mar 17, 2013

    run the attached shell script to observe the bug
    ./bug.sh 0 -> shows the bug
    ./bug.sh 1 -> shows the expected behaviour (using a workaround)
    tested on linux with python 2.7


    KevinBarry mannequin commented Mar 17, 2013

    emmanuel,

    Thanks for the suggestion. Your workaround is exactly the same as using dup2 (in C) to replace stdin/stdout/stderr with the pty, however. If you added the following lines to your C code, it would have the same effect as the command-line redirection in the workaround:

    dup2(fileno(file), STDIN_FILENO);
    dup2(fileno(file), STDOUT_FILENO);
    dup2(fileno(file), STDERR_FILENO);

    In fact, that's exactly what bash does after forking, just before executing "exe". In most cases, developers who use PyRun_InteractiveLoop in a pty probably also do exactly that, which is why I'm the only one who's reported this as a bug. For applications like mine, however, where the interactive Python session needs to be an unobtrusive add-on to an otherwise-complete program, this solution won't work. The standard file descriptors aren't disposable in most of the programs I work on.
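
The dup2 mechanism described above can be sketched for a single descriptor, with the original saved so it can be put back afterwards. The helper names `redirect_fd` and `restore_fd` are hypothetical. Saving and restoring only illustrates the mechanism; it does not make the workaround suitable for a program that needs its standard descriptors while the session is running.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/* Make `target` refer to the same open file as `replacement`.
   Returns a duplicate of the original `target` (for restore_fd), or -1. */
int redirect_fd(int target, int replacement)
{
    assert(target >= 0 && replacement >= 0);
    int saved = dup(target);              /* keep the original alive */
    if (saved < 0)
        return -1;
    if (dup2(replacement, target) < 0) {  /* closes and replaces target */
        close(saved);
        return -1;
    }
    return saved;
}

/* Undo redirect_fd(): point `target` back at the saved original. */
int restore_fd(int target, int saved)
{
    int rc = dup2(saved, target);
    close(saved);                         /* the duplicate is no longer needed */
    return rc < 0 ? -1 : 0;
}
```

Calling `redirect_fd(STDIN_FILENO, fileno(file))`, and likewise for `STDOUT_FILENO` and `STDERR_FILENO`, reproduces the three dup2 lines quoted above while keeping a way back.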

    Thanks again!

    Kevin Barry


    emmanuel mannequin commented Mar 17, 2013

    Kevin,

    Indeed the code I submitted can be written entirely in C using pipe fork execl dup2 etc. as you suggest. The only purpose of mixing bash and C is to have a short self-contained file showing the problem.

    Anyway, whether in C or bash the workaround is less than satisfying in that it uses up fds 0 1 2, and I have exactly the same goal and constraints as you.

    Finally, thanks for identifying the limitation in the python implementation and submitting a patch. Like you I hope it will be eventually applied.


    KevinBarry mannequin commented Mar 18, 2013

    One additional issue, which my patch doesn't address, is that PyRun_InteractiveLoop should really take *two* FILE* arguments, with the second one being optional. This is because on Linux (and presumably on other *nixes) if a read operation is blocked on a file descriptor then write operations (from other threads) to the same file descriptor will also block. That doesn't happen in the current Python implementations because PyOS_Readline is always called with two FILE* objects, anyway (stdin and stdout.) I would, however, expect such a problem to appear if a user created a Python thread in the interactive session that periodically printed to the terminal, then read input from the terminal. In that case, I would expect to see no output from the thread while the read operations were blocked, but I haven't tested it. (I don't remember if this came up after I applied my patch locally.)
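
A minimal sketch of what passing two `FILE*` objects for one terminal could look like, assuming the descriptor is open for both reading and writing (as a pty is). `split_streams` is a hypothetical helper, not part of Python; whether this avoids the blocking described above has not been tested here.

```c
#define _POSIX_C_SOURCE 200809L  /* for fdopen() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Create two independent stdio streams over duplicates of `fd`:
   one for reading, one for writing. Returns 0 on success, -1 on failure.
   The caller's `fd` is left open and untouched. */
int split_streams(int fd, FILE **in, FILE **out)
{
    assert(in != NULL && out != NULL);
    int rfd = dup(fd);
    if (rfd < 0)
        return -1;
    int wfd = dup(fd);
    if (wfd < 0) {
        close(rfd);
        return -1;
    }
    FILE *r = fdopen(rfd, "r");
    if (r == NULL) {
        close(rfd);
        close(wfd);
        return -1;
    }
    FILE *w = fdopen(wfd, "w");
    if (w == NULL) {
        fclose(r);                /* also closes rfd */
        close(wfd);
        return -1;
    }
    *in = r;
    *out = w;
    return 0;
}
```

Each stream has its own buffer and its own descriptor, so a reader blocked on `in` does not hold anything that a writer on `out` needs at the stdio level.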

    I actually considered this when I created the patch; however, I didn't feel like going to all the trouble of adding a member to tok and propagating the change throughout the entire core. I had hoped this bug would get more attention and I'd be able to discuss it with a developer involved in the Python project, but ultimately that didn't happen and I ended up forgetting about it.

    Kevin Barry


    emmanuel mannequin commented Mar 18, 2013

    Kevin,

    These are good points.

    I had a cursory look at the python source code and observed the following:
    - There may also be a concern with stderr (used to print the prompt in PyOS_Readline)
    - PyOS_Readline has two different definitions in files pgenmain.c and myreadline.c
    - There is this interesting comment in myreadline.c:
    /* By initializing this function pointer, systems embedding Python can
       override the readline function.
       Note: Python expects in return a buffer allocated with PyMem_Malloc. */
    char *(*PyOS_ReadlineFunctionPointer)(FILE *, FILE *, char *);

    This pointer is actually used (set it to (void*)1 and the interpreter crashes) so it could offer a means to redirect stdin as we want. For stdout/stderr further investigation is needed.
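
A function with the shape that pointer expects could look like the sketch below. It is written without Python headers so it can be read on its own: `simple_readline` is a hypothetical name, the `const char *` prompt type is an assumption (the comment quoted above shows `char *`), and `malloc` is only a stand-in, since the quoted comment says Python expects a buffer from `PyMem_Malloc`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical readline replacement: write the prompt to `out`, read one
   line (including its newline, if any) from `in`, and return it in a
   heap buffer. Returns an empty string at end of input, NULL on
   allocation failure. */
char *simple_readline(FILE *in, FILE *out, const char *prompt)
{
    assert(in != NULL && out != NULL);
    if (prompt != NULL) {
        fputs(prompt, out);
        fflush(out);              /* make the prompt visible before blocking */
    }
    size_t cap = 128, len = 0;
    char *buf = malloc(cap);      /* stand-in for the allocator Python expects */
    if (buf == NULL)
        return NULL;
    int c;
    while ((c = fgetc(in)) != EOF) {
        if (len + 2 > cap) {      /* room for this char plus the terminator */
            char *grown = realloc(buf, cap * 2);
            if (grown == NULL) {
                free(buf);
                return NULL;
            }
            buf = grown;
            cap *= 2;
        }
        buf[len++] = (char)c;
        if (c == '\n')
            break;
    }
    buf[len] = '\0';
    return buf;
}
```

Because the function only touches the two streams it is handed, it does what the caller asks as long as the caller passes the right streams, which is the gap in tok_nextc discussed in this thread.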


    emmanuel mannequin commented Mar 20, 2013

    Kevin,
    I've read your messages more carefully and investigated some more.
    It seems that there are several issues:
    1/ To take input from a defined tty without interfering with standard file descriptors
    2/ To have the result (object) of evaluation printed to a defined tty without interfering with standard file descriptors
    3/ (optionally) To direct to the tty (or not) the output that is a side effect of the evaluation
    Provided that no one messes with PyOS_ReadlineFunctionPointer (as with "import readline") it should be possible to solve 1 without modifying Python, and "approximately" solve 2 (with 3 implied out of necessity).
    On the other hand, modifying Python as you suggest could solve 1, but issues 2 and 3 would still remain and probably require some other modifications.


    KevinBarry mannequin commented Mar 24, 2013

    emmanuel,

    Regarding your points: All three can be taken care of with a combination of my patch and setting sys.stdin, sys.stdout, and sys.stderr to the pty. (That should really be done internally with another patch, since os.fdopen is OS-specific. Also, sys.stdin, sys.stdout, and sys.stderr should each have distinct underlying file descriptors, which I didn't do in working.c.) Those can safely be replaced since they're just the "effective" standard files, and sys.__stdin__ et al. refer to the actual C stdin et al. The remaining issue would be that the same descriptor shouldn't be used for both input and output in the interpreter loop, especially if the FILE* passed is only open for reading (since standard input technically doesn't have to be writable.)

    Kevin Barry


    emmanuel mannequin commented Mar 25, 2013

    Kevin,
    I now fully agree with you. Regarding points 2 & 3, I dismissed modifying sys.stdin/out in Python out of hand because it still would not allow proper behaviour with two concurrent consoles on the same interpreter. Anyway, this is not a very meaningful use case, so modifying sys.stdin/out looks like the best solution.
    As for point 1, it can be worked around in a twisted and probably non-portable way (setting PyOS_ReadlineFunctionPointer to a custom function that forks a child which runs GNU readline or whatnot on its fds 0,1,2, and which communicates with the parent through pipes) but it's a pity that from PyOS_Readline() on, argument sys_stdin is correctly passed down, and that the chain has only this gap in tok_nextc() which dismisses the caller's argument and uses plain stdin.
    If the user arguments are not used, why does PyRun_InteractiveLoop() take any arguments at all?
    It would be nice to know the opinion of the python development team and to work out a complete fix (considering also stdout as you suggest) under their authority.


    KevinBarry mannequin commented Mar 25, 2013

    emmanuel,

    The Python interpreter isn't reentrant, so you could only run two interactive sessions connected to the same Python environment if you implemented your own REPL function that unlocked the GIL when waiting for input, then locked it just long enough to interpret the input.

    Also, the readline module already does what you suggest (sets PyOS_ReadlineFunctionPointer to a GNU libreadline wrapper.) My "readline hack" (working.c) forces it to behave as it's supposed to. Rather, it *undoes* what PyRun_InteractiveLoop does every iteration, which is pass stdin/stdout for I/O, which libreadline in turn uses.

    I agree that it would be nice to get a Python developer involved, mostly because I expect things to break when this problem is fixed. Bad things happen when you think you've tested a lot of different cases that actually turn out to be the exact same case.

    Kevin Barry


    YauheniKaliuta mannequin commented Oct 3, 2015

    Any progress with the problem? I just wanted to use the feature, but it looks like bug.sh still reproduces the bug.


    pmp-p mannequin commented Sep 9, 2020

    All PyRun_InteractiveOne* functions are also affected.
    This is really annoying when implementing REPL behaviour in an embedded interpreter (use cases: pyodide, Android, WASI).

    But I think the correct patch is:

    -            char *newtok = PyOS_Readline(stdin, stdout, tok->prompt);
    +            char *newtok = PyOS_Readline(tok->fp? tok->fp : stdin, stdout, tok->prompt);

    @pmp-p pmp-p mannequin added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes labels Sep 10, 2020
    @miss-islington
    Contributor

    New changeset 89b1304 by Paul m. p. P in branch 'main':
    bpo-14916: use specified tokenizer fd for file input (GH-31006)
    89b1304

    @miss-islington
    Contributor

    New changeset 91e8889 by Miss Islington (bot) in branch '3.10':
    bpo-14916: use specified tokenizer fd for file input (GH-31006)
    91e8889

    @hauntsaninja
    Contributor

    Thanks, looks like this has been fixed
