Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

race condition in initstdio() (python aborts running under nohup) #69079

Closed
yiding mannequin opened this issue Aug 18, 2015 · 19 comments
Closed

race condition in initstdio() (python aborts running under nohup) #69079

yiding mannequin opened this issue Aug 18, 2015 · 19 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@yiding
Copy link
Mannequin

yiding mannequin commented Aug 18, 2015

BPO 24891
Nosy @ronaldoussoren, @vstinner, @rbtcollins, @ned-deily, @mpaolini
Files
  • test.py
  • test.sh
  • issue24891.patch
  • issue24891_2.patch
  • issue24891_3.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2015-09-04.15:35:10.873>
    created_at = <Date 2015-08-18.22:18:48.236>
    labels = ['interpreter-core', 'type-crash']
    title = 'race condition in initstdio() (python aborts running under nohup)'
    updated_at = <Date 2015-09-04.20:23:28.036>
    user = 'https://bugs.python.org/YiDing'

    bugs.python.org fields:

    activity = <Date 2015-09-04.20:23:28.036>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-09-04.15:35:10.873>
    closer = 'vstinner'
    components = ['Interpreter Core']
    creation = <Date 2015-08-18.22:18:48.236>
    creator = 'Yi Ding'
    dependencies = []
    files = ['40209', '40210', '40354', '40355', '40357']
    hgrepos = []
    issue_num = 24891
    keywords = ['patch']
    message_count = 19.0
    messages = ['248797', '248798', '248800', '248801', '248803', '248804', '248805', '248806', '249514', '249753', '249754', '249756', '249760', '249766', '249770', '249773', '249775', '249823', '249824']
    nosy_count = 7.0
    nosy_names = ['ronaldoussoren', 'vstinner', 'rbcollins', 'ned.deily', 'python-dev', 'mpaolini', 'Yi Ding']
    pr_nums = []
    priority = 'low'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue24891'
    versions = ['Python 3.4', 'Python 3.5', 'Python 3.6']

    @yiding
    Copy link
    Mannequin Author

    yiding mannequin commented Aug 18, 2015

    Looks like this bug https://bugs.python.org/issue7111 has resurfaced in python3 (python 2.6 works as far as I can tell) at least on Macs.

    I've attached a simple test script.

    Steps:

    1. SSH to remote server.
    2. Run nohup ./test.sh &
    3. exit SSH.
    4. SSH back in and see that there are a bunch of errors in your nohup.out file.

    @yiding yiding mannequin added the type-crash A hard crash of the interpreter, possibly with a core dump label Aug 18, 2015
    @yiding
    Copy link
    Mannequin Author

    yiding mannequin commented Aug 18, 2015

    test.sh attached

    @rbtcollins
    Copy link
    Member

    What sort of errors?

    @vstinner
    Copy link
    Member

    I'm unable to reproduce the issue on Linux with Python 3.4.2.

    @yiding
    Copy link
    Mannequin Author

    yiding mannequin commented Aug 18, 2015

    Fatal Python error: Py_Initialize: can't initialize sys standard streams
    OSError: [Errno 9] Bad file descriptor
    ./test.sh: line 4: 49632 Abort trap: 6 python3 test.py -o hello.txt

    @yiding
    Copy link
    Mannequin Author

    yiding mannequin commented Aug 18, 2015

    That's from my nohup.out. It might be a Mac OS specific thing.

    @vstinner
    Copy link
    Member

    I can reproduce the bug with gdb if the file descriptor 0 is closed before calling:
        std = create_stdio(iomod, fd, 0, "<stdin>", encoding, errors);
    and after the following lines were called:
        fd = fileno(stdin);
        if (!is_valid_fd(fd)) {

    In initstdio () at Python/pylifecycle.c:1156.

    create_stdio() fails in FileIO constructor, on the line:
    if (_Py_fstat(self->fd, &fdfstat) < 0)
    of _io_FileIO___init___impl() at Modules/_io/fileio.c:480

    It's a corner case of the issue bpo-7111, I would call it a race condition.

    @vstinner
    Copy link
    Member

    Python 2 is not affected, PyFile_FromFile() doesn't check if the file descriptor is valid. On Python 3, we call fstat() to check the block size of the file descriptor to optimize buffering.

    @vstinner vstinner changed the title python aborts running under nohup race condition in initstdio() (python aborts running under nohup) Aug 19, 2015
    @vstinner vstinner added interpreter-core (Objects, Python, Grammar, and Parser dirs) and removed OS-mac labels Aug 19, 2015
    @yiding
    Copy link
    Mannequin Author

    yiding mannequin commented Sep 1, 2015

    I'm not sure this is a race condition. There's a crash every single time python is started when running the test.sh script under nohup.

    If it's a race condition, it should only happen some of the time, not every single time.

    Can you reeevaluate the priority?

    @mpaolini
    Copy link
    Mannequin

    mpaolini mannequin commented Sep 4, 2015

    Attaching a patch. Does it make any sense?

    @mpaolini
    Copy link
    Mannequin

    mpaolini mannequin commented Sep 4, 2015

    ops wrong patch... trying again.

    @vstinner
    Copy link
    Member

    vstinner commented Sep 4, 2015

    Yes, it looks like a good approach. Some comments:

    • please mention this issue "Issue bpo-24891" and explain the issue in a
      comment in create_stdio_checked()
    • it would be safer to check the exception type, to be extra safe
    • you must clear the exception
    • why not putting the code in create_stdio() directly?

    @mpaolini
    Copy link
    Mannequin

    mpaolini mannequin commented Sep 4, 2015

    @Haypo thanks for the quick review. This new issue24891_2.patch covers all of the points you raised except the "check exception type" which I am still figuring out.

    @vstinner
    Copy link
    Member

    vstinner commented Sep 4, 2015

    This new issue24891_2.patch covers all of the points you raised except the "check exception type" which I am still figuring out.

    See my patch issue24891_3.patch which calls PyErr_ExceptionMatches(PyExc_OSError). If you like it, I can push it to Python 3.4-3.6.

    @mpaolini
    Copy link
    Mannequin

    mpaolini mannequin commented Sep 4, 2015

    @Haypo, yeah, definitely better than mine! All good for me.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 4, 2015

    New changeset e67bf9c9a898 by Victor Stinner in branch '3.4':
    Fix race condition in create_stdio()
    https://hg.python.org/cpython/rev/e67bf9c9a898

    @vstinner
    Copy link
    Member

    vstinner commented Sep 4, 2015

    @Haypo, yeah, definitely better than mine! All good for me.

    Ok. I added your name to Misc/ACKS. I had to do some tricks to apply the patch to Python 3.4 (code was in Python/pythonrun.c) and then to merge to Python 3.5 (code moved to Python/pylifecycle.c). But I checked the fix in each version using gdb:

    • put a breakpoint on create_stdio,
    • type "print close(0)" after the first is_valid_fd() check,
    • see that the open() exception is correctly handled (type "next", "next", ... and check that we go to the error: label and then enter the if() block)

    Thanks for your patch Marco.

    I prefer to be extra safe by checking the raised exception to minimize the risk of regression.

    @vstinner vstinner closed this as completed Sep 4, 2015
    @yiding
    Copy link
    Mannequin Author

    yiding mannequin commented Sep 4, 2015

    Thank you everyone! I will test the next rc of 3.5.

    @vstinner
    Copy link
    Member

    vstinner commented Sep 4, 2015

    I don't plan to send a pull request to the Python 3.5 release manager.
    This bug existed since Python 3.0, the fix can wait for Python 3.5.1.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants