Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

possible deadlock in python IO implementation #47868

Closed
vstinner opened this issue Aug 20, 2008 · 21 comments
Closed

possible deadlock in python IO implementation #47868

vstinner opened this issue Aug 20, 2008 · 21 comments
Labels
stdlib Python modules in the Lib dir type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@vstinner
Copy link
Member

BPO 3618
Nosy @gpshead, @jcea, @pitrou, @vstinner, @benjaminp
Files
  • io_deadlock.patch: Use RLock instead of Lock + unit test for the bug
  • rlockio.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2009-03-05.01:02:19.794>
    created_at = <Date 2008-08-20.11:00:50.182>
    labels = ['library', 'type-crash']
    title = 'possible deadlock in python IO implementation'
    updated_at = <Date 2009-03-05.01:02:19.793>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2009-03-05.01:02:19.793>
    actor = 'benjamin.peterson'
    assignee = 'none'
    closed = True
    closed_date = <Date 2009-03-05.01:02:19.794>
    closer = 'benjamin.peterson'
    components = ['Library (Lib)']
    creation = <Date 2008-08-20.11:00:50.182>
    creator = 'vstinner'
    dependencies = []
    files = ['11190', '11374']
    hgrepos = []
    issue_num = 3618
    keywords = ['patch', 'needs review']
    message_count = 21.0
    messages = ['71537', '71538', '71541', '71543', '71545', '71607', '71625', '71626', '72496', '72507', '72510', '78099', '80789', '82924', '82933', '82934', '83136', '83169', '83171', '83175', '83177']
    nosy_count = 6.0
    nosy_names = ['gregory.p.smith', 'jcea', 'ggenellina', 'pitrou', 'vstinner', 'benjamin.peterson']
    pr_nums = []
    priority = 'low'
    resolution = 'wont fix'
    stage = None
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue3618'
    versions = ['Python 3.0']

    @vstinner
    Copy link
    Member Author

    BufferedWriter from Lib/io.py is thread-safe, but... the Python
    instruction "with self._write_lock:" could be interrupted when the
    lock is already acquired. Especially, _lsprof.Profiler() uses ceval
    hook and is called when the lock is acquired. But if the profiler (or
    any other module using the hook) re-use the stream (stdout or stderr),
    we get a deadlock!?

    Well, this problem is really not critical since only developers
    profilers (really?).

    Solution: use C implementation of Lib/io.py (from Python 2.6) to avoid
    ceval hook?

    Example with py3k trunk:
     >>> import _lsprof
     >>> prof=_lsprof.Profiler(42)
     >>> prof.enable()
     # _lsprof calls 42() 
     #   -> 42 is not callable
     #   -> call PyErr_WriteUnraisable(<42 is not callable>)
     #   -> deadlock

    Traceback example:

    #6 0x080ecb30 in PyThread_acquire_lock (lock=0x820f550, waitflag=1)
    at Python/thread_pthread.h:352
    (...)
    #8 0x08162521 in PyCFunction_Call (func=0x83abfbc, arg=0xb7dc6034,
    kw=0x0) at Objects/methodobject.c:81
    #9 0x080b3659 in call_function (pp_stack=0xbfbf9474, oparg=0) at
    Python/ceval.c:3403
    #10 0x080ae7a6 in PyEval_EvalFrameEx (f=0x83b9f94, throwflag=0) at
    Python/ceval.c:2205
    #11 0x080b3aae in fast_function (func=0x83a25b4, pp_stack=0xbfbf9724,
    n=2, na=2, nk=0) at Python/ceval.c:3491
    #12 0x080b37ef in call_function (pp_stack=0xbfbf9724, oparg=1) at
    Python/ceval.c:3424
    #13 0x080ae7a6 in PyEval_EvalFrameEx (f=0x83b981c, throwflag=0) at
    Python/ceval.c:2205
    #14 0x080b1aee in PyEval_EvalCodeEx (co=0x838a328, globals=0x8330534,
    locals=0x0, args=0x8373da0, argcount=2, kws=0x0, kwcount=0, defs=0x0,
    defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:2840
    #15 0x0814ab2e in function_call (func=0x83a3b8c, arg=0x8373d8c,
    kw=0x0) at Objects/funcobject.c:628
    #16 0x08118d19 in PyObject_Call (func=0x83a3b8c, arg=0x8373d8c,
    kw=0x0) at Objects/abstract.c:2181
    #17 0x08132eb0 in method_call (func=0x83a3b8c, arg=0x8373d8c, kw=0x0)
    at Objects/classobject.c:323
    #18 0x08118d19 in PyObject_Call (func=0x83037dc, arg=0x83141f4,
    kw=0x0) at Objects/abstract.c:2181
    #19 0x080b2ed8 in PyEval_CallObjectWithKeywords (func=0x83037dc,
    arg=0x83141f4, kw=0x0) at Python/ceval.c:3283
    #20 0x08141779 in PyFile_WriteObject (v=0x8398e28, f=0x83ab0dc,
    flags=1) at Objects/fileobject.c:164
    #21 0x08141974 in PyFile_WriteString (s=0x819a2f2 "Exception ",
    f=0x83ab0dc) at Objects/fileobject.c:189
    #22 0x080c473c in PyErr_WriteUnraisable (obj=0x81fbd78) at
    Python/errors.c:696
    #23 0xb7f9612f in CallExternalTimer (pObj=0x83a7aa8)
    at /home/haypo/prog/py3k/Modules/_lsprof.c:135
    #24 0xb7f96984 in Stop (pObj=0x83a7aa8, self=0x826c6d8,
    entry=0x826ec80) at /home/haypo/prog/py3k/Modules/_lsprof.c:337
    #25 0xb7f96c60 in ptrace_leave_call (self=0x83a7aa8, key=0x81cb150)
    at /home/haypo/prog/py3k/Modules/_lsprof.c:420
    #26 0xb7f96d5c in profiler_callback (self=0x83a7aa8, frame=0x835a0b4,
    what=6, arg=0x83ab92c) at /home/haypo/prog/py3k/Modules/_lsprof.c:471
    #27 0x080b28cb in call_trace (func=0xb7f96c85 <profiler_callback>,
    obj=0x83a7aa8, frame=0x835a0b4, what=6, arg=0x83ab92c)
    at Python/ceval.c:3090
    #28 0x080b35da in call_function (pp_stack=0xbfbf9d74, oparg=1) at
    Python/ceval.c:3403
    #29 0x080ae7a6 in PyEval_EvalFrameEx (f=0x835a0b4, throwflag=0) at
    Python/ceval.c:2205

    ceval hook: Python/ceval.3403:
       PCALL(PCALL_CFUNCTION);
       if (flags & (METH_NOARGS | METH_O)) {
          ...
       } else {
          PyObject *callargs;
          callargs = load_args(pp_stack, na);
          READ_TIMESTAMP(*pintr0);
          C_TRACE(x, PyCFunction_Call(func,callargs,NULL)); <= **here**
          READ_TIMESTAMP(*pintr1);
          Py_XDECREF(callargs);
       }

    @vstinner vstinner added stdlib Python modules in the Lib dir type-crash A hard crash of the interpreter, possibly with a core dump labels Aug 20, 2008
    @pitrou
    Copy link
    Member

    pitrou commented Aug 20, 2008

    Yes indeed. We could use an RLock to avoid the problem but RLock's are
    damn slow since they are written in pure Python (see bpo-3001). Rewriting
    the critical parts of RLock (constructor, acquire(), release(),
    __enter__(), __exit__()) in C should not be too complicated, would you
    want to do it? :)

    Solution: use C implementation of Lib/io.py (from Python 2.6) to avoid
    ceval hook?

    I guess it's out of question. However, Buffered{Reader,Writer,Random}
    should be rewritten in C one day, it is necessary for speed. Then the
    problem will vanish since a lock will only need to be taken when
    releasing the GIL, that is not when Python code is being interpreted.

    @vstinner
    Copy link
    Member Author

    @pitrou: I wrote an implementation of RLock in C (see bpo-3001). So it
    would be possible to use threading.RLock instead of threading.Lock ;-)

    @vstinner
    Copy link
    Member Author

    Oops, I forgot to update PyInit__Thread() with my new time:

    • Add PyType_Ready()
    • Register RLockType to threading dict

    Here is the new patch

    @vstinner
    Copy link
    Member Author

    Ooops again, I uploaded my patch to the wrong issue! The new patch is
    now in the issue bpo-3001.

    @gpshead
    Copy link
    Member

    gpshead commented Aug 21, 2008

    see the python-3000 thread I just started asking for opinions on this.
    I'd personally say this bug is a good reason to go ahead with bpo-3001 for 3.0.

    @vstinner
    Copy link
    Member Author

    So if we consider that RLock is fast enough (see my C version of RLokc
    in bpo-3001), we can use RLock instead of Lock to avoid this issue. Here
    is a patch to use RLock and also including an unit test of this issue.

    @pitrou
    Copy link
    Member

    pitrou commented Aug 21, 2008

    Selon STINNER Victor <report@bugs.python.org>:

    So if we consider that RLock is fast enough (see my C version of RLokc
    in bpo-3001), we can use RLock instead of Lock to avoid this issue. Here
    is a patch to use RLock and also including an unit test of this issue.

    I tried your test and it seems to lock when using the Python implementation of
    RLock, but perhaps it's precisely because the implementation is in Python.

    Also, your RLock implementation has to be finished before going further ;)
    (see my comments in bpo-3001)

    @pitrou
    Copy link
    Member

    pitrou commented Sep 4, 2008

    We might as well bite the bullet and include a short, minimalist RLock
    implementation in io.py (so as not to pull threading and all its
    dependencies at startup). The C version of RLock will wait for 3.1.

    @pitrou
    Copy link
    Member

    pitrou commented Sep 4, 2008

    Here is a patch. The RLock implementation is naive and as simple as
    possible. It doesn't solve Haypo's case, probably because the tracing
    func kicks in in the RLock code itself.

    I don't want to make a decision on this alone, so someone please advise.

    @pitrou
    Copy link
    Member

    pitrou commented Sep 4, 2008

    (I have to add that the patch makes small reads about 60-80% slower)

    @vstinner
    Copy link
    Member Author

    I hope that this issue will be fixed by io-c (io library rewritten in
    C language).

    @vstinner
    Copy link
    Member Author

    The test (io_deadlock.patch) pass on the io-c branch \o/

    @benjaminp
    Copy link
    Contributor

    Yes, this is solved in the io-c branch. Antoine, do you think we should
    switch _pyio to use the RLock?

    @pitrou
    Copy link
    Member

    pitrou commented Feb 28, 2009

    I don't know. The RLock is a lot slower than the normal non-recursive
    variation, on the other hand I'm not sure we care about performance of
    the Python version that much. Opinions welcome.

    @benjaminp
    Copy link
    Contributor

    2009/2/28 Antoine Pitrou <report@bugs.python.org>:

    Antoine Pitrou <pitrou@free.fr> added the comment:

    I don't know. The RLock is a lot slower than the normal non-recursive
    variation, on the other hand I'm not sure we care about performance of
    the Python version that much. Opinions welcome.

    I'm +0. The deadlock will only affect people specifically messing with
    the Python io implementation. It's low priority, so maybe we should
    just do it when RLock is rewritten in C.

    @benjaminp
    Copy link
    Contributor

    Since io-c has been merging, I'm lowering priority.

    @benjaminp benjaminp changed the title possible deadlock in IO library (Lib/io.py) possible deadlock in python IO implementation Mar 4, 2009
    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 5, 2009

    Since io-c has been merging, I'm lowering priority.

    Why not fixing this issue? The issue is rare (only occurs when using profiling
    with a callback writting to stdout) and you closed the issue bpo-4862 which is
    more common (read an UTF-16 file).

    @benjaminp
    Copy link
    Contributor

    2009/3/4 STINNER Victor <report@bugs.python.org>:

    STINNER Victor <victor.stinner@haypocalc.com> added the comment:

    > Since io-c has been merging, I'm lowering priority.

    Why not fixing this issue? The issue is rare (only occurs when using profiling
    with a callback writting to stdout) and you closed the issue bpo-4862 which is
    more common (read an UTF-16 file).

    See msg82934.

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 5, 2009

    Ooops, I wanted to write: "Why not *closing* this issue?", sorry.

    @benjaminp
    Copy link
    Contributor

    I suppose we might as well. If anyone wants to fix the Python
    implementation later, they can go ahead and reopen this.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants