Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent behaviour of multiprocessing.Queue() if sem_open is not implemented #67589

Closed
olebole mannequin opened this issue Feb 6, 2015 · 19 comments
Closed

Inconsistent behaviour of multiprocessing.Queue() if sem_open is not implemented #67589

olebole mannequin opened this issue Feb 6, 2015 · 19 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@olebole
Copy link
Mannequin

olebole mannequin commented Feb 6, 2015

BPO 23400
Nosy @brettcannon, @ncoghlan, @embray, @ericsnowcurrently, @berkerpeksag, @serhiy-storchaka, @applio
Files
  • issue23400_py35_and_py34.patch
  • issue23400_py27.patch
  • issue23400_py35_and_py34_improveddocs.patch: superceded -- ignore this one
  • issue23400_py27_improveddocs.patch: Patch for 2.7, replaces all prior patches for 2.x.
  • issue23400_py35_and_py34_improveddocsandstyle.patch: Superceded -- ignore this one
  • issue23400_py35_and_py34_furtherimproveddocsandstyle.patch: Improved patch for default/3.5 and 3.4 (both), replaces all prior patches for 3.x.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2015-04-08.15:15:17.350>
    created_at = <Date 2015-02-06.09:03:19.349>
    labels = ['type-bug', 'library']
    title = 'Inconsistent behaviour of multiprocessing.Queue() if sem_open is not implemented'
    updated_at = <Date 2015-04-08.15:15:17.349>
    user = 'https://bugs.python.org/olebole'

    bugs.python.org fields:

    activity = <Date 2015-04-08.15:15:17.349>
    actor = 'berker.peksag'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-04-08.15:15:17.350>
    closer = 'berker.peksag'
    components = ['Library (Lib)']
    creation = <Date 2015-02-06.09:03:19.349>
    creator = 'olebole'
    dependencies = []
    files = ['38077', '38078', '38174', '38175', '38306', '38375']
    hgrepos = []
    issue_num = 23400
    keywords = ['patch']
    message_count = 19.0
    messages = ['235474', '235529', '235530', '235665', '235667', '236010', '236191', '237083', '237174', '237344', '237346', '237370', '237388', '237400', '237415', '237451', '240268', '240270', '240271']
    nosy_count = 11.0
    nosy_names = ['brett.cannon', 'ncoghlan', 'jnoller', 'python-dev', 'erik.bray', 'sbt', 'eric.snow', 'berker.peksag', 'serhiy.storchaka', 'davin', 'olebole']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue23400'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @olebole
    Copy link
    Mannequin Author

    olebole mannequin commented Feb 6, 2015

    On Debian Hurd, there is no sem_open implementation. When I try there to create a Queue, I get different errors, depending on the Python version:

    import multiprocessing
    q = multiprocessing.Queue()

    gives on Python 2.7.9:

    maxsize = 0
    
        def Queue(maxsize=0):
            '''
            Returns a queue object
            '''
    >       from multiprocessing.queues import Queue

    /usr/lib/python2.7/multiprocessing/init.py:217:


        __all__ = ['Queue', 'SimpleQueue', 'JoinableQueue']
        
        import sys
        import os
        import threading
        import collections
        import time
        import atexit
        import weakref
        
        from Queue import Empty, Full
        import _multiprocessing
        from multiprocessing import Pipe
    >   from multiprocessing.synchronize import Lock, BoundedSemaphore, Semaphore, Condition

    /usr/lib/python2.7/multiprocessing/queues.py:48:


    'Lock', 'RLock', 'Semaphore', 'BoundedSemaphore', 'Condition', 'Event'
    ]
    
    import threading
    import os
    import sys
    
    from time import time as \_time, sleep as \_sleep
    
    import \_multiprocessing
    from multiprocessing.process import current_process
    from multiprocessing.util import Finalize, register_after_fork, debug
    from multiprocessing.forking import assert_spawning, Popen
    
    \# Try to import the mp.synchronize module cleanly, if it fails
    \# raise ImportError for platforms lacking a working sem_open implementation.
    \# See bpo-3770
    try:
    from \_multiprocessing import SemLock
    except (ImportError):
    raise ImportError("This platform lacks a functioning sem_open" +
                      " implementation, therefore, the required" +
                      " synchronization primitives needed will not" +
    
                    " function, see bpo-3770.")
    

    E ImportError: This platform lacks a functioning sem_open implementation, therefore, the required synchronization primitives needed will not function, see bpo-3770.

    /usr/lib/python2.7/multiprocessing/synchronize.py:59: ImportError

    and on 3.4.2:

    self = <multiprocessing.context.DefaultContext object at 0x1449b0c>, maxsize = 0

        def Queue(self, maxsize=0):
            '''Returns a queue object'''
            from .queues import Queue
    >       return Queue(maxsize, ctx=self.get_context())

    /usr/lib/python3.4/multiprocessing/context.py:101:


    self = <multiprocessing.queues.Queue object at 0x511e8cc>, maxsize = 0

        def __init__(self, maxsize=0, *, ctx):
            if maxsize <= 0:
    >           maxsize = _multiprocessing.SemLock.SEM_VALUE_MAX
    E           AttributeError: 'module' object has no attribute 'SemLock'

    /usr/lib/python3.4/multiprocessing/queues.py:38: AttributeError

    Both traces are actually copied from Debian build logs:
    2.7.9: https://buildd.debian.org/status/fetch.php?pkg=python-astropy&arch=hurd-i386&ver=1.0\~rc1-1&stamp=1422724845
    3.4.2: https://buildd.debian.org/status/fetch.php?pkg=python-astropy&arch=hurd-i386&ver=1.0\~rc1-2&stamp=1423153281

    Neither for 2.7.9 nor for 3.4.2 this behaviour is documented. Also, I would expect to have a NotImplementedError and not an ImportError or an AttributeError in such a case.

    Please:

    • document the behaviour of multiprocessing.Queue() in the case of an absent sem_open for both Python2 and Python3

    • Consider raising NotImplementedError or being consistent in some other way.

    @olebole olebole mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Feb 6, 2015
    @applio
    Copy link
    Member

    applio commented Feb 7, 2015

    Having installed a fresh copy of Debian Hurd into a VM, I am able to reproduce the described issue using this 2-line snippet of code:

        import multiprocessing
        q = multiprocessing.Queue()

    It was possible to reproduce the issue both using the builds of 2.7.9 and 3.4.2 that came with Debian Hurd and with clean builds of each from the Python source code for those releases.

    Although already described in the original issue report, I'm taking the venerable astropy package out of the picture with this bare-bones provocation of the issue here. So running the above code snippet with 2.7.9 produces an ImportError and with 3.4.2 produces an AttributeError (details of error message are as previously described).

    I agree that the behavior should be consistent between 2.7 and 3.x -- I'm leaning towards the consistent use of ImportError because that is what's used when threading support isn't available on a platform (i.e. 'import thread' or 'import threading' fails with an ImportError there).

    As to where this should be documented in the documentation itself so that readers have a hope of finding it... that requires a bit more thought but I agree that it should be noted in the documentation. Thinking in pragmatic terms, most folks will probably not read the docs and instead notice it at the time of the failing import so the error message should be clear and consistent there.

    It does not look like there is a working implementation of sem_open() on Debian Hurd yet -- is that correct? Otherwise there's another more attractive potential fix for this.

    @applio
    Copy link
    Member

    applio commented Feb 7, 2015

    Apologies -- it was already pointed out that there is no sem_open implementation on Hurd.

    @applio
    Copy link
    Member

    applio commented Feb 10, 2015

    Attached are proposed patches for default (3.5), 3.4, and 2.7 branches. (The patch for 3.4 is identical to that for 3.5 so there are only two files in total being attached.)

    Regarding the exception being raised:

    • An ImportError is now consistently being raised across Python versions. This is believed to be a more consistent behavior than using NotImplementedError but it is debatable whether all such situations might be switched to instead use NotImplementedError in the future.
    • Implementation is done via a try-except around the problematic attempt to use _multiprocessing.SemLock inside the multiprocessing.queue module; it is believed this change has minimal risk.
    • The text of the ImportError message differs only slightly from that used in the exception highlighted in bpo-3770. In 2.7, the code raising this exception is encountered in both code execution paths (as described in bpo-3770 and as described here) but in 3.x various changes have broken this shared execution path. After staring at it for quite some time, I believe a less-than-simple refactoring is required to get both execution paths to encounter the same exception -- I'm punting on that for now. Ultimately, I believe the goal was not to leave the error message inspired by bpo-3770 but instead to clean it up and eliminate it as part of further code improvement efforts. I believe doing so is well beyond the scope of this issue but is still something that deserves addressing.

    Regarding the documentation of the exception:

    • A note has been added to the "Reference" section's "Pipes and Queues" subsection, describing the potential for and reasons behind an ImportError being encountered.
    • The note that previously appeared in the "Introduction" section's "Synchronization between processes" subsection (introduced per bpo-3770) has been relocated to similarly appear in the "Reference" section's "Synchronization primitives" subsection. Such a note appearing in an introduction section serves as a distraction to the reader who is hopefully learning about key concepts; it is better located in the formal documentation around the synchronization primitives where caveats and details should be conveyed. Making this change (relocation) keeps the two explanations of the ImportError arising from different places in the code better in parallel / symmetry with one another.

    Running through the complete battery of tests on Debian Hurd was not possible as the sequence of tests could never complete (on the default/3.5 branch at least) -- various tests were observed to hang and never complete. While Debian Hurd may not be a mainstream supported platform for Python, given its lack of important functionality such as a working sem_open implementation, it is still rather interesting as a testing platform to see how things behave when the underlying system is unable to provide numerous chunks of key functionality to Python.

    @applio
    Copy link
    Member

    applio commented Feb 10, 2015

    To be clear, the changes to 2.7 are exclusively in the documentation. Changes to 3.4 and default (3.5) are in both documentation and code.

    @berkerpeksag
    Copy link
    Member

    Thanks for the patch, Davin. I left a couple of comments on Rietveld.

    @applio
    Copy link
    Member

    applio commented Feb 18, 2015

    Attaching revised patches with improved phrasing in the changes to the docs.

    Thanks goes to Berker for reviewing and providing helpful feedback, especially paying close attention to detail in the docs.

    @applio
    Copy link
    Member

    applio commented Mar 2, 2015

    Attaching a further revised patch for 3.4 and 3.5/default, now with cleaned-up code style per the discussion in the review.

    No changes are prompted for the patch to 2.7 as it only contained changes to the docs and nothing code-related.

    Thanks again goes to Berker for the guidance.

    @berkerpeksag
    Copy link
    Member

    LGTM

    @serhiy-storchaka
    Copy link
    Member

    I'm not sure that this issue is worth to fix. AttributeError is not worse than ImportError, and users of 3.x expect AttributeError (if expect).

    @serhiy-storchaka
    Copy link
    Member

    But if this change is worth, I would write it as

    •        maxsize = \_multiprocessing.SemLock.SEM_VALUE_MAX
      

    + # Can raise ImportError (see issues bpo-3770 and bpo-23400)
    + from .synchronize import SEM_VALUE_MAX as maxsize

    This would avoid a duplication of the error message.

    @applio
    Copy link
    Member

    applio commented Mar 6, 2015

    Please keep in mind that this issue should only be encountered by people using Python 3.x on a platform like Hurd (an unsupported platform) that has no working sem_open implementation. If we try to imagine what Python 3.x users should expect, it's that they should see consistent behavior when importing or trying to use various pieces of multiprocessing on such an incomplete system. Or we should imagine Python 2.x users attempting to adopt Python 3.x and choosing to do so on such an incomplete system -- again, consistency is to be expected or at least desperately wished for. It is reasonable to believe that it is the exception that people like Ole have even seen this behavior and their reaction is that of a request for consistency. This is worth fixing.

    Adding the "from .synchronize import SEM_VALUE_MAX as maxsize" as was suggested would indeed trigger the ImportError but it would also trigger this ImportError immediately upon simply doing an "import multiprocessing" which is not the current behavior on systems like Hurd. This would effectively cut off those Python 3.x users from other functionality in multiprocessing which does not depend upon a working sem_open implementation.

    It'd make for a nice, tidy solution, that "from .synchronize ..." -- but I wasn't prepared to alter other expected behaviors. Hence the earlier comment about, "I believe a less-than-simple refactoring is required to get both execution paths to encounter the same exception".

    If we don't do this for Hurd users (they need love too), we should do it for users like Ole who sometimes find themselves working on a cool package like astropy on a system like Hurd.

    @serhiy-storchaka
    Copy link
    Member

    Please keep in mind that this issue should only be encountered by people
    using Python 3.x on a platform like Hurd (an unsupported platform) that has
    no working sem_open implementation.

    Yes, and I compiled Python with disabled sem_open for testing.

    Python not always specifies what exact type of exception is raised. When you
    pass wrong type of argument to a function, it can raise TypeError or
    AttributeError. If you request unsupported feature, it can raise ImportError,
    AttributeError, or even NameError. Python and C implementations can raise
    different errors (C implementation usually more strict). Not all such cases are
    considered as bugs.

    Adding the "from .synchronize import SEM_VALUE_MAX as maxsize" as was
    suggested would indeed trigger the ImportError but it would also trigger
    this ImportError immediately upon simply doing an "import multiprocessing"
    which is not the current behavior on systems like Hurd.

    No, I suggested to replace only the line that produces AttributeError now.

    @applio
    Copy link
    Member

    applio commented Mar 6, 2015

    No, I suggested to replace only the line that produces AttributeError now.

    Ah! Sorry, I misunderstood and incorrectly assumed you were imagining the import to happen at the top of the module.

    I must confess I am hesitant about the idea of putting an import inside the Queue.__init__ (or any method) because I generally don't consider it a best practice -- that said, there are a number of places within (not only) the multiprocessing module where imports are performed dynamically as part of implementations for various methods, but it bothers me when I see those as well. Yet your solution is simple and offers the benefit of not having the same (or nearly the same) text appear twice in the code.

    Berker: Do you have any strong feeling on this idea? If not, I am tempted to bend on my instinctive reaction here and go with Serhiy's style.

    @berkerpeksag
    Copy link
    Member

    If not, I am tempted to bend on my instinctive reaction here and go with Serhiy's style.

    +1 to Serhiy's suggestion.

    @applio
    Copy link
    Member

    applio commented Mar 7, 2015

    Attaching updated single patch for both default/3.5 and 3.4 to use Serhiy's insightful simplification.

    This updated patch has been tested on current Debian Hurd (see earlier comments about full battery of tests there) and OS X 10.10 (full, verbose tests).

    Note that the patch for 2.7 does not need a similar update as it contains purely documentation and no code changes.

    Thanks goes to Serhiy and Berker for their thoughtful reviews and follow-ups.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 8, 2015

    New changeset 749fd043de95 by Berker Peksag in branch '3.4':
    Issue bpo-23400: Raise same exception on both Python 2 and 3 if sem_open is not available.
    https://hg.python.org/cpython/rev/749fd043de95

    New changeset a49737bd6086 by Berker Peksag in branch 'default':
    Issue bpo-23400: Raise same exception on both Python 2 and 3 if sem_open is not available.
    https://hg.python.org/cpython/rev/a49737bd6086

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 8, 2015

    New changeset c2f6b3677630 by Berker Peksag in branch '2.7':
    Issue bpo-23400: Add notes about the sem_open support of the host OS to
    https://hg.python.org/cpython/rev/c2f6b3677630

    @berkerpeksag
    Copy link
    Member

    Fixed. Thank you all (and sorry for my late commit, Davin).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants