_sock_connect_cb can be called twice resulting in InvalidStateError #69779

Closed
thehesiod mannequin opened this issue Nov 10, 2015 · 50 comments
Assignees
Labels
topic-asyncio, type-bug (An unexpected behavior, bug, or error)

Comments

@thehesiod
Mannequin

thehesiod mannequin commented Nov 10, 2015

BPO 25593
Nosy @gvanrossum, @asvetlov, @1st1, @thehesiod
Files
  • asyncio_invalid_state_bt.txt: Backtrace of my InvalidStateError exception
  • Issue25593_repro_client.py: Repro client script (requires server script)
  • Issue25593_repro_server.py: Server repro script
  • test_app.py
  • run_once_testfix_for_Issue25593.patch: Run loop._ready callback loops at start of _run_once
  • Issue25593_fix.patch: Patch submission for using stop flag
  • issue25593_revised.patch
  • issue25593_revised_2.patch
  • issue25593_revised_3.patch
  • issue25593_revised_4.patch
  • issue25593_revised_5.patch
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/gvanrossum'
    closed_at = <Date 2015-11-19.21:35:54.971>
    created_at = <Date 2015-11-10.07:16:34.030>
    labels = ['type-bug', 'expert-asyncio']
    title = '_sock_connect_cb can be called twice resulting in InvalidStateError'
    updated_at = <Date 2016-02-10.22:39:48.267>
    user = 'https://github.com/thehesiod'

    bugs.python.org fields:

    activity = <Date 2016-02-10.22:39:48.267>
    actor = 'vstinner'
    assignee = 'gvanrossum'
    closed = True
    closed_date = <Date 2015-11-19.21:35:54.971>
    closer = 'gvanrossum'
    components = ['asyncio']
    creation = <Date 2015-11-10.07:16:34.030>
    creator = 'thehesiod'
    dependencies = []
    files = ['41015', '41016', '41017', '41018', '41019', '41059', '41080', '41082', '41084', '41086', '41087']
    hgrepos = []
    issue_num = 25593
    keywords = ['patch']
    message_count = 50.0
    messages = ['254433', '254462', '254484', '254488', '254495', '254496', '254497', '254498', '254500', '254501', '254502', '254503', '254504', '254505', '254506', '254507', '254508', '254509', '254510', '254512', '254513', '254514', '254518', '254535', '254536', '254538', '254544', '254551', '254775', '254779', '254781', '254783', '254786', '254910', '254913', '254914', '254921', '254923', '254924', '254925', '254926', '254930', '254933', '254934', '254936', '254937', '259971', '259972', '260027', '260042']
    nosy_count = 6.0
    nosy_names = ['gvanrossum', 'asvetlov', 'python-dev', 'yselivanov', 'thehesiod', 'Justin Mayfield']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue25593'
    versions = ['Python 3.4', 'Python 3.5', 'Python 3.6']

    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Nov 10, 2015

    asyncio.selector_events.BaseSelectorEventLoop._sock_connect_cb is a callback driven by the selector for a socket. In certain situations the selector fires twice, calling this callback twice and resulting in an InvalidStateError when it tries to set the Future's result a second time. The way I triggered this was by having several parallel connections to the same host in a multiprocessing script. I suggest analyzing why this callback can be called twice and figuring out what the correct fix is. I monkey patched it by adding a fut.done() check at the top. If this information is not enough I can try to provide a sample script; it's currently reproducing in a fairly involved multiprocessing script.

    @thehesiod thehesiod mannequin added topic-asyncio type-bug An unexpected behavior, bug, or error labels Nov 10, 2015
    @gvanrossum
    Member

    Please show us how to repro -- there's no way we can figure out how this "impossible" event could happen in your code without understanding your code. Is it possible that multiprocessing forked your event loop or something similarly esoteric?

    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Nov 11, 2015

    Sorry for being obscure before, it was hard to pinpoint. I think I just figured it out! I had code like this in a subprocess:

    def worker():
        while True:
            obj = self.queue.get()
            # do work with obj using asyncio http module
    
    def producer():
        nonlocal self
        obj2 = self.queue.get()
        return obj2
    
    
    workers = []
    for i in range(FILE_OP_WORKERS):
        t = asyncio.ensure_future(worker())
        t.add_done_callback(op_finished)
        workers.append(t)
    
    while True:
        f = loop.run_in_executor(None, producer)
        obj = loop.run_until_complete(f)
    
        t = async_queue.put(obj)
        loop.run_until_complete(t)
    
    loop.run_until_complete(asyncio.wait(workers))

    where self.queue is a multiprocessing.Queue, and async_queue is an asyncio queue. The idea is that I have a process populating a multiprocessing queue, and I want to transfer items to an asyncio queue while letting the workers do their thing.

    Without knowing the underlying behavior, my theory is that when Python blocks on the multiprocessing queue lock, it releases socket events to the async HTTP module's selectors, and then when the async loop gets to the selectors they're released again.

    If I switch the producer to instead use queue.get_nowait and busy-wait with asyncio.sleep, I don't get the error... however this is not ideal as we're busy waiting.
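
    A minimal sketch of that busy-wait workaround, assuming the same multiprocessing queue (self.queue) and asyncio queue (async_queue) as above; names are illustrative:

    import asyncio
    import queue  # queue.Empty is raised by multiprocessing.Queue.get_nowait()

    @asyncio.coroutine
    def producer_poll(mp_queue, async_queue):
        # Poll the multiprocessing queue without blocking the event loop,
        # sleeping briefly between empty reads (busy waiting, hence "not ideal").
        while True:
            try:
                obj = mp_queue.get_nowait()
            except queue.Empty:
                yield from asyncio.sleep(0.05)
            else:
                yield from async_queue.put(obj)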

    Thanks!

    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Nov 11, 2015

    I'm going to close this as I've found a work-around, if I find a better test-case I'll open a new bug.

    @thehesiod thehesiod mannequin closed this as completed Nov 11, 2015
    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Nov 11, 2015

    Actually, I just realized I had fixed it locally by changing the callback to the following:
    def _sock_connect_cb(self, fut, sock, address):
        if fut.cancelled() or fut.done():
            return

    so a fix is still needed, and I also verified this happens with python3.4 as well.

    @thehesiod thehesiod mannequin reopened this Nov 11, 2015
    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Nov 11, 2015

    clarification, adding the fut.done() check, or monkey patching:
    orig_sock_connect_cb = asyncio.selector_events.BaseSelectorEventLoop._sock_connect_cb

    def _sock_connect_cb(self, fut, sock, address):
        if fut.done():
            return
        return orig_sock_connect_cb(self, fut, sock, address)

    @gvanrossum
    Member

    Sorry, the code you posted is still incomprehensible. E.g. I suppose your
    worker doesn't really have

    obj = self.queue.get()
    

    but rather something like

    obj = yield from async_queue.get()
    

    But in the end, even with that hypothesis, I can't explain what you're
    seeing, and I believe there is a bug related to badly mixing
    multiprocessing and asyncio in some code you're not showing, and your
    "fix" just masks the problem. Note that the code you posted doesn't touch
    sockets in any way, while the issue you're seeing is related to sockets.
    So there *must* be more to it.

    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Nov 11, 2015

    self.queue is not an async queue; as I stated above, it's a multiprocessing queue. This code is to multiplex a multiprocessing queue to an async queue.

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 11, 2015

    I believe I'm seeing this bug in a non-threaded and non-forked env.

    System:
    OSX 10.11.1 (15B42)
    Python 3.5.0 (from brew install)

    I'm using aiohttp to create several dozens of HTTP connections to the same server (an async tornado web server). Nothing special is being done around the event loop creation (standard get_event_loop()). However in my system the event loop is frequently stopped, via ioloop.stop(), and restarted via ioloop.run_forever(). I'm not sure this is related to the issue yet, but it's worth mentioning.

    I can't provide simplified test code just yet, but I can reproduce in my env with nearly 100% odds when doing a full system test. Attached is a sample backtrace.

    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Nov 11, 2015

    Perhaps I'm doing something really stupid, but I was able to reproduce the two issues I'm having with the following sample script. If you leave the monkey patch disabled, you get the InvalidStateError; if you enable it, you get the ServerDisconnect errors that I'm currently seeing, which I work around with retries. Ideas?

    import asyncio
    import aiohttp
    import multiprocessing
    import aiohttp.server
    import logging
    import traceback
    
    # Monkey patching
    import asyncio.selector_events
    
    # http://bugs.python.org/issue25593
    if False:
        orig_sock_connect_cb = asyncio.selector_events.BaseSelectorEventLoop._sock_connect_cb
        def _sock_connect_cb(self, fut, sock, address):
            if fut.done(): return
            return orig_sock_connect_cb(self, fut, sock, address)
        asyncio.selector_events.BaseSelectorEventLoop._sock_connect_cb = _sock_connect_cb
    
    
    class HttpRequestHandler(aiohttp.server.ServerHttpProtocol):
        @asyncio.coroutine
        def handle_request(self, message, payload):
            response = aiohttp.Response(self.writer, 200, http_version=message.version)
            response.add_header('Content-Type', 'text/html')
            response.add_header('Content-Length', '18')
            response.send_headers()
            yield from asyncio.sleep(0.5)
            response.write(b'<h1>It Works!</h1>')
            yield from response.write_eof()
    
    
    def process_worker(q):
        loop = asyncio.get_event_loop()
        #loop.set_debug(True)
        connector = aiohttp.TCPConnector(force_close=False, keepalive_timeout=8, use_dns_cache=True)
        session = aiohttp.ClientSession(connector=connector)
        async_queue = asyncio.Queue(100)
    
        @asyncio.coroutine
        def async_worker(session, async_queue):
            while True:
                try:
                    print("blocking on asyncio queue get")
                    url = yield from async_queue.get()
                    print("unblocking on asyncio queue get")
                    print("get aqueue size:", async_queue.qsize())
                    response = yield from session.request('GET', url)
                    try:
                        data = yield from response.read()
                        print(data)
                    finally:
                        yield from response.wait_for_close()
                except:
                    traceback.print_exc()
    
        def producer(q):
            print("blocking on multiprocessing queue get")
            obj2 = q.get()
            print("unblocking on multiprocessing queue get")
            print("get qempty:", q.empty())
            return obj2
    
        def worker_done(f):
            try:
                f.result()
                print("worker exited")
            except:
                traceback.print_exc()
    
        workers = []
        for i in range(100):
            t = asyncio.ensure_future(async_worker(session, async_queue))
            t.add_done_callback(worker_done)
            workers.append(t)
    
        @asyncio.coroutine
        def doit():
            print("start producer")
            obj = yield from loop.run_in_executor(None, producer, q)
            print("finish producer")
    
            print("blocking on asyncio queue put")
            yield from async_queue.put(obj)
            print("unblocking on asyncio queue put")
            print("put aqueue size:", async_queue.qsize())
    
        while True:
            loop.run_until_complete(doit())
    
    
    def server():
        loop = asyncio.get_event_loop()
        #loop.set_debug(True)
    
        f = loop.create_server(lambda: HttpRequestHandler(debug=True, keep_alive=75), '0.0.0.0', '8080')
    
        srv = loop.run_until_complete(f)
        loop.run_forever()
    
    
    if __name__ == '__main__':
        q = multiprocessing.Queue(100)
    
        log_proc = multiprocessing.log_to_stderr()
        log_proc.setLevel(logging.DEBUG)
    
        p = multiprocessing.Process(target=process_worker, args=(q,))
        p.start()
    
        p2 = multiprocessing.Process(target=server)
        p2.start()
    
        while True:
            print("blocking on multiprocessing queue put")
            q.put("http://0.0.0.0:8080")
            print("unblocking on multiprocessing queue put")
    
            print("put qempty:", q.empty())

    @gvanrossum
    Member

    I wonder if the bug is in aiohttp? The code you show is still too complex
    to debug for me.

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 12, 2015

    Attaching simplified test setup. It does take some doing to repro so the local async server is required to make it happen (for me). When I tried just pointing to python.org it would not repro in 100 iterations, but using a local dummy server repros 100% for me.

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 12, 2015

    Attached server side of repro.

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 12, 2015

    This code repros without aiohttp when pitted against the previously attached web server (again on OSX 10.11, mid-2012 MBPr).

    Admittedly this may seem very arbitrary but I have better reasons in my production code for stopping an IOLoop and starting it again (which seems to be important to the reproduction steps).

    import asyncio
    
    loop = asyncio.get_event_loop()
    
    def batch_open():
        for i in range(100):
            c = asyncio.ensure_future(asyncio.open_connection('127.0.0.1', 8080))
            c.add_done_callback(on_resp)
    
    def on_resp(task):
        task.result()
        loop.stop()
    
    loop.call_soon(batch_open)
    while True:
        loop.run_forever()

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 12, 2015

    Just reproduced on Linux, Fedora Core 23.

    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Nov 12, 2015

    Attaching my simplified testcase; also logged an aiohttp bug: aio-libs/aiohttp#633

    @gvanrossum
    Member

    Justin's repro provides a clue: when the event loop is stopped before all
    callbacks have been processed, then when the loop is restarted the I/O
    selector is asked again to do its work, and it will report all the same
    sockets as ready. So the callback will be put into the ready queue again
    (even though it's already there), and the second call will find the
    future already done.

    I'm not sure how this explains Alexander's issue but it's probably
    something similar. We should carefully review the other I/O callbacks too
    -- most of them look like they don't mind being called spuriously, but
    there are a few others (_sock_recv, _sock_sendall, _sock_accept) that look
    like they check for fut.cancelled() and might be susceptible to the same
    bug.
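
    A hedged sketch of that idempotency guard (not the patch that eventually landed): wrap each of the one-shot socket callbacks named above so a spurious second invocation becomes a no-op instead of a second Future.set_result() call. This assumes a 3.4/3.5-era selector event loop where these private methods exist and take the future as their first argument:

    import asyncio.selector_events as selector_events

    def _make_idempotent(name):
        orig = getattr(selector_events.BaseSelectorEventLoop, name)
        def wrapper(self, fut, *args, **kwargs):
            if fut.done():  # already resolved (or cancelled): ignore the spurious call
                return
            return orig(self, fut, *args, **kwargs)
        return wrapper

    for _name in ('_sock_connect_cb', '_sock_recv', '_sock_sendall', '_sock_accept'):
        setattr(selector_events.BaseSelectorEventLoop, _name, _make_idempotent(_name))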

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 12, 2015

    Guido,

    Shouldn't this not be the case for level triggered polling? From looking at selectors it looks like these are always level triggered which means they should only event once.

    @gvanrossum
    Member

    I'm not an expert on this terminology but don't you have that backwards?
    Assume we're using select() for a second. If you ask select() "is this FD
    ready" several times in a row without doing something to the FD it will
    answer yes every time once the FD is ready. IIUC that's what
    level-triggered means, and that's what causes the bug.
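
    A small illustration of that level-triggered behavior using the selectors module directly (a standalone sketch, not the event loop's code): a writable socket keeps being reported on every select() call until something acts on it or unregisters it.

    import selectors
    import socket

    sel = selectors.DefaultSelector()
    a, b = socket.socketpair()
    sel.register(a, selectors.EVENT_WRITE)

    for _ in range(3):
        # The same FD is reported as ready on every poll.
        print(sel.select(timeout=0))

    sel.unregister(a)
    a.close()
    b.close()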

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 12, 2015

    Nevermind, in the case of writability it won't matter either way.

    --

    Looking at tornado's ioloop, they run the ready callbacks before calling poll(), so the callbacks can modify the poll set.

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 12, 2015

    I'm attaching a patch that runs the _ready callbacks at the start of _run_once. The style and implications are wide-ranging, so I leave it to you at this point.

    @gvanrossum
    Member

    Thanks, but I don't like the idea of that patch. It feels like a hack that makes it less likely that the issue occurs, but I don't feel we should rely on the callbacks being called before checking the selector again. There may be other reasons (perhaps a future modification to the code) why we might occasionally check the selector redundantly. IOW I think we should really ensure that all I/O callbacks are properly idempotent.

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 12, 2015

    I don't believe this is a case of non-idempotent callbacks, unless you are referring to Future.set_result(), which by design can't be called twice. The callbacks are given an inconsistent opportunity to modify the poll set because of indeterminacy in the ioloop. That being said, I understand your reluctance given the amount of turmoil this has caused, but I would argue that consistency with tornado is a powerful ally and that a model where any callback using call_soon is guaranteed the opportunity to modify the poll set is a good thing.

    @gvanrossum
    Member

    I thought some more about this. The problem is due to stopping and
    restarting the loop, and that's also something that occurs in
    Alexander's example code, so I retract my accusation of aiohttp (and I
    don't think I need more investigation of his code).

    I recall that I thought a LOT about whether to run callbacks and then
    poll the selector or the other way around. The issue is that in the
    "steady state" it doesn't matter because the two would alternate
    either way; but in edge cases it does matter, as we've seen here.

    I worry about a scenario where a callback does something like this:

    def evil():
        loop.call_soon(evil)
        loop.stop()

    Then the following code would never poll the selector with your fix
    (let's assume there are active FDs):

    evil()
    while True:
        loop.run_forever()

    Using the existing strategy it would still poll the selector.

    Also, several tests fail with your patch -- I have to investigate those.

    All in all I think replacing fut.cancelled() with fut.done() may be
    the way to go.

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 12, 2015

    Interesting.

    I was going to do an analysis of what using _ready.appendleft() for adding selector events would do for that scenario. The idea being to consistently juxtapose existing callbacks, selector events and new callbacks. However I think this just moves the pawn in this ioloop halting problem.

    Is it worth investigating a change to the stop mechanism instead? Instead of raising an exception in the middle of run_once, it could set a flag to be seen by run_forever(). This may avoid this class of problem altogether and ensure run_once stays fairly simple and predictable.

    @gvanrossum
    Member

    Yeah, I've thought about changing the stop() mechanism too. It might
    mean that some callbacks will be executed that are currently skipped
    though, if your proposal is to run all callbacks in self._ready
    (excluding new ones) and then just exit if the stop flag is set. I
    worry about how this would violate expectations. We should be able to
    get away with this, because PEP-3156 is carefully vague about exactly
    how soon the loop will stop: it promises that callbacks scheduled
    *before* stop() was called will run before the loop exits -- but it
    makes no promises either way about callbacks scheduled after stop() is
    called.

    A less intrusive change to stop() would be to somehow mark the point
    in self._ready where stop is called, so stopping at the same point as
    with the old algorithm, except for one difference: if you call stop()
    multiple times, it currently leaves extra "markers" in self._ready,
    which must be consumed by calling run_forever() multiple times. This
    proposal would change the semantics of that case. Again, I don't think
    PEP-3156 prevents us from doing that.

    But I still think those callbacks should be fixed (Alexander's
    original fix, extended to a few other callbacks that use
    fut.cancelled()).

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 12, 2015

    Yes, that's what I was suggesting.

    Looking at tornado, they do the stop check between the callbacks/matured-scheduled work and the events. That approach seems somewhat arbitrary to me at first glance, but tornado is very mature and they usually have good reasons for what they do.

    The notion of always completing a cycle seems more apt to me; i.e. your first design.

    A compelling thought experiment for allowing stop() to be lazy is whether a user could somehow know when stop() was going to run or when it had run. The nature of ioloop programming prevents you from knowing when it will run, and because stop() returns no handle/future/task a user can't actually know when it did run. I.e. there is no way to await/add_done_callback on it, so barring hacks that bookend a stop() with marker callbacks it should be, as you said, sufficiently vague to justify a (more) lazy effect.

    --

    I more or less agree on the s/cancelled/done/ changes. I'm using a similar monkey patch in my libraries to dance around this issue right now. I still don't exactly like the idea that code is written with an explicit expectation that it could be pending or cancelled, but then must also be inherently prepared for spurious done callbacks. This seems like a borderline contract violation by add_writer() and co. I suppose that add_writer() is primarily targeted at streams, and the case of an EINTR in a socket connect() is more of a one-shot. Tough call.

    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Nov 12, 2015

    btw want to thank you guys for actively looking into this, I'm very grateful!

    @gvanrossum
    Member

    Thinking about this more I believe it's possible for any of the FD callbacks in selector_events.py to be placed into loop._ready multiple times if the loop is stopped after the FD is ready (and the callback is scheduled) but before the callback is called. In all cases such a scenario results in the same callback (with the same future) being scheduled twice; the first call will call fut.set_result() and then the second call, if the FD is (still, or again) ready, will fail calling fut.set_result() on the same Future.

    The reason we've only seen reports of this for _sock_connect_cb() is probably that the other calls are all uncommon -- you have to explicitly call loop.sock_accept(), loop.sock_recv(), or loop.sock_sendall(), which is not the usual (or recommended) idiom. Instead, most people use Transports and Protocols, which use a different API, and create_server() doesn't use sock_accept(). But create_connection() *does* call sock_connect(), so that's used by everybody's code.
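
    For reference, the low-level idiom that reaches _sock_connect_cb looks roughly like this (a sketch; create_connection() makes the same sock_connect() call internally, with address resolution and protocol wiring on top):

    import asyncio
    import socket

    @asyncio.coroutine
    def low_level_connect(loop, address):
        # Explicit socket + loop.sock_connect(): the call path whose callback
        # (_sock_connect_cb) is at issue; address is a resolved (host, port)
        # tuple, e.g. ('127.0.0.1', 8080).
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setblocking(False)
        yield from loop.sock_connect(sock, address)
        return sock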

    I think the discussed change to stop() -- setting a flag that is only checked after the ready for-loop is done -- might work here: it guarantees that all I/O callbacks get to run before the selector is polled again. However, it requires that an I/O callback that wants to modify the selector in order to prevent itself from being called must do so itself, not schedule some other call that modifies the selector. That's fine for the set of I/O callbacks I've looked at.

    I just don't feel comfortable running the ready queue before polling the selector, since a worst-case scenario could starve the selector completely (as I sketched before -- and the proposed modification to stop() doesn't directly change this).
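
    A rough, self-contained sketch of the stop-flag semantics being discussed (a toy loop, not asyncio's implementation; the real attribute ended up being called _stopping):

    import collections

    class MiniLoop:
        def __init__(self):
            self._stopping = False
            self._ready = collections.deque()

        def call_soon(self, cb, *args):
            self._ready.append((cb, args))

        def stop(self):
            # Just record the request; nothing is interrupted mid-iteration.
            self._stopping = True

        def _run_once(self):
            # Drain only the callbacks that are ready *now*; each of them runs
            # even if an earlier one called stop().
            for _ in range(len(self._ready)):
                cb, args = self._ready.popleft()
                cb(*args)

        def run_forever(self):
            while True:
                self._run_once()
                if self._stopping:  # flag is only checked between iterations
                    self._stopping = False
                    break

    loop = MiniLoop()
    loop.call_soon(print, "runs")
    loop.call_soon(loop.stop)
    loop.call_soon(print, "still runs in the same iteration")
    loop.run_forever()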

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 17, 2015

    +1

    Let me know what I can do to help.

    @gvanrossum
    Member

    @justin: Do you want to come up with a PR for the stop() changes?
    Hopefully including tests (I bet at least one test will fail -- our
    tests in general are pretty constraining).


    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 17, 2015

    You bet.

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 17, 2015

    Attached is the patch submission for the stop-flag proposal. I assume you didn't mean a GitHub PR, since the dev docs seem to indicate that is for read-only usage.

    This passes all the tests on my osx box but it should obviously be run by a lot more folks.

    @gvanrossum
    Member

    I'm going to fix up the patch and apply it so this can make 3.5.1 rc1.

    @gvanrossum
    Member

    Here's a better patch.

    • Renamed _stopped to _stopping.
    • Restore test_utils.run_once() and add a test for it.
    • Change logic so if _stopping is True upon entering run_forever(), it will run once.

    Please try it out!!

    @gvanrossum
    Member

    Here's the file.

    @gvanrossum
    Member

    New patch. Update test_utils.run_once() to use the recommended idiom. On second thought I don't like issuing a warning when stop() is called before the loop runs -- a warning seems overkill for something so minor. But I'm okay with no longer recommending the idiom.
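
    (For context, the run_once() helper mentioned here is presumably the schedule-stop-then-run idiom, which under the flag-based stop() performs exactly one full pass of the loop; a sketch, not necessarily the exact test_utils code:)

    def run_once(loop):
        # Schedule a stop and run the loop: with the new stop() semantics this
        # polls the selector once and runs every callback ready in that pass.
        loop.call_soon(loop.stop)
        loop.run_forever()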

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 19, 2015

    I should have commented more on the run_once removal. The depiction given in its docstring seemed inconsistent with the new way stop works and I found no callers, so it seemed like it was best left out to avoid confusion. No worries though, I didn't get to know that test module very well before messing with it. It just came up in my scan for stop() callers.

    Looks good. I've applied it to a 3.5.0 build and will include it in my testing from now on.

    Thanks Guido.

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 19, 2015

    Ha, email race.

    Regarding rev 2, the updated docstring and scheduled stop look good, and they alleviate the confusion I mentioned.

    I'm not sure about your warning comment; perhaps that's a patch I didn't lay eyes on.

    Cheers.

    @gvanrossum
    Member

    No, I mentioned the idea of a warning in the thread on the
    python-tulip mailing list, but decided not to do it after all.

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Nov 19, 2015

    I see. Seems like good discussion over there. I joined up.

    @gvanrossum
    Member

    OK, here's another revision of the patch, setting the timeout passed to the selector to 0 when the loop is pre-stopped.

    @gvanrossum
    Member

    OK, another revision, keep the mock selector.

    @gvanrossum
    Member

    Whoops. Hopefully this one's right.

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 19, 2015

    New changeset 9b3144716d17 by Guido van Rossum in branch '3.4':
    Issue bpo-25593: Change semantics of EventLoop.stop().
    https://hg.python.org/cpython/rev/9b3144716d17

    New changeset 158cc5701488 by Guido van Rossum in branch '3.5':
    Issue bpo-25593: Change semantics of EventLoop.stop(). (Merge 3.4->3.5)
    https://hg.python.org/cpython/rev/158cc5701488

    New changeset 2ebe03a94f8f by Guido van Rossum in branch 'default':
    Issue bpo-25593: Change semantics of EventLoop.stop(). (Merge 3.5->3.6)
    https://hg.python.org/cpython/rev/2ebe03a94f8f

    @gvanrossum
    Member

    Hopefully this is it!

    @gvanrossum gvanrossum self-assigned this Nov 19, 2015
    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Feb 10, 2016

    I'm not sure if you guys are still listening on this closed bug, but I think I've found another issue ;) I'm using Python 3.5.1 + asyncio 3.4.3 with the latest aiobotocore (which uses aiohttp 0.21.0) and had two sessions (two TCPConnectors), one doing a multitude of GetObjects via HTTP/1.1, and the other doing PutObject, and the PutObject session returns error 61 (connection refused) from the same _sock_connect_cb. It feels like a similar issue to the original. I'll see if I can get a small testcase.

    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Feb 10, 2016

    Update: it's unrelated to the number of sessions or SSL, but instead related to the number of concurrent aiohttp requests. When set to 500 I get the error; when set to 100 I do not.

    @JustinMayfield
    Mannequin

    JustinMayfield mannequin commented Feb 10, 2016

    Alexander,

    That sounds unrelated. I'd treat it as a new issue until you have concrete evidence to the contrary.

    Also, on face value it sounds like it might just be your operating system's open file limit. On OSX I think the default open file limit is in the hundreds (256 on my box). Generally on Unix-like platforms it can be checked and changed with the ulimit -n command.
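
    For example, the same limit can be inspected (and, up to the hard limit, raised) from inside Python; a small sketch:

    import resource

    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("open file limit: soft=%d hard=%d" % (soft, hard))

    # Raise the soft limit toward the hard limit (4096 is an arbitrary target).
    target = 4096 if hard == resource.RLIM_INFINITY else min(4096, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))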

    Cheers

    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Feb 10, 2016

    Sorry for the disruption! Turns out our router seems to be applying some kind of QoS limit on the number of connections :(

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022