classification
Title: 100% cpu usage when using asyncore with UNIX socket
Type: resource usage Stage: resolved
Components: Library (Lib) Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Alexey.Agapitov, Benjamin.Ash, giampaolo.rodola, haypo, neologix, python-dev, rosslagerwall
Priority: normal Keywords: patch

Created on 2011-07-05 17:25 by Alexey.Agapitov, last changed 2013-02-04 19:32 by Benjamin.Ash. This issue is now closed.

Files
File name Uploaded Description Edit
asyncore_test.py Alexey.Agapitov, 2011-07-05 17:25 asyncore test
asyncore_unix_socket.diff neologix, 2011-07-05 19:18 review
Messages (16)
msg139898 - (view) Author: Алексей Агапитов (Alexey.Agapitov) Date: 2011-07-05 17:25
When using asyncore server with UNIX socket, I got 100% CPU usage.
I run modified code example from asyncore doc page.
This code was tested on two systems:

Ubuntu 10.04 2.6.32-32-generic #62-Ubuntu SMP

with two versions of Python:
Python 3.2 (r32:88445, Mar 29 2011, 08:55:36)

Python 3.2.1rc2 (default, Jul  5 2011, 20:33:19) Built from sources

and
Gentoo 2.6.36-hardened-r9 #6 SMP

with Python 3.1.3 (r313:86834, Mar 12 2011, 20:06:24)

I'm not sure, maybe it's because of the characteristics of UNIX socket?
msg139899 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-07-05 19:18
It's looping in Lib/asyncore.py:poll

select(4, [3], [3], [3], {30, 0})       = 1 (out [3], left {29, 999994})
select(4, [3], [3], [3], {30, 0})       = 1 (out [3], left {29, 999994})
select(4, [3], [3], [3], {30, 0})       = 1 (out [3], left {29, 999994})

loop sets the Unix domain socket in the writable set, and contrarily to AF_INET/AF_INET6 sockets, bound AF_UNIX SOCK_STREAM sockets are reported as writable before any client connects to them, which triggers the loop.

I've attached a patch which just doesn't add the socket to the writable set if it's in the accepting state.
It fixes the loop, and doesn't seem to cause any regression in test_asyncore, but since it's the first time I'm looking at asyncore's code, I might very well have missed something :-)
msg139900 - (view) Author: Ross Lagerwall (rosslagerwall) (Python committer) Date: 2011-07-05 19:31
Looks good, the patch seems to fix the problem.
This section of code indicates that the accepting socket shouldn't be in the write set...

def handle_write_event(self):
    if self.accepting:
        # Accepting sockets shouldn't get a write event.
        # We will pretend it didn't happen.
        return
msg140016 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-07-08 07:10
> Looks good, the patch seems to fix the problem.

Alright.
Giampaolo, do you agree?
Also, I noticed that test_asyncore currently only tests AF_INET sockets.
I'm working a patch to add AF_UNIX (and maybe AF_INET6) tests.
msg140018 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2011-07-08 07:26
Sorry for delay. Yes, patch looks ok to me and some tests for AF_UNIX sockets would definitively be welcome.
msg140345 - (view) Author: Roundup Robot (python-dev) Date: 2011-07-14 17:49
New changeset 16bc59d37866 by Charles-François Natali in branch '2.7':
Issue #12502: asyncore: fix polling loop with AF_UNIX sockets.
http://hg.python.org/cpython/rev/16bc59d37866
msg140346 - (view) Author: Roundup Robot (python-dev) Date: 2011-07-14 17:54
New changeset 42ec507815d2 by Charles-François Natali in branch '3.1':
Issue #12502: asyncore: fix polling loop with AF_UNIX sockets.
http://hg.python.org/cpython/rev/42ec507815d2
msg140347 - (view) Author: Roundup Robot (python-dev) Date: 2011-07-14 17:57
New changeset ed90c1c8ee62 by Charles-François Natali in branch '3.2':
Merge - Issue #12502: asyncore: fix polling loop with AF_UNIX sockets.
http://hg.python.org/cpython/rev/ed90c1c8ee62
msg140350 - (view) Author: Roundup Robot (python-dev) Date: 2011-07-14 18:00
New changeset ca077f2672e3 by Charles-François Natali in branch 'default':
Merge - Issue #12502: asyncore: fix polling loop with AF_UNIX sockets.
http://hg.python.org/cpython/rev/ca077f2672e3
msg140390 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-07-15 07:30
Woops, I don't know how, but I removed a neologix's message:

"Patch committed, closing.
Alexey, thanks for reporting this.
(I'll open a separate issue to increase the tests coverage)."
msg181087 - (view) Author: Benjamin Ash (Benjamin.Ash) Date: 2013-02-01 16:35
Hi,

I am still seeing this issue even with the patches applied. I see 100% CPU utilization, and strace shows the process is in same kind of select() loop as msg139899.

Any help would be greatly appreciated.

Thanks,

-ben

strace output:
"""
select(4, [3], [3], [3], {30, 0})       = 1 (out [3], left {29, 999998})
select(4, [3], [3], [3], {30, 0})       = 1 (out [3], left {29, 999998})
select(4, [3], [3], [3], {30, 0})       = 1 (out [3], left {29, 999998})
select(4, [3], [3], [3], {30, 0})       = 1 (out [3], left {29, 999998})
"""
msg181260 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-02-03 11:15
I'm unable to reproduce it.
Are you using the attached script?
Did you apply the patch by hand, or are you using a recent Python version?
msg181340 - (view) Author: Benjamin Ash (Benjamin.Ash) Date: 2013-02-04 16:08
Hi Charles-François,

I am using a recent version of Python-2.7 that does in fact contains this patch http://hg.python.org/cpython/rev/16bc59d37866:
  python-2.7.3-4.fc16.x86_64 (Fedora 16)

The CPU usage spikes after I make the initial client connection to the socket.

I am able to reproduce the issue using the attached script 'asyncore_test.py', and any of the following:
 - simple asyncore.dispatcher_with_send client.
 - simple asyncore.dispatcher client.
 - a direct socket.socket() connection.

Not sure what might be going wrong here. Any help would be most appreciated.

Thanks,

-ben
msg181346 - (view) Author: Benjamin Ash (Benjamin.Ash) Date: 2013-02-04 16:31
After doing a bit more testing, I was able to prevent the problem from occurring in asyncore_test.py with the following patch:

--- /proc/self/fd/11	2013-02-04 11:24:41.298347199 -0500
+++ asyncore_test.py	2013-02-04 11:24:40.393318513 -0500
@@ -19,10 +19,18 @@
         self.bind(sock)
         self.listen(5)
 
-    def handle_accepted(self, sock, addr):
-        print('Incoming connection from %s' % repr(addr))
-        handler = EchoHandler(sock)
+    def handle_accept(self):
+        pair = self.accept()
+        if pair is not None:
+            (sock, addr) = pair
+            print('Incoming connection from %s' % repr(addr))
+            handler = EchoHandler(sock)

Using handle_accept() in my code and remembering to call listen() in my asyncore.dispatcher server's constructor did the trick.

I am not sure if we still have a bug here though, since if the subclass doesn't define  a proper handle_accept() we get into the select() loop and 100% CPU utilization after the initial client connection.
msg181364 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-02-04 18:28
> Using handle_accept() in my code and remembering to call listen() in my
> asyncore.dispatcher server's constructor did the trick.
>
> I am not sure if we still have a bug here though, since if the subclass
> doesn't define  a proper handle_accept() we get into the select() loop and
> 100% CPU utilization after the initial client connection.

No, it's not a bug.
The attached test case was for Python 3: Python 2 doesn't have
handle_accepted(), and since the default implementation of
handle_accept() doesn't nothing, the handler is called in a loop,
because the socket is effectively always ready for accept.
msg181380 - (view) Author: Benjamin Ash (Benjamin.Ash) Date: 2013-02-04 19:32
Ok, thanks for quick followup. I didn't realize that the patch for Python-3, sorry about that.

The issue  I had was due to never calling self.listen() in the constructor of my server (asyncore.dispatcher). If this is not done the CPU spikes to 100%.

Thanks again.
History
Date User Action Args
2013-02-04 19:32:58Benjamin.Ashsetmessages: + msg181380
2013-02-04 18:28:33neologixsetmessages: + msg181364
2013-02-04 16:31:26Benjamin.Ashsetmessages: + msg181346
2013-02-04 16:08:24Benjamin.Ashsetmessages: + msg181340
2013-02-03 11:15:51neologixsetmessages: + msg181260
2013-02-01 16:35:40Benjamin.Ashsetnosy: + Benjamin.Ash
messages: + msg181087
2011-07-15 07:30:55hayposetnosy: + haypo
messages: + msg140390
2011-07-15 07:30:12hayposetmessages: - msg140388
2011-07-15 07:14:44giampaolo.rodolasetversions: - Python 3.1
2011-07-15 07:14:38giampaolo.rodolasetversions: + Python 3.1, Python 2.7, Python 3.3
2011-07-15 06:13:45neologixsetstatus: open -> closed
resolution: fixed
messages: + msg140388

stage: resolved
2011-07-14 18:00:39python-devsetmessages: + msg140350
2011-07-14 17:57:40python-devsetmessages: + msg140347
2011-07-14 17:54:08python-devsetmessages: + msg140346
2011-07-14 17:49:17python-devsetnosy: + python-dev
messages: + msg140345
2011-07-08 07:26:14giampaolo.rodolasetmessages: + msg140018
2011-07-08 07:10:24neologixsetmessages: + msg140016
2011-07-05 19:31:47rosslagerwallsetnosy: + rosslagerwall
messages: + msg139900
2011-07-05 19:18:54neologixsetfiles: + asyncore_unix_socket.diff

nosy: + neologix
messages: + msg139899

keywords: + patch
2011-07-05 17:28:06pitrousetnosy: + giampaolo.rodola
2011-07-05 17:25:37Alexey.Agapitovcreate