Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

socket.create_connection() doesn't handle EINTR properly #64810

Closed
florentx mannequin opened this issue Feb 12, 2014 · 20 comments
Closed

socket.create_connection() doesn't handle EINTR properly #64810

florentx mannequin opened this issue Feb 12, 2014 · 20 comments
Assignees
Labels
stdlib Python modules in the Lib dir topic-IO type-bug An unexpected behavior, bug, or error

Comments

@florentx
Copy link
Mannequin

florentx mannequin commented Feb 12, 2014

BPO 20611
Nosy @gpshead, @pitrou, @vstinner, @florentx, @vadmium
Superseder
  • bpo-23863: Fix EINTR Socket Module issues in 2.7
  • Files
  • httplib_2.6.4_eintr_patch.py: Python 2.6.4 httplib EINTR patch
  • httplib_2.7.3_eintr_patch.py: Python 2.7.3 httplib EINTR patch
  • socketmodule_2.7.6_eintr_patch.c: Python 2.7.6 socketmodule EINTR patch
  • socket_2.7.3_eintr_patch.py: python 2.7.3 socket.py EINTR patch
  • socket_2_7_2_patch.py: python 2.7.2 socket.py EINTR patch
  • issue20611-connect-eintr-gps01.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = <Date 2015-04-07.22:13:49.141>
    created_at = <Date 2014-02-12.15:13:52.026>
    labels = ['type-bug', 'library', 'expert-IO']
    title = "socket.create_connection() doesn't handle EINTR properly"
    updated_at = <Date 2015-04-07.22:13:49.140>
    user = 'https://github.com/florentx'

    bugs.python.org fields:

    activity = <Date 2015-04-07.22:13:49.140>
    actor = 'gregory.p.smith'
    assignee = 'gregory.p.smith'
    closed = True
    closed_date = <Date 2015-04-07.22:13:49.141>
    closer = 'gregory.p.smith'
    components = ['Library (Lib)', 'IO']
    creation = <Date 2014-02-12.15:13:52.026>
    creator = 'flox'
    dependencies = []
    files = ['35296', '35297', '35361', '35406', '35408', '35465']
    hgrepos = []
    issue_num = 20611
    keywords = ['patch']
    message_count = 20.0
    messages = ['211095', '211128', '218831', '218833', '218836', '218841', '219121', '219122', '219337', '219340', '219343', '219344', '219371', '219375', '219376', '219668', '223890', '235544', '240194', '240232']
    nosy_count = 8.0
    nosy_names = ['gregory.p.smith', 'pitrou', 'vstinner', 'flox', 'tholzer', 'neologix', 'martin.panter', 'meishao']
    pr_nums = []
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'test needed'
    status = 'closed'
    superseder = '23863'
    type = 'behavior'
    url = 'https://bugs.python.org/issue20611'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @florentx
    Copy link
    Mannequin Author

    florentx mannequin commented Feb 12, 2014

    I had this sporadic traceback in a project:

    File "test.py", line 62, in <module>
    result = do_lqs(client, str(dnvn))
    File "test.py", line 25, in do_lqs
    qualif_service_id = client.create('ti.qualif.service', {})
    File "/srv/openerp/.buildout/eggs/ERPpeek-1.4.5-py2.6.egg/erppeek.py", line 894, in wrapper
    return self.execute(obj, method, *params, **kwargs)
    File "/srv/openerp/.buildout/eggs/ERPpeek-1.4.5-py2.6.egg/erppeek.py", line 636, in execute
    res = self._execute(obj, method, *params)
    File "/srv/openerp/.buildout/eggs/ERPpeek-1.4.5-py2.6.egg/erppeek.py", line 361, in <lambda>
    wrapper = lambda s, *args: s._dispatch(name, args)
    File "/usr/lib/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
    File "/usr/lib/python2.6/xmlrpclib.py", line 1235, in request
    self.send_content(h, request_body)
    File "/usr/lib/python2.6/xmlrpclib.py", line 1349, in send_content
    connection.endheaders()
    File "/usr/lib/python2.6/httplib.py", line 908, in endheaders
    self._send_output()
    File "/usr/lib/python2.6/httplib.py", line 780, in _send_output
    self.send(msg)
    File "/usr/lib/python2.6/httplib.py", line 739, in send
    self.connect()
    File "/usr/lib/python2.6/httplib.py", line 1112, in connect
    sock = socket.create_connection((self.host, self.port), self.timeout)
    File "/usr/lib/python2.6/socket.py", line 561, in create_connection
    raise error, msg
    socket.error: [Errno 4] Interrupted system call

    It seems that the EINTR should be caught by the standard library in all cases: http://bugs.python.org/issue1628205

    But it's not the case for the "socket.create_connection" method (verified in 3.3 and 2.7 source code).

    @florentx florentx mannequin added stdlib Python modules in the Lib dir topic-IO type-bug An unexpected behavior, bug, or error labels Feb 12, 2014
    @neologix
    Copy link
    Mannequin

    neologix mannequin commented Feb 13, 2014

    It seems that the EINTR should be caught by the standard library in all
    cases:
    http://bugs.python.org/issue1628205

    Yes, it should.

    But it's not the case for the "socket.create_connection" method (verified
    in 3.3 and 2.7 source code).

    It's not the case for *almost all* syscalls.
    See http://bugs.python.org/issue18885

    @tholzer
    Copy link
    Mannequin

    tholzer mannequin commented May 20, 2014

    We encountered the same problem, this is in the context of using PyQt (specifically QProcess) or twisted. They both rely on SIGCHLD for their notification framework.

    I've attached a httplib EINTR patch for 2.6.4 & 2.7.3.

    @tholzer
    Copy link
    Mannequin

    tholzer mannequin commented May 20, 2014

    Here is a reproducible test case:

    import threading
    import signal
    import os
    import httplib
    
    def killer():
        while 1:
            os.kill(os.getpid(), signal.SIGINT)
    
    def go():
        signal.signal(signal.SIGINT, lambda x,y: None)
        thread = threading.Thread(target=killer)
        thread.start()
        while 1:
            connection = httplib.HTTPConnection("localhost:80")
            connection.connect()
            connection.close()
            
    if __name__ == '__main__':
        go()

    Which gives:

    Traceback (most recent call last):
      File "./repro1.py", line 22, in <module>
        go()
      File "./repro1.py", line 18, in go
        connection.connect()
      File ".../lib/python2.7/httplib.py", line 757, in connect
        self.timeout, self.source_address)
      File ".../lib/python2.7/socket.py", line 571, in create_connection
        raise err
    socket.error:

    @neologix
    Copy link
    Mannequin

    neologix mannequin commented May 20, 2014

    As said in a previous comment, we don't want to have EINTR handling
    code everywhere.
    The right way to do this is to handle it at the syscall level.

    @pitrou
    Copy link
    Member

    pitrou commented May 20, 2014

    @tholzer, to clarify Charles-François's comment: EINTR should be handled in socket.connect() and socket.getaddrinfo() (the two syscalls called by create_connection(), AFAIR).

    @tholzer
    Copy link
    Mannequin

    tholzer mannequin commented May 26, 2014

    No problem, I've attached a patch for socket.py for Python 2.7.3.

    A few notes:

    getaddrinfo (and gethostbyname, etc.) are already immune to this bug, so I've just fixed the connect() call.

    The socket does need to be closed after EINTR, otherwise a EINPROGRESS might get returned on subsequent connect() calls.

    @tholzer
    Copy link
    Mannequin

    tholzer mannequin commented May 26, 2014

    I've also attached a potential patch for the C module Modules/socketmodule.c inside internal_connect().

    A few notes:

    This seems to work both without time-out and with time-out sockets (non-blocking).

    One concern would be a signal storm prolonging the operation beyond the time-out. Should we keep track of the actual time taken in this loop and check it against the 'timeout' parameter ?

    Also, I don't think we can call PyErr_CheckSignals() in this context. Does this need to happen at all ?

    @meishao
    Copy link
    Mannequin

    meishao mannequin commented May 29, 2014

    http://bugs.python.org/issue21602

    The smtplib.py also has the same problem.
    The EINTR cannot be handled properly.

    @meishao
    Copy link
    Mannequin

    meishao mannequin commented May 29, 2014

    @neologix,
    May I attach the patch file of smtplib.py for review?

    @florentx
    Copy link
    Mannequin Author

    florentx mannequin commented May 29, 2014

    @meishao
    Copy link
    Mannequin

    meishao mannequin commented May 29, 2014

    @flox

    Thank you for your comment.
    So we just only modify the socket.py to handle the system level call, is it right?
    Please let me attach the patch file of socket.py for 2.7.2.

    @tholzer
    Copy link
    Mannequin

    tholzer mannequin commented May 29, 2014

    Oops, I missed a break statement at the end of socket_2.7.3_eintr_patch.py. I've fixed this now in the attached patch.

    @meishao

    Could you please also update your socket_2_7_2_patch.py and add the missing break statement ?

    @meishao
    Copy link
    Mannequin

    meishao mannequin commented May 30, 2014

    @tholzer

    I've updated socket_2_7_2_patch.py and added the missing break statement.

    @tholzer
    Copy link
    Mannequin

    tholzer mannequin commented May 30, 2014

    And a test case for smtplib:

    import threading
    import signal
    import os
    import smtplib
    
    def go():
        running = True
        pid = os.getpid()
    
        def killer():
            while running: 
                os.kill(pid, signal.SIGINT)
    
        signal.signal(signal.SIGINT, lambda x,y: None)
        thread = threading.Thread(target=killer)
        thread.start()
    
        while 1:
            try:
                smtplib.SMTP('localhost')
            except Exception, ex:
                running = False
                raise
            
    if __name__ == '__main__':
        go()

    Fails with:

    socket.error: [Errno 4] Interrupted system call

    @gpshead
    Copy link
    Member

    gpshead commented Jun 3, 2014

    Something like the patch i'm attaching to socketmodule.c is what I would prefer. I haven't looked at or tried tests for it yet.

    @gpshead gpshead self-assigned this Jun 3, 2014
    @vstinner
    Copy link
    Member

    The issue is just an example of the main issue bpo-18885 which proposes to retry interrupted syscalls. I hesitate to mark it as duplicate, but it contains an interesting patch.

    @vadmium
    Copy link
    Member

    vadmium commented Feb 8, 2015

    See also PEP-475 and bpo-23285 for the general fix in Python 3

    @vstinner
    Copy link
    Member

    vstinner commented Apr 6, 2015

    This issue was fixed in Python 3.5 as part of the PEP-475: see issue bpo-23618 which modified socket.socket.connect() to handle EINTR. It's not as simple as retrying connect() in a loop. You must respect the socket timeout (it's better to have a monotonic clock here, it's now always the case in Python 3.5). When connect() returns EINTR, the connection runs asynchronously, you have to call select() to wait until the connection completes or fails. Then you have to call getsockopt() to get the socket error code.

    In Python 3.5, socket.socket.connect() still raises InterruptedError if the socket is non-blocking:
    https://docs.python.org/dev/library/socket.html#socket.socket.connect

    bpo-20611-connect-eintr-gps01.diff calls again connect() if connect() failed with EINTR. According to the issue bpo-23618, it might work on some platforms, but it's not portable.

    For Python 2.7 and 3.4, instead of fixing socket.socket.connect(), which requires complex code, we may only workaround the issue in create_connection(). If connect() raises OSError(EINTR), drop the socket and retry with a fresh connection in a loop until the connection completes or raises a different exception. (And do that for each address.)

    @gpshead
    Copy link
    Member

    gpshead commented Apr 7, 2015

    i'm moving this to the more recent issue as i like the patch in that one better.

    @gpshead gpshead closed this as completed Apr 7, 2015
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir topic-IO type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants