Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Converting ipv6 address to string representation using getnameinfo() is wrong. #76402

Closed
socketpair mannequin opened this issue Dec 5, 2017 · 19 comments
Closed

Converting ipv6 address to string representation using getnameinfo() is wrong. #76402

socketpair mannequin opened this issue Dec 5, 2017 · 19 comments
Labels
3.7 (EOL) end of life performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@socketpair
Copy link
Mannequin

socketpair mannequin commented Dec 5, 2017

BPO 32221
Nosy @gpshead, @pitrou, @socketpair, @1st1
PRs
  • bpo-32221: makeipaddr: remove interface part + speedup #4724
  • Revert "bpo-32221: makeipaddr(): remove interface part [GH-4724] #5394
  • bpo-32221: makeipaddr(): remove interface part + speedup (GH-5449) #5449
  • [3.7] bpo-32221: makeipaddr(): remove interface part + speedup (GH-5449) (GH-5449) #5641
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-02-12.20:12:56.838>
    created_at = <Date 2017-12-05.09:54:54.420>
    labels = ['3.7', 'library', 'performance']
    title = 'Converting ipv6 address to string representation using getnameinfo() is wrong.'
    updated_at = <Date 2018-02-12.20:12:56.838>
    user = 'https://github.com/socketpair'

    bugs.python.org fields:

    activity = <Date 2018-02-12.20:12:56.838>
    actor = 'yselivanov'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-02-12.20:12:56.838>
    closer = 'yselivanov'
    components = ['Library (Lib)']
    creation = <Date 2017-12-05.09:54:54.420>
    creator = 'socketpair'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 32221
    keywords = ['patch']
    message_count = 19.0
    messages = ['307651', '308317', '308554', '308725', '308727', '308729', '308821', '308825', '308839', '308842', '309190', '309191', '310894', '310895', '310987', '310993', '311006', '312071', '312074']
    nosy_count = 5.0
    nosy_names = ['gregory.p.smith', 'pitrou', 'neologix', 'socketpair', 'yselivanov']
    pr_nums = ['4724', '5394', '5449', '5641']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue32221'
    versions = ['Python 3.7']

    @socketpair
    Copy link
    Mannequin Author

    socketpair mannequin commented Dec 5, 2017

    #4724

    recvfrom from multicast socket is painfull slow. In fact, it returns sender address in form:

    ('fe80::941f:f6ff:fe04:c560%qwe', 42133, 0, 78)
    which is superfluous, since interface-name part (%qwe) is not actually used. Actually, scopeid (78) signify interface/scope/zone_id. This tuple can be used for .sendto() either with this interface-name-part or without.

    The problem is in the performance. For each recvrfom(), getnameinfo() internally converts interface index to interface name using three syscalls, i.e. socket(), getsockopt()?, close() , which slows down receiving (I have not measured result, but see additional syscalls in strace).

    In order to convert from tuple to string-based full address one may use getnameinfo():
    As you can see, initial interface is ignored (but without my patch it is also validated uselessly):

    In[1]: socket.getnameinfo(('fe80::941f:f6ff:fe04:c560%qwe', 42133, 0, 78), socket.NI_NUMERICHOST)
    Out[1]: ('fe80::941f:f6ff:fe04:c560%qwe', '42133')
    In[2]: socket.getnameinfo(('fe80::941f:f6ff:fe04:c560', 42133, 0, 78), socket.NI_NUMERICHOST)
    Out[2]: ('fe80::941f:f6ff:fe04:c560%qwe', '42133')
    

    @socketpair socketpair mannequin added 3.8 only security fixes 3.7 (EOL) end of life stdlib Python modules in the Lib dir performance Performance or resource usage labels Dec 5, 2017
    @socketpair
    Copy link
    Mannequin Author

    socketpair mannequin commented Dec 14, 2017

    Please look at my PR

    @pitrou pitrou removed the 3.8 only security fixes label Dec 18, 2017
    @pitrou
    Copy link
    Member

    pitrou commented Dec 18, 2017

    recvfrom from multicast socket is painfull slow.

    What do you mean? Can you give results for a benchmark of your choice? How much does your PR speed it up?

    @socketpair
    Copy link
    Mannequin Author

    socketpair mannequin commented Dec 20, 2017

    Original (not patched) python:

    In [1]s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    In [2]: s.bind(('ff02::1de:c0db', 1234, 0, 2))
    In [3]: timeit s.getsockname()
    The slowest run took 12.06 times longer than the fastest. This could mean that an intermediate result is being cached.
    100000 loops, best of 3: 3.8 µs per loop
    
    In [5]: d = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    In [6]: d.bind(('::', 1235, 0, 0))
    In [7]: timeit d.getsockname()
    The slowest run took 23.18 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 532 ns per loop
    

    On patched version, times for both cases should be the same exactly (i.e. 532 ns)

    @socketpair
    Copy link
    Mannequin Author

    socketpair mannequin commented Dec 20, 2017

    So it may affect applications, that work with UDP and .recvfrom()

    @pitrou
    Copy link
    Member

    pitrou commented Dec 20, 2017

    Wow. That's a heavy price to pay indeed.

    Unfortunately I'm not competent on IPv6. I'm cc-ing other core developers so that they may chime in.

    @socketpair
    Copy link
    Mannequin Author

    socketpair mannequin commented Dec 20, 2017

    After my patch:

    $ python3.7 -m timeit --setup 'import socket; s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM); s.bind(("ff02::1de:c0db", 1234, 0, 2))' 's.getsockname()'
    500000 loops, best of 5: 613 nsec per loop
    
    $ python3.7 -m timeit --setup 'import socket; s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM); s.bind(("::", 1235, 0, 0))' 's.getsockname()'
    500000 loops, best of 5: 420 nsec per loop

    @1st1
    Copy link
    Member

    1st1 commented Dec 20, 2017

    I thought the improvements were way bigger than 200nsec, no?

    @socketpair
    Copy link
    Mannequin Author

    socketpair mannequin commented Dec 21, 2017

    No, Improvement for typical case (not scoped IPv6 addresses) was not the target, but fortunatelly it also sped up.

    @1st1
    Copy link
    Member

    1st1 commented Dec 21, 2017

    The PR looks good, but I don't have the capacity right now to merge it. Ill need some time for research to double check if it's ok to drop scope id. Will try to find it in the next few days.

    @socketpair
    Copy link
    Mannequin Author

    socketpair mannequin commented Dec 29, 2017

    So, PR is ready. Please review.

    @1st1
    Copy link
    Member

    1st1 commented Dec 29, 2017

    So, PR is ready. Please review.

    I understand, however I still don't have time for it at the moment. If another core dev wants to take initiative, please by all means do.

    I can/will take another look at this after January 15.

    @1st1
    Copy link
    Member

    1st1 commented Jan 27, 2018

    New changeset 47c0b1f by Yury Selivanov (Коренберг Марк) in branch 'master':
    bpo-32221: makeipaddr(): remove interface part + speedup (GH-4724)
    47c0b1f

    @1st1
    Copy link
    Member

    1st1 commented Jan 27, 2018

    Merged. Thanks, Mark!

    @1st1 1st1 closed this as completed Jan 27, 2018
    @1st1
    Copy link
    Member

    1st1 commented Jan 28, 2018

    Mark,

    Two tests fail on MacOS X now (newly added tests). I'm reverting the change for now. If you figure this out, please re-submit a PR for a quick approval.

    ======================================================================
    ERROR: test_getaddrinfo_ipv6_scopeid (test.test_socket.GeneralModuleTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/Users/yury/dev/pydev/cpython/Lib/test/test_socket.py", line 1612, in test_getaddrinfo_ipv6_scopeid
        ifindex = socket.if_nametoindex(test_interface)
    OSError: no interface with this name

    ======================================================================
    ERROR: test_getnameinfo_ipv6_scopeid (test.test_socket.GeneralModuleTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/Users/yury/dev/pydev/cpython/Lib/test/test_socket.py", line 1636, in test_getnameinfo_ipv6_scopeid
        nameinfo = socket.getnameinfo(sockaddr, socket.NI_NUMERICHOST | socket.NI_NUMERICSERV)
    socket.gaierror: [Errno 4] Non-recoverable failure in name resolution

    @1st1 1st1 reopened this Jan 28, 2018
    @1st1
    Copy link
    Member

    1st1 commented Jan 28, 2018

    New changeset 0ceb717 by Yury Selivanov in branch 'master':
    Revert "bpo-32221: makeipaddr(): remove interface part + speedup (GH-4724)" (bpo-5394)
    0ceb717

    @1st1
    Copy link
    Member

    1st1 commented Jan 28, 2018

    Mark, I think this still land in beta-2. Please try to figure out what's going on on Mac OS (I have no time to take care of this myself).

    @1st1
    Copy link
    Member

    1st1 commented Feb 12, 2018

    New changeset 7766b96 by Yury Selivanov (Коренберг Марк) in branch 'master':
    bpo-32221: makeipaddr(): remove interface part + speedup (GH-5449) (bpo-5449)
    7766b96

    @1st1
    Copy link
    Member

    1st1 commented Feb 12, 2018

    New changeset 0442599 by Yury Selivanov (Miss Islington (bot)) in branch '3.7':
    bpo-32221: makeipaddr(): remove interface part + speedup (GH-5449) (GH-5449) (bpo-5641)
    0442599

    @1st1 1st1 closed this as completed Feb 12, 2018
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life performance Performance or resource usage stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants