classification
Title: Converting ipv6 address to string representation using getnameinfo() is wrong.
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: gregory.p.smith, neologix, pitrou, socketpair, yselivanov
Priority: normal Keywords: patch

Created on 2017-12-05 09:54 by socketpair, last changed 2018-02-12 20:12 by yselivanov. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 4724 merged socketpair, 2017-12-05 09:55
PR 5394 merged yselivanov, 2018-01-28 20:20
PR 5449 merged socketpair, 2018-01-30 15:04
PR 5641 merged miss-islington, 2018-02-12 19:48
Messages (19)
msg307651 - (view) Author: Марк Коренберг (socketpair) * Date: 2017-12-05 09:54
https://github.com/python/cpython/pull/4724


`recvfrom` from multicast socket is painfull slow. In fact, it returns sender address in form:

`('fe80::941f:f6ff:fe04:c560%qwe', 42133, 0, 78)`
which is superfluous, since interface-name part (`%qwe`) is not actually used. Actually, scopeid (`78`) signify interface/scope/zone_id. This tuple can be used for `.sendto()` either with this interface-name-part or without.

The problem is in the performance. For each `recvrfom()`, `getnameinfo()` internally converts interface index to interface name using three syscalls, i.e. `socket(), getsockopt()?, close()` , which slows down receiving (I have not measured result, but see additional syscalls in `strace`).

In order to convert from tuple to string-based full address one may use `getnameinfo()`:
As you can see, initial interface is ignored (but without my patch it is also validated uselessly):
```
In[1]: socket.getnameinfo(('fe80::941f:f6ff:fe04:c560%qwe', 42133, 0, 78), socket.NI_NUMERICHOST)
Out[1]: ('fe80::941f:f6ff:fe04:c560%qwe', '42133')
In[2]: socket.getnameinfo(('fe80::941f:f6ff:fe04:c560', 42133, 0, 78), socket.NI_NUMERICHOST)
Out[2]: ('fe80::941f:f6ff:fe04:c560%qwe', '42133')
```
msg308317 - (view) Author: Марк Коренберг (socketpair) * Date: 2017-12-14 16:42
Please look at my PR
msg308554 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-12-18 13:16
> `recvfrom` from multicast socket is painfull slow.

What do you mean? Can you give results for a benchmark of your choice?  How much does your PR speed it up?
msg308725 - (view) Author: Марк Коренберг (socketpair) * Date: 2017-12-20 11:34
Original (not patched) python:

```
In [1]s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
In [2]: s.bind(('ff02::1de:c0db', 1234, 0, 2))
In [3]: timeit s.getsockname()
The slowest run took 12.06 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.8 µs per loop

In [5]: d = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
In [6]: d.bind(('::', 1235, 0, 0))
In [7]: timeit d.getsockname()
The slowest run took 23.18 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 532 ns per loop
```

On patched version, times for both cases should be the same exactly (i.e. 532 ns)
msg308727 - (view) Author: Марк Коренберг (socketpair) * Date: 2017-12-20 11:44
So it may affect applications, that work with UDP and .recvfrom()
msg308729 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-12-20 12:41
Wow.  That's a heavy price to pay indeed.

Unfortunately I'm not competent on IPv6.  I'm cc-ing other core developers so that they may chime in.
msg308821 - (view) Author: Марк Коренберг (socketpair) * Date: 2017-12-20 21:20
After my patch:

$ python3.7 -m timeit --setup 'import socket; s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM); s.bind(("ff02::1de:c0db", 1234, 0, 2))' 's.getsockname()'
500000 loops, best of 5: 613 nsec per loop

$ python3.7 -m timeit --setup 'import socket; s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM); s.bind(("::", 1235, 0, 0))' 's.getsockname()'
500000 loops, best of 5: 420 nsec per loop
msg308825 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-12-20 21:32
I thought the improvements were way bigger than 200nsec, no?
msg308839 - (view) Author: Марк Коренберг (socketpair) * Date: 2017-12-21 02:53
No, Improvement for typical case (not scoped IPv6 addresses) was not the target, but fortunatelly it also sped up.
msg308842 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-12-21 03:12
The PR looks good, but I don't have the capacity right now to merge it. Ill need some time for research to double check if it's ok to drop scope id. Will try to find it in the next few days.
msg309190 - (view) Author: Марк Коренберг (socketpair) * Date: 2017-12-29 19:45
So, PR is ready. Please review.
msg309191 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-12-29 19:52
> So, PR is ready. Please review.

I understand, however I still don't have time for it at the moment. If another core dev wants to take initiative, please by all means do.

I can/will take another look at this after January 15.
msg310894 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-27 22:20
New changeset 47c0b1f7d4115e6f15e6776c1f91d28e7d96fe0c by Yury Selivanov (Коренберг Марк) in branch 'master':
bpo-32221: makeipaddr(): remove interface part + speedup (GH-4724)
https://github.com/python/cpython/commit/47c0b1f7d4115e6f15e6776c1f91d28e7d96fe0c
msg310895 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-27 22:21
Merged.  Thanks, Mark!
msg310987 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-28 20:18
Mark,

Two tests fail on MacOS X now (newly added tests).  I'm reverting the change for now.  If you figure this out, please re-submit a PR for a quick approval.


======================================================================
ERROR: test_getaddrinfo_ipv6_scopeid (test.test_socket.GeneralModuleTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/yury/dev/pydev/cpython/Lib/test/test_socket.py", line 1612, in test_getaddrinfo_ipv6_scopeid
    ifindex = socket.if_nametoindex(test_interface)
OSError: no interface with this name

======================================================================
ERROR: test_getnameinfo_ipv6_scopeid (test.test_socket.GeneralModuleTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/yury/dev/pydev/cpython/Lib/test/test_socket.py", line 1636, in test_getnameinfo_ipv6_scopeid
    nameinfo = socket.getnameinfo(sockaddr, socket.NI_NUMERICHOST | socket.NI_NUMERICSERV)
socket.gaierror: [Errno 4] Non-recoverable failure in name resolution

----------------------------------------------------------------------
msg310993 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-28 21:08
New changeset 0ceb717689b04c0540d78c1ba93c0572c66c0994 by Yury Selivanov in branch 'master':
Revert "bpo-32221: makeipaddr(): remove interface part + speedup (GH-4724)" (#5394)
https://github.com/python/cpython/commit/0ceb717689b04c0540d78c1ba93c0572c66c0994
msg311006 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-28 21:45
Mark, I think this still land in beta-2.  Please try to figure out what's going on on Mac OS (I have no time to take care of this myself).
msg312071 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-02-12 19:47
New changeset 7766b96ab80b04509bbac708ee5ecf3c1c5934fc by Yury Selivanov (Коренберг Марк) in branch 'master':
bpo-32221: makeipaddr(): remove interface part + speedup (GH-5449) (#5449)
https://github.com/python/cpython/commit/7766b96ab80b04509bbac708ee5ecf3c1c5934fc
msg312074 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-02-12 20:12
New changeset 0442599961f966a3dc7f3fe6a3c0d5765fcf2082 by Yury Selivanov (Miss Islington (bot)) in branch '3.7':
bpo-32221: makeipaddr(): remove interface part + speedup (GH-5449) (GH-5449) (#5641)
https://github.com/python/cpython/commit/0442599961f966a3dc7f3fe6a3c0d5765fcf2082
History
Date User Action Args
2018-02-12 20:12:56yselivanovsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2018-02-12 20:12:26yselivanovsetmessages: + msg312074
2018-02-12 19:48:07miss-islingtonsetpull_requests: + pull_request5441
2018-02-12 19:47:50yselivanovsetmessages: + msg312071
2018-01-30 15:04:42socketpairsetpull_requests: + pull_request5281
2018-01-28 21:45:10yselivanovsetmessages: + msg311006
2018-01-28 21:08:34yselivanovsetmessages: + msg310993
2018-01-28 20:20:00yselivanovsetstage: patch review
pull_requests: + pull_request5229
2018-01-28 20:18:32yselivanovsetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg310987

stage: resolved -> (no value)
2018-01-27 22:21:37yselivanovsetstatus: open -> closed
resolution: fixed
messages: + msg310895

stage: patch review -> resolved
2018-01-27 22:20:53yselivanovsetmessages: + msg310894
2017-12-29 19:52:35yselivanovsetmessages: + msg309191
2017-12-29 19:45:57socketpairsetmessages: + msg309190
2017-12-21 03:12:36yselivanovsetmessages: + msg308842
2017-12-21 02:53:17socketpairsetmessages: + msg308839
2017-12-20 21:32:07yselivanovsetmessages: + msg308825
2017-12-20 21:20:48socketpairsetmessages: + msg308821
2017-12-20 12:41:00pitrousetnosy: + gregory.p.smith, neologix, yselivanov
messages: + msg308729
2017-12-20 11:44:11socketpairsetmessages: + msg308727
2017-12-20 11:34:47socketpairsetmessages: + msg308725
2017-12-18 13:16:15pitrousetnosy: + pitrou
messages: + msg308554
2017-12-18 13:15:27pitrousetversions: - Python 3.6, Python 3.8
2017-12-14 16:42:03socketpairsetmessages: + msg308317
2017-12-05 09:55:04socketpairsetkeywords: + patch
stage: patch review
pull_requests: + pull_request4631
2017-12-05 09:54:54socketpaircreate