asyncio.base_events.create_connection doesn't handle scoped IPv6 addresses #79726

Closed
maxifree mannequin opened this issue Dec 20, 2018 · 29 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@maxifree
Mannequin

maxifree mannequin commented Dec 20, 2018

BPO 35545
Nosy @asvetlov, @1st1, @aixtools, @lepaperwan, @eamanu, @twisteroidambassador, @miss-islington, @maxifree
PRs
  • bpo-35545: Fix asyncio discarding IPv6 scopes  #11271
  • bpo-35545: Fix mishandling of scoped IPv6 addresses #11403
  • [3.7] bpo-35545: Fix asyncio discarding IPv6 scopes (GH-11271) #13379
  • bpo-35545: Skip test_asyncio.test_create_connection_ipv6_scope on AIX #14011
  • [3.8] bpo-35545: Skip test_asyncio.test_create_connection_ipv6_scope on AIX (GH-14011) #14012
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-05-26.10:22:37.451>
    created_at = <Date 2018-12-20.10:56:50.251>
    labels = ['3.8', 'type-bug', '3.7', 'expert-asyncio']
    title = "asyncio.base_events.create_connection doesn't handle scoped IPv6 addresses"
    updated_at = <Date 2020-05-26.10:22:37.450>
    user = 'https://github.com/maxifree'

    bugs.python.org fields:

    activity = <Date 2020-05-26.10:22:37.450>
    actor = 'cheryl.sabella'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-05-26.10:22:37.451>
    closer = 'cheryl.sabella'
    components = ['asyncio']
    creation = <Date 2018-12-20.10:56:50.251>
    creator = 'maxifree'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 35545
    keywords = ['patch']
    message_count = 29.0
    messages = ['332211', '332275', '332287', '332288', '332871', '332962', '332963', '332964', '342694', '342696', '343158', '343167', '343208', '343214', '343351', '343417', '343429', '343430', '343438', '343464', '343884', '343957', '344002', '344003', '344043', '345329', '345331', '347599', '349053']
    nosy_count = 10.0
    nosy_names = ['sascha_silbe', 'asvetlov', 'yselivanov', 'Michael.Felt', 'lepaperwan', 'eamanu', 'twisteroid ambassador', 'miss-islington', 'maxifree', 'Zaar Hai']
    pr_nums = ['11271', '11403', '11403', '11403', '11403', '13379', '14011', '14012']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue35545'
    versions = ['Python 3.7', 'Python 3.8']

    @maxifree
    Mannequin Author

    maxifree mannequin commented Dec 20, 2018

    Since 3.7, loop.create_connection no longer handles scoped IPv6 addresses (RFC 4007) correctly.

    TEST CASE
    # Set up listener on link-local address fe80::1%lo
    sudo ip a add dev lo fe80::1

    # 3.6 handles everything fine
    socat file:/dev/null tcp6-listen:12345,REUSEADDR &
    python3.6 -c 'import asyncio;loop=asyncio.get_event_loop();loop.run_until_complete(loop.create_connection(lambda:asyncio.Protocol(),host="fe80::1%lo",port="12345"))'

    # 3.7 and later fail
    socat file:/dev/null tcp6-listen:12345,REUSEADDR &
    python3.7 -c 'import asyncio;loop=asyncio.get_event_loop();loop.run_until_complete(loop.create_connection(lambda:asyncio.Protocol(),host="fe80::1%lo",port="12345"))'

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/lib/python3.7/asyncio/base_events.py", line 576, in run_until_complete
        return future.result()
      File "/usr/lib/python3.7/asyncio/base_events.py", line 951, in create_connection
        raise exceptions[0]
      File "/usr/lib/python3.7/asyncio/base_events.py", line 938, in create_connection
        await self.sock_connect(sock, address)
      File "/usr/lib/python3.7/asyncio/selector_events.py", line 475, in sock_connect
        return await fut
      File "/usr/lib/python3.7/asyncio/selector_events.py", line 480, in _sock_connect
        sock.connect(address)
    OSError: [Errno 22] Invalid argument

    CAUSE

    During asyncio.base_events.create_connection, _ensure_resolved is called twice. The first call is here:
    https://github.com/python/cpython/blob/3.7/Lib/asyncio/base_events.py#L908
    then here through sock_connect:
    https://github.com/python/cpython/blob/3.7/Lib/asyncio/base_events.py#L946
    https://github.com/python/cpython/blob/3.7/Lib/asyncio/selector_events.py#L458

    _ensure_resolved calls getaddrinfo, whose return value changed in 3.7:

    % python3.6 -c 'import socket;print(socket.getaddrinfo("fe80::1%lo",12345)[0][4])'
    ('fe80::1%lo', 12345, 0, 1)

    % python3.7 -c 'import socket;print(socket.getaddrinfo("fe80::1%lo",12345)[0][4])'
    ('fe80::1', 12345, 0, 1)

    _ensure_resolved only considers the host and port parts of the address tuple:
    https://github.com/python/cpython/blob/3.7/Lib/asyncio/base_events.py#L1272

    In case of 3.7 first call to _ensure_resolved returns
    ('fe80::1', 12345, 0, 1)
    then second call returns
    ('fe80::1', 12345, 0, 0)
    Notice that scope is now completely lost and is set to 0, thus actual call to socket.connect is wrong

    In case of 3.6 both first and second call to _ensure_resolved return
    ('fe80::1%lo', 12345, 0, 1)
    because in 3.6 case scope info is preserved in address and second call can derive correct address tuple
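
The double-resolution problem described above can be reproduced without any network access. This is only a rough sketch: fake_resolve and ensure_resolved_lossy are hypothetical stand-ins (not asyncio functions) that mimic the 3.7 behavior, and the zone is assumed to be numeric (%1 rather than %lo) so that no interface lookup is needed.

```python
def fake_resolve(host, port):
    """Hypothetical stand-in for 3.7's getaddrinfo on a numeric IPv6 host.

    Splits a numeric zone off the host and returns a 4-tuple sockaddr
    (host, port, flowinfo, scope_id).
    """
    addr, _, zone = host.partition('%')
    scope_id = int(zone) if zone else 0
    return (addr, port, 0, scope_id)


def ensure_resolved_lossy(address):
    """Mimic the reported behavior: only host and port are looked at."""
    host, port = address[:2]
    return fake_resolve(host, port)


first = ensure_resolved_lossy(('fe80::1%1', 12345))
second = ensure_resolved_lossy(first)  # the scope id in `first` is ignored
print(first)   # ('fe80::1', 12345, 0, 1)
print(second)  # ('fe80::1', 12345, 0, 0)
```

The second call starts from a host string that no longer carries the zone, so nothing is left to derive the scope id from.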

    @maxifree maxifree mannequin added 3.7 (EOL) end of life 3.8 only security fixes topic-asyncio type-bug An unexpected behavior, bug, or error labels Dec 20, 2018
    @lepaperwan
    Mannequin

    lepaperwan mannequin commented Dec 20, 2018

    While the 3.7+ getaddrinfo isn't the best human representation of an IPv6 address, I believe it does make the most sense to keep it that way.
    In any case, this is a regression and changing return values of getaddrinfo for 3.7 isn't something that should be considered.

    The issue stems from the refactoring of the underlying socketmodule.c handling of IPv4/IPv6 addresses with dedicated make_ipv4_addr and make_ipv6_addr functions, which return proper tuples of:
    str/int for IPv4: https://github.com/python/cpython/blob/master/Modules/socketmodule.c#L1270
    str/int/int/int for IPv6: https://github.com/python/cpython/blob/master/Modules/socketmodule.c#L1325

    The actual issue is that _ensure_resolved naively assumes IPv4 and truncates the address to its first 2 members
    https://github.com/python/cpython/blob/3.7/Lib/asyncio/base_events.py#L1269
    and never redefines them again in the case where they were set.

    I'd suggest passing the remaining elements of address in a packed *args or an optional flowinfo=0, scopeid=0 to _ipaddr_info since fundamentally, that's the method stripping valuable information.
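
A minimal sketch of that suggestion, for illustration only. It is not the merged patch and not asyncio's private _ipaddr_info; the function name ipaddr_info_sketch and its reduced return value are assumptions made for this example. The idea is to accept optional flowinfo and scopeid arguments and carry them into the IPv6 address tuple instead of dropping them.

```python
import socket


def ipaddr_info_sketch(host, port, family=socket.AF_UNSPEC,
                       flowinfo=0, scopeid=0):
    """Return (family, sockaddr) if host is already a numeric IP, else None.

    Hypothetical helper: flowinfo and scopeid are passed through so an
    IPv6 sockaddr keeps the values from an earlier resolution.
    """
    if family == socket.AF_UNSPEC:
        families = [socket.AF_INET, socket.AF_INET6]
    else:
        families = [family]

    for af in families:
        try:
            socket.inet_pton(af, host)  # raises OSError if not numeric
        except OSError:
            continue
        if af == socket.AF_INET6:
            return af, (host, port, flowinfo, scopeid)
        return af, (host, port)
    return None  # not a numeric address; the caller would need getaddrinfo


# Re-checking an already resolved tuple keeps the scope id:
resolved = ('fe80::1', 12345, 0, 1)
host, port, *rest = resolved
print(ipaddr_info_sketch(host, port, socket.AF_INET6, *rest))
```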

    @twisteroidambassador
    Mannequin

    twisteroidambassador mannequin commented Dec 21, 2018

    I think the root cause of this bug is a bit of confusion.

    The "customer-facing" asyncio API, create_connection(), takes two arguments: host and port. The lower-level APIs that actually deal with connecting sockets, socket.connect() and loop.sock_connect(), take one argument: the address tuple. These are *not the same thing*, despite an IPv4 address tuple having two elements (host, port), and must not be mixed.

    _ensure_resolved() is the function responsible for turning host + port into an address tuple, and it does the right thing, turning host="fe80::1%lo",port=12345 into ('fe80::1', 12345, 0, 1) correctly. The mistake is taking the address tuple and passing it through _ensure_resolved() again, since that's not the correct input type for it: the only correct input type is host + port.

    So I think the appropriate fix for this bug is to make sure _ensure_resolved is only called once. In particular, BaseSelectorEventLoop.sock_connect()

    async def sock_connect(self, sock, address):
    should not call _ensure_resolved(). It might be a good idea to add some comments clarifying that sock_connect() takes an address tuple argument, not host + port, and likewise for sock_connect() on each event loop implementation.

    @twisteroidambassador
    Mannequin

    twisteroidambassador mannequin commented Dec 21, 2018

    Also I believe it's a good idea to change the arguments of _ensure_resolved() from (address, *, ...) to (host, port, *, ...), and go through all its usages, making sure we're not mixing host + port with address tuples everywhere in asyncio.

    @eamanu
    Mannequin

    eamanu mannequin commented Jan 2, 2019

    Hi!,

    I was reading the PR. Just a little comment: I am not sure about treating IPv4 and IPv6 differently, in the sense of using a tuple for IPv4 and separate parameters for IPv6.

    Regards

    @twisteroidambassador
    Mannequin

    twisteroidambassador mannequin commented Jan 4, 2019

    Hi Emmanuel,

    Are you referring to my PR 11403? I don't see where IPv6 uses separate parameters.

    @twisteroidambassador
    Mannequin

    twisteroidambassador mannequin commented Jan 4, 2019

    I just noticed that in the socket module, an AF_INET address tuple is allowed to have an unresolved host name. Quote:

    A pair (host, port) is used for the AF_INET address family, where host is a string representing either a hostname in Internet domain notation like 'daring.cwi.nl' or an IPv4 address like '100.50.200.5', and port is an integer.

    https://docs.python.org/3/library/socket.html#socket-families

    Passing a tuple of (hostname, port) to socket.connect() successfully connects the socket (tested on Windows). Since the C struct sockaddr_in does not support hostnames, socket.connect obviously does resolution at some point, but its implementation is in C, and I haven't looked into it.

    BaseSelectorEventLoop.sock_connect() calls socket.connect() directly, therefore it also supports passing in a tuple of (hostname, port). I just tested ProactorEventLoop.sock_connect() on 3.7.1 on Windows, and it does not support hostnames, raising OSError: [WinError 10022] An invalid argument was supplied.

    I personally believe it's not a good idea to allow hostnames in address tuples and in sock.connect(). However, the socket module tries pretty hard to basically accept any (host, port) tuple as address tuples, whether host is an IPv4 address, IPv6 address or host name, so that's probably not going to change.

    @twisteroidambassador
    Mannequin

    twisteroidambassador mannequin commented Jan 4, 2019

    Oh wait, there's also this in asyncio docs for loop.sock_connect:

    Changed in version 3.5.2: address no longer needs to be resolved. sock_connect will try to check if the address is already resolved by calling socket.inet_pton(). If not, loop.getaddrinfo() will be used to resolve the address.

    https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.sock_connect

    So this is where the current bug comes from! My PR 11403 basically undid this change.

    My proposal, as is probably obvious, is to undo this change and insist on passing only resolved address tuples to loop.sock_connect(). My argument is that this feature never worked properly:

    • As mentioned in the previous message, this does not work on ProactorEventLoop.
    • On SelectorEventLoop, the resolution done by loop.sock_connect() is pretty weak anyways: it only takes the first resolved address, unlike loop.create_connection() that actually tries all the resolved addresses until one of them successfully connects.

    Users should use create_connection() or open_connection() if they want to avoid the complexities of address resolution. If they are reaching for low-level APIs like loop.sock_connect(), they should also handle loop.getaddrinfo() themselves.
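
For what it's worth, here is a rough sketch of what "handling loop.getaddrinfo() themselves" could look like. The helper name connect_any is made up for this example, and the loop over results is a simplified take on trying every resolved address, not what create_connection() does internally. The point is that the full address tuple from getaddrinfo (including flowinfo and scope id for IPv6) is passed to sock_connect() untouched.

```python
import asyncio
import socket


async def connect_any(host, port):
    """Resolve once, then try each resolved address tuple in order."""
    loop = asyncio.get_running_loop()
    infos = await loop.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    last_exc = None
    for family, type_, proto, _canonname, sockaddr in infos:
        sock = socket.socket(family, type_, proto)
        sock.setblocking(False)
        try:
            # sockaddr is passed as-is: 2-tuple for IPv4, 4-tuple for IPv6.
            await loop.sock_connect(sock, sockaddr)
            return sock
        except OSError as exc:
            last_exc = exc
            sock.close()
    if last_exc is None:
        last_exc = OSError(f"no addresses found for {host!r}")
    raise last_exc
```

The caller owns the returned socket and is responsible for closing it.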

    @miss-islington
    Contributor

    New changeset ac8eb8f by Miss Islington (bot) (Erwan Le Pape) in branch 'master':
    bpo-35545: Fix asyncio discarding IPv6 scopes (GH-11271)
    ac8eb8f

    @miss-islington
    Contributor

    New changeset 9470404 by Miss Islington (bot) in branch '3.7':
    bpo-35545: Fix asyncio discarding IPv6 scopes (GH-11271)
    9470404

    @aixtools
    Contributor

    from below:

    In case of 3.7 first call to _ensure_resolved returns
    ('fe80::1', 12345, 0, 1)
    then second call returns
    ('fe80::1', 12345, 0, 0)
    Notice that scope is now completely lost and is set to 0, thus actual call to socket.connect is wrong

    In case of 3.6 both first and second call to _ensure_resolved return
    ('fe80::1%lo', 12345, 0, 1)
    because in 3.6 case scope info is preserved in address and second call can derive correct address tuple

    I'll have to locate the PR I made to resolve the test issue AIX was having - but it seems the address format ::1%lo is not supported everywhere. FYI: I do not believe the PR was backported into 3.6.

    ** Found it:
    commit 413118e
    Author: Michael Felt <aixtools@users.noreply.github.com>
    Date: Fri Sep 14 01:35:56 2018 +0200

    Fix test_asyncio for AIX - do not call transport.get_extra_info('sockname') (bpo-8907)
    

    and
    [3.7] bpo-34490: Fix test_asyncio for AIX - do not call transport.get_extra_info('sockname') (GH-8907) bpo-9286

    Since in the first call - a scope of 1 is being returned - the initial "open" seems to be working as expected.

    Some "help" to be sure I do exactly the same tests.

    **** Reading through the bpo text, my change was only to skip the test because
    quote: On AIX with AF_UNIX family sockets getsockname() does not provide 'sockname'

    and, from memory, the information being looked for is the bit after the '%' - aka scope.

    On the one hand - the test is working - the information being returned does not match:

    ======================================================================
    FAIL: test_create_connection_ipv6_scope (test.test_asyncio.test_base_events.BaseEventLoopWithSelectorTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/mock.py", line 1226, in patched
        return func(*args, **keywargs)
      File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/test_asyncio/test_base_events.py", line 1316, in test_create_connection_ipv6_scope
        sock.connect.assert_called_with(('fe80::1', 80, 0, 1))
      File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/mock.py", line 838, in assert_called_with
        raise AssertionError(_error_message()) from cause
    AssertionError: expected call not found.
    Expected: connect(('fe80::1', 80, 0, 1))
    Actual: connect(('fe80::1', 80, 0, 0))

    What is not clear from the test is that what "expected" says, is not the same as the first address in the code:

            coro = self.loop.create_connection(asyncio.Protocol, 'fe80::1%1', 80)
            t, p = self.loop.run_until_complete(coro)
            try:
                sock.connect.assert_called_with(('fe80::1', 80, 0, 1))
                _, kwargs = m_socket.socket.call_args
                self.assertEqual(kwargs['family'], m_socket.AF_INET6)
                self.assertEqual(kwargs['type'], m_socket.SOCK_STREAM)
            finally:
                t.close()
                test_utils.run_briefly(self.loop)  # allow transport to close

    'fe80::1%1' <> 'fe80::1' - and maybe, on AIX - the initial connection failed. (or maybe it has to have succeeded, or the failure message would look different). I am not 'experienced' with IPv6 and scope.

    @aixtools
    Contributor

    On 22/05/2019 10:43, Michael Felt wrote:

    'fe80::1%1' <> 'fe80::1' - ... I am not 'experienced' with IPv6 and scope.

    From what I have just read (again) - scope seems to be a way to indicate
    the interface used (e.g., eth0, or enp0s25) as a "number".

    Further, getsockname() (and getpeername()) seem to be more for after a
    fork(), or perhaps after a pthread_create(). What remains unclear is why
    would I ever care what the scopeid is.  Is it because it is "shiny",
    does it add security (if so, how)?

    And, as this has been added - what breaks in Python when "scopeid" is
    not available?

    I am thinking, if adding a scopeid is a way to assign an IPv6 address to
    an interface - what is to prevent abuse? Why would I even want the same
    link-local IP address on eth0 and eth1 at the same time? Assuming that
    is what it is making possible - the same IPv6/64 address on multiple
    interfaces and use scope ID to be more selective/aware. Is this an
    alternative way to multiplex interfaces - now in the IP layer rather
    than in the LAN layer?

    If I understand why this is needed I may be able to come up with a way
    to "get it working" for the Python model of interfaces - although,
    probably not "fast".

    Regards,

    Michael

    @twisteroidambassador
    Mannequin

    twisteroidambassador mannequin commented May 22, 2019

    AFAIK the reason why scope id is required for IPv6 is that every IPv6
    interface has its own link-local address, and all these addresses are in
    the same subnet, so without an additional scope id there’s no way to tell
    from which interface an address can be reached. IPv4 does not have this
    problem because IPv4 interfaces usually don’t use link-local addresses.
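
To make the link-local point a bit more concrete, a small sketch using the standard ipaddress module. The helper name needs_scope_id is made up for this example; it only checks whether a host string is an IPv6 link-local address (fe80::/10), which is the case where the address alone does not say which interface to use.

```python
import ipaddress


def needs_scope_id(host):
    """Return True if host is an IPv6 link-local address.

    Any '%zone' suffix is stripped first, so the check works whether or
    not the caller already supplied a zone.
    """
    addr = ipaddress.ip_address(host.split('%', 1)[0])
    return addr.version == 6 and addr.is_link_local


for candidate in ('fe80::1%1', 'fe80::1', '2001:db8::1', '192.168.2.1'):
    print(candidate, needs_scope_id(candidate))
```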


    @twisteroidambassador
    Mannequin

    twisteroidambassador mannequin commented May 22, 2019

    Regarding the failing test, it basically boils down to testing whether loop.getaddrinfo('fe80::1%1', 80, type=socket.SOCK_STREAM) returns (<socket.AF_INET6>, <socket.SOCK_STREAM>, *, *, ('fe80::1', 80, 0, 1)). This feels like a dangerous assumption to make, since it's tied to the operating system's behavior. Maybe AIX's getaddrinfo() in fact does not resolve scoped addresses correctly; maybe it only resolves scope ids correctly for real addresses that actually exist on the network; maybe AIX assigns scope ids differently and does not use small integers; etc.
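
One way to avoid baking that assumption into a test would be to probe the platform first. This is just a sketch: getaddrinfo_keeps_scope is a hypothetical helper, not something in the test suite, and the default zone of 1 is an assumption that may not correspond to a real interface everywhere.

```python
import socket


def getaddrinfo_keeps_scope(zone=1):
    """Report whether the platform's getaddrinfo returns the numeric zone
    of a scoped link-local address as the scope id of the sockaddr."""
    try:
        infos = socket.getaddrinfo(
            f"fe80::1%{zone}", 80,
            family=socket.AF_INET6,
            type=socket.SOCK_STREAM,
            flags=socket.AI_NUMERICHOST,
        )
    except OSError:
        return False
    # sockaddr is the last element: (host, port, flowinfo, scope_id)
    return any(info[4][3] == zone for info in infos)


print(getaddrinfo_keeps_scope())
```

A test could then be skipped (for example with unittest.skipUnless) on platforms where this returns False, instead of failing there.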

    @aixtools
    Contributor

    In hindsight, maybe the message could have been better,

    BUT - is it relevant?

    commit 413118e
    Author: Michael Felt <aixtools@users.noreply.github.com>
    Date: Fri Sep 14 01:35:56 2018 +0200

    Fix test_asyncio for AIX - do not call transport.get_extra_info('sockname') (bpo-8907)
    

    FYI:
    I have a server where "netstat -in" (aka ip a) does show an address with a scope component. Not figured out how to query that in C or python yet. (not my favorite thing - messing with socket() :p@me)

    Re: the example below - I would have thought the scopeid would be showing on en1, not en2 - and I am also wondering, if the scopeid is "%1" AIX ignores it. (also, I masked my global ipv6 address).
    Maybe en0 has a scopeid BECAUSE there is a global address (where en1 does not).

    michael@x071:[/home/michael]netstat -ni
    Name  Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs  Coll
    en0   1500  link#2      fa.d1.8c.f7.62.4   3103849     0  1085261     0     0
    en0   1500  192.168.129 192.168.129.71     3103849     0  1085261     0     0
    en0   1500  192.168.90  192.168.90.71      3103849     0  1085261     0     0
    en0   1500  MASK::1:f8d1:8cff:fef7:6204    3103849     0  1085261     0     0
    en0   1500  fe80::f8d1:8cff:fef7:6204%2    3103849     0  1085261     0     0
    en1   1500  link#3      fa.d1.8c.f7.62.5     12704     0     9323     0     0
    en1   1500  192.168.2   192.168.2.1          12704     0     9323     0     0
    en1   1500  fe80::f8d1:8cff:fef7:6205        12704     0     9323     0     0
    lo0   16896 link#1                            3908     0     3908     0     0
    lo0   16896 127         127.0.0.1             3908     0     3908     0     0
    lo0   16896 ::1%1                             3908     0     3908     0     0

    So, I looked at another server with two interfaces - here only one has an IPv6 address

    root@x064:[/home/root]netstat -in
    Name  Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs  Coll
    en0   1500  link#2      0.21.5e.a3.c7.44    119801     0    84874     0     0
    en0   1500  192.168.129 192.168.129.64      119801     0    84874     0     0
    en0   1500  fe80::221:5eff:fea3:c744        119801     0    84874     0     0
    en1   1500  link#3      fa.d1.81.81.ac.5     89362     0    48409     0     0
    en1   1500  192.168.2   192.168.2.64         89362     0    48409     0     0
    lo0   16896 link#1                          139882     0   139881     0     0
    lo0   16896 127         127.0.0.1           139882     0   139881     0     0
    lo0   16896 ::1%1                           139882     0   139881     0     0
    root@x064:[/home/root]

    And, after I activate IPv6 on the second interface - I see a scopeid-like representation:

    Name  Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs  Coll
    en0   1500  link#2      0.21.5e.a3.c7.44    120043     0    85045     0     0
    en0   1500  192.168.129 192.168.129.64      120043     0    85045     0     0
    en0   1500  fe80::221:5eff:fea3:c744        120043     0    85045     0     0
    en1   1500  link#3      fa.d1.81.81.ac.5     89370     0    48420     0     0
    en1   1500  192.168.2   192.168.2.64         89370     0    48420     0     0
    en1   1500  fe80::f8d1:81ff:fe81:ac05%2      89370     0    48420     0     0
    lo0   16896 link#1                          139923     0   139922     0     0
    lo0   16896 127         127.0.0.1           139923     0   139922     0     0
    lo0   16896 ::1%1                           139923     0   139922     0     0

    I would have to guess at this point, but to simplify, it seems that AIX resolves addresses differently (rather than say 'not correctly') and maybe requires specific conditions.

    If relevant - I can provide the output from Debian on POWER. But it seems AIX is only using an "ADDRv6%scopeid" when there are at least two interfaces defined.

    +++++++++
    What the bot is not showing is this re: the "mock" connections 'failing':

    root@x066:[/data/prj/python/python3-3.8]./python -m test test_asyncio
    Run tests sequentially
    0:00:00 [1/1] test_asyncio
    /data/prj/python/git/python3-3.8/Lib/test/support/__init__.py:1627: RuntimeWarning: coroutine 'AsyncMockMixin._mock_call' was never awaited
      gc.collect()
    RuntimeWarning: Enable tracemalloc to get the object allocation traceback
    Future exception was never retrieved
    future: <Future finished exception=BrokenPipeError(32, 'Broken pipe')>
    Traceback (most recent call last):
      File "/data/prj/python/git/python3-3.8/Lib/asyncio/subprocess.py", line 162, in _feed_stdin
        await self.stdin.drain()
      File "/data/prj/python/git/python3-3.8/Lib/asyncio/streams.py", line 443, in drain
        await self._protocol._drain_helper()
      File "/data/prj/python/git/python3-3.8/Lib/asyncio/streams.py", line 200, in _drain_helper
        await waiter
      File "/data/prj/python/git/python3-3.8/Lib/asyncio/unix_events.py", line 661, in _write_ready
        n = os.write(self._fileno, self._buffer)
    BrokenPipeError: [Errno 32] Broken pipe
    Future exception was never retrieved
    future: <Future finished exception=BrokenPipeError(32, 'Broken pipe')>
    Traceback (most recent call last):
      File "/data/prj/python/git/python3-3.8/Lib/asyncio/subprocess.py", line 162, in _feed_stdin
        await self.stdin.drain()
      File "/data/prj/python/git/python3-3.8/Lib/asyncio/streams.py", line 443, in drain
        await self._protocol._drain_helper()
      File "/data/prj/python/git/python3-3.8/Lib/asyncio/streams.py", line 200, in _drain_helper
        await waiter
      File "/data/prj/python/git/python3-3.8/Lib/asyncio/unix_events.py", line 661, in _write_ready
        n = os.write(self._fileno, self._buffer)
    BrokenPipeError: [Errno 32] Broken pipe
    test test_asyncio failed -- Traceback (most recent call last):
      File "/data/prj/python/git/python3-3.8/Lib/unittest/mock.py", line 1226, in patched
        return func(*args, **keywargs)
      File "/data/prj/python/git/python3-3.8/Lib/test/test_asyncio/test_base_events.py", line 1316, in test_create_connection_ipv6_scope
        sock.connect.assert_called_with(('fe80::1', 80, 0, 1))
      File "/data/prj/python/git/python3-3.8/Lib/unittest/mock.py", line 838, in assert_called_with
        raise AssertionError(_error_message()) from cause
    AssertionError: expected call not found.
    Expected: connect(('fe80::1', 80, 0, 1))
    Actual: connect(('fe80::1', 80, 0, 0))

    FYI: I have IPv6 interfaces defined on this server (x066) - but only one. And I tried changing fe80::1%1 to fe80::1%2, etc, but the end result is similar:

    AssertionError: expected call not found.
    Expected: connect(('fe80::1', 80, 0, 2))
    Actual: connect(('fe80::1', 80, 0, 0))

    Hope this helps!

    @lepaperwan
    Mannequin

    lepaperwan mannequin commented May 24, 2019

    I don't have an AIX machine lying around to test, so would you mind running the getaddrinfo test on AIX? A simple python3 -c 'import socket; print(socket.getaddrinfo("fe80::1%1", 80))' should fairly rapidly determine if there is a legitimate reason for the test to fail (i.e. this is internal to asyncio) or if this is tied to the underlying AIX getaddrinfo.

    The IPv6 Scoped Address Architecture RFC clearly indicates that <addr>%<zone> should be supported although it isn't a must. Hopefully there's a subtlety to getaddrinfo on AIX (maybe in the way the zone should be specified, I already had to fallback to numeric interfaces so the test would work on both Linux & Windows, I wouldn't be surprised if AIX had yet another syntax for it).

    Also, it would be worthwhile to ensure that the patches mentioned by IBM https://www-01.ibm.com/support/docview.wss?uid=isg1IV52116 are applied on the machine running the test.

    @aixtools
    Contributor

    On 24/05/2019 19:59, Erwan Le Pape wrote:

    python3 -c 'import socket; print(socket.getaddrinfo("fe80::1%1", 80))'

    root@x067:[/home/root]python3 -c 'import socket;
    print(socket.getaddrinfo("fe80::1%1", 80))'
    [(<AddressFamily.AF_INET6: 24>, <SocketKind.SOCK_DGRAM: 2>, 17, '',
    ('fe80::1', 80, 0, 0))]

    I have not yet checked if the patches mentioned are installed.

    This is a system I am testing PyInstaller, and the python3 version is 3.6.8.

    OS-Level is: 7100-03-05-1524, but it was built on a different version of
    AIX.

    +++++++

    This is the system I have the buildbot on:

    buildbot@x064:[/home/buildbot/aixtools-master]./python
    Python 3.8.0a4+ (heads/bpo-37009-thread-safe-dirty:b489efab81, May 22
    2019, 15:13:31)
    [GCC 4.7.4] on aix
    Type "help", "copyright", "credits" or "license" for more information.

    >>
    buildbot@x064:[/home/buildbot/aixtools-master]oslevel -s
    7100-04-06-1806
    buildbot@x064:[/home/buildbot/aixtools-master]

    buildbot@x064:[/home/buildbot/aixtools-master]./python -c 'import
    socket; print(socket.getaddrinfo("fe80::1%1", 80))'
    [(<AddressFamily.AF_INET6: 24>, <SocketKind.SOCK_DGRAM: 2>, 17, '',
    ('fe80::1', 80, 0, 0))]

    +++++

    re the patches mentioned.

    a) not applicable for the systems above - both are AIX 7.1.

    b) the AIX 6.1 TL7 I build with says:

    root@x066:[/]instfix -i | grep IV52116
        All filesets for IV52116 were found.

    @aixtools
    Contributor

    On 24/05/2019 19:59, Erwan Le Pape wrote:

    python3 -c 'import socket; print(socket.getaddrinfo("fe80::1%1", 80))'

    p.s. I used an actual address:

    buildbot@x064:[/home/buildbot/aixtools-master]netstat -ni
    Name  Mtu   Network     Address            Ipkts Ierrs    Opkts Oerrs  Coll
    en0   1500  link#2      0.21.5e.a3.c7.44   191897     0   171570     0     0
    en0   1500  192.168.129 192.168.129.64     191897     0   171570     0     0
    en0   1500  fe80::221:5eff:fea3:c744       191897     0   171570     0     0
    en1   1500  link#3      fa.d1.81.81.ac.5   147474     0    80440     0     0
    en1   1500  192.168.2   192.168.2.64       147474     0    80440     0     0
    en1   1500  fe80::f8d1:81ff:fe81:ac05%2    147474     0    80440     0     0
    lo0   16896 link#1                         184523     0   184521     0     0
    lo0   16896 127         127.0.0.1          184523     0   184521     0     0
    lo0   16896 ::1%1                          184523     0   184521     0     0
    buildbot@x064:[/home/buildbot/aixtools-master]./python -c 'import
    socket; print(socket.getaddrinfo("fe80::f8d1:81ff:fe81:ac05%2", 80))'
    [(<AddressFamily.AF_INET6: 24>, <SocketKind.SOCK_DGRAM: 2>, 17, '',
    ('fe80::f8d1:81ff:fe81:ac05', 80, 0, 0))]

    @lepaperwan
    Mannequin

    lepaperwan mannequin commented May 24, 2019

    Thanks for testing that. It's good that you used an actual address because that eliminates the possibility that AIX doesn't handle addresses it doesn't really know about.

    On the other hand, even when properly specified to a real scoped IPv6 address, getaddrinfo doesn't seem to get the necessary scope ID from the underlying C call which socket.getaddrinfo > _socket.getaddrinfo is pretty much mapped to.

    I'm looking at cpython/master for the socketmodule implementation:

    socket_getaddrinfo(PyObject *self, PyObject *args, PyObject* kwargs)
    is getaddrinfo
    https://github.com/python/cpython/blob/master/Modules/socketmodule.c#L1294 is makesockaddr which actually creates the 4-tuple returned as the last element of the getaddrinfo tuples.
    The fourth element (ie. the scope ID) is clearly a->sin6_scope_id which should contain the scope ID.

    At this stage, I don't know if this is a bug from the socketmodule which I doubt or if the AIX getaddrinfo simply just doesn't handle scoped IP addresses properly.

    If you're still okay to proxy tests for AIX, I'll try and come up with either a simple C snippet to see what's in the returned structure or ctype the AIX libc getaddrinfo.

    @aixtools
    Contributor

    No problem with trying out your tests.

    ----------


    Python tracker <report@bugs.python.org>
    <https://bugs.python.org/issue35545>


    @aixtools
    Contributor

    I also doubt there is a bug in the socketmodule. My assumption is that AIX may
    be wrong, although I prefer "different", i.e., it has idiosyncrasies.

    ++ If we "accept" or "conclude" that AIX's getaddrinfo() routine is not
    working as needed for this test, would "you" (Python-core) accept a
    @skipIf for this test, as is already done regarding IPv6 in:

    bpo-34490 Fix test_asyncio for AIX - do not call transport.get_extra_info('sockname')

    ++ Further, I have a start on "send/receive" stubs in C and am trying
    out different ideas, learning as I go. "netstat" clearly shows the scoped
    addresses, as does ifconfig -a:

    root@x066:[/]ifconfig -a
    en0: flags=1e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
            inet 192.168.129.66 netmask 0xffffff00 broadcast 192.168.129.255
            inet6 fe80::221:5eff:fea3:c746/64
             tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
    en1: flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
            inet6 fe80::f8d1:8cff:fe32:8305%2/64
    lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
            inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
            inet6 ::1%1/128
             tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1

    Sadly, I have a lot to learn about IPv6, and I expect that includes how
    "host-based routing" works*, so progress will be slow.

    * From years ago, I recall a discussion where one of the improvements in
      IPv6 compared to IPv4 is that the "work" of routing would be shared by
      all end-points, rather than focused in router(-hubs) whose performance
      basically determines the performance limits of a connection (or connections).

    Repeating: I am always willing to proxy tests for AIX.


    @lepaperwan
    Mannequin

    lepaperwan mannequin commented May 30, 2019

    Assuming a configuration similar to the one in msg343430, here is a simple native getaddrinfo test that checks whether any scope ID is returned.

    #include <arpa/inet.h>
    #include <sys/socket.h>
    #include <netdb.h>
    #include <stdio.h>
    
    
    /* Resolve addrstr and report the first non-zero IPv6 scope ID, if any. */
    void test(const char *addrstr) {
        int status;
        struct addrinfo *res;
        struct addrinfo *iter;
        struct sockaddr_in6 *addr;

        status = getaddrinfo(addrstr, "80", NULL, &res);
        if (status != 0) {
            fprintf(stderr, "getaddrinfo(%s) returned %i\n", addrstr, status);

            return;
        }

        for (iter = res; iter; iter = iter->ai_next) {
            if (iter->ai_addr->sa_family != AF_INET6)
                continue;

            addr = (struct sockaddr_in6 *) iter->ai_addr;
            if (addr->sin6_scope_id != 0) {
                fprintf(stdout, "getaddrinfo(%s) returned scope %u\n", addrstr, addr->sin6_scope_id);

                break;
            }
        }

        /* Nothing is printed when every result has scope ID 0. */
        freeaddrinfo(res);
    }
    
    int main() {
        test("fe80::f8d1:81ff:fe81:ac05%2");
        test("fe80::f8d1:81ff:fe81:ac05%en1");
    
        return 0;
    }
    

    I've explicitly tested against numeric and named interfaces to ensure that this isn't linked to AIX only handling named interfaces for scopes instead of numeric ones (although given your netstat output, that would be surprising).

    Since I've had to look for AIX programming documentation just to be sure, I also stumbled upon this AIX bug https://www-01.ibm.com/support/docview.wss?uid=isg1IV53671 (which is referenced by the one I mentioned previously, but I missed that). It seems to apply up to 7100-03, so you should be immune nonetheless.

    I also noticed that it mentions SSH not working, so I went and checked the OpenSSH sources to see how they handle AIX.
    While they have an explicit BROKEN_GETADDRINFO define, it doesn't check for the specific scoped IPv6 address issue, so I'm not sure whether they decided to special-case it.
    https://github.com/openssh/openssh-portable/blob/85ceb0e64bff672558fc87958cd548f135c83cdd/configure.ac#L2341

    If this is truly an "idiosyncrasy" in AIX, I'm not sure there is a better way to handle it than skipping it since it's not really a Python bug if the underlying libc doesn't work as intended.

    @aixtools
    Contributor

    The 'expanded' program ... main():

    int main() {
        /* local addresses */
        test("fe80::221:5eff:fea3:c746%0");
        test("fe80::221:5eff:fea3:c746%en0");
        test("fe80::f8d1:8cff:fe32:8305%2");
        test("fe80::f8d1:8cff:fe32:8305%en1");
        /* remote addresses */
        test("fe80::f8d1:81ff:fe81:ac05%2");
        test("fe80::f8d1:81ff:fe81:ac05%en1");

        return 0;
    }

    The conclusion seems to be that, when the call succeeds, the scope ID
    returned is always 0; when it fails, the status is always 8.

    This seems to be:
    #define EAI_NONAME      8       /* hostname nor servname not provided, or not known */

    So, %enX is not recognized; only a numeric scope is accepted.
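
Since named zones such as %en1 are rejected here while numeric ones are accepted, one possible way to cope is to split the zone off and translate it before resolving. The following is only a sketch: split_zone and its name_to_index parameter are hypothetical names, and it assumes socket.if_nametoindex is available on the platform whenever a named zone is passed without a custom resolver.

```python
import socket


def split_zone(host, name_to_index=None):
    """Split 'addr%zone' into (addr, scope_id).

    A numeric zone is used as-is; a named zone (e.g. 'en1') is mapped to
    an interface index.  Returns scope_id 0 when no zone is present.
    """
    addr, sep, zone = host.partition("%")
    if not sep:
        return addr, 0
    if zone.isdecimal():
        return addr, int(zone)
    if name_to_index is None:
        # Assumption: socket.if_nametoindex exists on this platform.
        name_to_index = socket.if_nametoindex
    return addr, name_to_index(zone)


print(split_zone("fe80::f8d1:81ff:fe81:ac05%2"))
```

The returned scope ID could then be placed in the fourth slot of an AF_INET6 sockaddr tuple by the caller, instead of relying on getaddrinfo to report it.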

    +++ Details +++

    On the first server, I added two addresses that are local to the platform.

    root@x066:[/data/prj/aixtools/tests/tcpip/socket]cc -g ex03.c -o ex03
    root@x066:[/data/prj/aixtools/tests/tcpip/socket]./ex03
    getaddrinfo(fe80::221:5eff:fea3:c746%en0) returned 8
    getaddrinfo(fe80::f8d1:8cff:fe32:8305%en1) returned 8
    getaddrinfo(fe80::f8d1:81ff:fe81:ac05%en1) returned 8
    root@x066:[/data/prj/aixtools/tests/tcpip/socket]netstat -ni
    Name  Mtu   Network     Address            Ipkts Ierrs    Opkts Oerrs  Coll
    en0   1500  link#2      0.21.5e.a3.c7.46  1496455     0  1214300     0     0
    en0   1500  192.168.129 192.168.129.66    1496455     0  1214300     0     0
    en0   1500  fe80::221:5eff:fea3:c746      1496455     0  1214300     0     0
    en1   65390 link#3      fa.d1.8c.32.83.5     4041     0       34     0     0
    en1   65390 fe80::f8d1:8cff:fe32:8305%2      4041     0       34     0     0
    lo0   16896 link#1                         160253     0   160252     0     0
    lo0   16896 127         127.0.0.1          160253     0   160252     0     0
    lo0   16896 ::1%1                          160253     0   160252     0     0
    root@x066:[/data/prj/aixtools/tests/tcpip/socket]oslevel -s
    6100-07-07-1316
    +++ Note +++ The 5th field of the instfix output below says whether the
    installed level is below (-), equal to (=), or exceeds (+) the level of the fix.
    On this server they are equal; on the AIX 7.1 TL4 server, exceeded.
    root@x066:[/data/prj/aixtools/tests/tcpip/socket]instfix -ciqk IV52116
    IV52116:bos.64bit:6.1.7.21:6.1.7.21:=:GETADDRINFO AND INET_PTON CANNOT HANDLE IPV6 SCOPE/ZONE
    IV52116:bos.rte.control:6.1.7.21:6.1.7.21:=:GETADDRINFO AND INET_PTON CANNOT HANDLE IPV6 SCOPE/ZONE
    IV52116:bos.rte.libc:6.1.7.21:6.1.7.21:=:GETADDRINFO AND INET_PTON CANNOT HANDLE IPV6 SCOPE/ZONE
    IV52116:bos.rte.shell:6.1.7.22:6.1.7.22:=:GETADDRINFO AND INET_PTON CANNOT HANDLE IPV6 SCOPE/ZONE
    IV52116:mcr.rte:6.1.7.21:6.1.7.21:=:GETADDRINFO AND INET_PTON CANNOT HANDLE IPV6 SCOPE/ZONE

    On a second server (all addresses are 'remote' now):
    root@x064:[/data/prj/aixtools/tests/tcpip/socket]netstat -ni
    Name  Mtu   Network     Address            Ipkts Ierrs    Opkts Oerrs  Coll
    en0   1500  link#2      0.21.5e.a3.c7.44   765518     0   792062     0     0
    en0   1500  192.168.129 192.168.129.64     765518     0   792062     0     0
    en0   1500  fe80::221:5eff:fea3:c744       765518     0   792062     0     0
    en1   1500  link#3      fa.d1.81.81.ac.5   773516     0   422335     0     0
    en1   1500  192.168.2   192.168.2.64       773516     0   422335     0     0
    en1   1500  fe80::f8d1:81ff:fe81:ac05%2    773516     0   422335     0     0
    lo0   16896 link#1                         410599     0   410596     0     0
    lo0   16896 127         127.0.0.1          410599     0   410596     0     0
    lo0   16896 ::1%1                          410599     0   410596     0     0

    root@x064:[/data/prj/aixtools/tests/tcpip/socket]./ex03
    getaddrinfo(fe80::221:5eff:fea3:c746%en0) returned 8
    gai_strerror:Hostname and service name not provided or found
    getaddrinfo(fe80::f8d1:8cff:fe32:8305%en1) returned 8
    gai_strerror:Hostname and service name not provided or found
    getaddrinfo(fe80::f8d1:81ff:fe81:ac05%en1) returned 8
    gai_strerror:Hostname and service name not provided or found

    root@x064:[/data/prj/aixtools/tests/tcpip/socket]oslevel -s
    7100-04-06-1806
    root@x064:[/data/prj/aixtools/tests/tcpip/socket]instfix -ciqk IV53671
    IV53671:bos.64bit:7.1.3.15:7.1.4.33:+:getaddrinfo cannot handle IPv6 scope/zone
    IV53671:bos.rte.control:7.1.3.15:7.1.4.33:+:getaddrinfo cannot handle IPv6 scope/zone
    IV53671:bos.rte.libc:7.1.3.15:7.1.4.33:+:getaddrinfo cannot handle IPv6 scope/zone
    IV53671:bos.rte.shell:7.1.3.15:7.1.4.33:+:getaddrinfo cannot handle IPv6 scope/zone
    IV53671:mcr.rte:7.1.3.15:7.1.4.33:+:getaddrinfo cannot handle IPv6 scope/zone

    And a server with the bug - i.e., not fixed:
    root@x065:[/data/prj/aixtools/tests/tcpip/socket]./ex03
    getaddrinfo(fe80::221:5eff:fea3:c746%0) returned 8
    getaddrinfo(fe80::221:5eff:fea3:c746%en0) returned 8
    getaddrinfo(fe80::f8d1:8cff:fe32:8305%2) returned 8
    getaddrinfo(fe80::f8d1:8cff:fe32:8305%en1) returned 8
    getaddrinfo(fe80::f8d1:81ff:fe81:ac05%2) returned 8
    getaddrinfo(fe80::f8d1:81ff:fe81:ac05%en1) returned 8
    root@x065:[/data/prj/aixtools/tests/tcpip/socket]oslevel -s
    5300-07-00-0000

    *** In closing ***
    Maybe AIX needs "hints" to reveal the scope ID. There is a lot of 'talk'
    about that in the man page. I can attach it in a new email if you do not have it.

    Regards,
    Michael
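
Whether hints change the outcome on AIX is not established anywhere in this thread. As a sketch of what passing them looks like from Python, the hypothetical helper resolve_with_hints below forwards explicit family, type and flags hints to socket.getaddrinfo and returns only the scope IDs; the resolver parameter is an illustration device so the call can be swapped out.

```python
import socket


def resolve_with_hints(host, port, resolver=socket.getaddrinfo):
    """Resolve an IPv6 literal with explicit hints; return the scope IDs."""
    infos = resolver(
        host,
        port,
        family=socket.AF_INET6,       # only IPv6 results
        type=socket.SOCK_STREAM,      # restrict results to one socket type
        flags=socket.AI_NUMERICHOST,  # host is a literal; no name lookup
    )
    # The last element of each entry is the sockaddr 4-tuple
    # (host, port, flowinfo, scope_id).
    return [sockaddr[3] for *_rest, sockaddr in infos]
```

Calling resolve_with_hints("fe80::f8d1:81ff:fe81:ac05%2", 80) on the AIX hosts above would show whether the hints make any difference to the returned scope ID.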

    @asvetlov
    Contributor

    Guys, thank you for the investigation.
    If this is an AIX "idiosyncrasy", please feel free to skip the failing tests on AIX.

    If you have access to an AIX box, it would be much easier for you. I can only look at Python buildbot statuses.

    @aixtools
    Contributor

    OK. I'll set up a skip test, but I will also try to get more info from IBM
    and/or discover the root cause myself.
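
A platform-conditional skip of this kind could look roughly like the following. This is a sketch, not the actual change: the class name and test body are hypothetical stand-ins, and the skip message is only an assumed wording.

```python
import sys
import unittest


class CreateConnectionScopeTest(unittest.TestCase):
    # Hypothetical stand-in for the real asyncio test; only the skip
    # decorator is the point of this sketch.
    @unittest.skipIf(
        sys.platform.startswith("aix"),
        "bpo-35545: getaddrinfo() on AIX does not return the IPv6 scope ID",
    )
    def test_create_connection_ipv6_scope(self):
        # Placeholder assertion on a sockaddr 4-tuple carrying scope ID 1.
        sockaddr = ("fe80::1", 80, 0, 1)
        self.assertEqual(sockaddr[3], 1)
```

Run under python -m unittest, the test is reported as skipped on AIX and runs normally everywhere else.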


    @miss-islington
    Contributor

    New changeset 32dda26 by Miss Islington (bot) (Michael Felt) in branch 'master':
    bpo-35545: Skip test_asyncio.test_create_connection_ipv6_scope on AIX (GH-14011)
    32dda26

    @miss-islington
    Contributor

    New changeset 70a4178 by Miss Islington (bot) in branch '3.8':
    bpo-35545: Skip test_asyncio.test_create_connection_ipv6_scope on AIX (GH-14011)
    70a4178

    @ZaarHai
    Mannequin

    ZaarHai mannequin commented Jul 10, 2019

    Good day guys,
    Does anyone have an idea if it's going to be fixed for 3.8?

    @aixtools
    Contributor

    aixtools commented Aug 5, 2019

    I did not ask back in June, but could this also be backported to 3.7? I am trying very hard to have all tests passing on 3.7 as well, and @asvetlov is OK with a skipped test for AIX; see https://bugs.python.org/issue35545#msg344003

    I can make the backport, if needed.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022