This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: The loop in utility `socket.create_connection()` swallows previous errors
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.9
process
Status: open Resolution: duplicate
Dependencies: Superseder: OSError: multiple exceptions should preserve the exception type if it is common
View: 29980
Assigned To: Nosy List: ankostis, vstinner, yselivanov
Priority: normal Keywords:

Created on 2017-03-08 14:45 by ankostis, last changed 2022-04-11 14:58 by admin.

Pull Requests
URL Status Linked Edit
PR 562 open ankostis, 2017-03-08 14:45
Messages (10)
msg289238 - (view) Author: Kostis Anagnostopoulos (ankostis) * Date: 2017-03-08 14:45
## Context
The utility method `socket.create_connection()` currently works like that:
1. resolve the destination-address into one or more IP(v4 & v6) addresses;
2. loop on each IP address and stop to the 1st one to work;
3. if none works, re-raise the last error.


## The problem
So currently the loop in `socket.create_connection()` ignores all intermediate errors and reports only the last connection failure, 
which  might be irrelevant.  For instance, when both IPv4 & IPv6 networks are supported, usually the last address is a IPv6 address and it frequently fails with an irrelevant error - the actual cause have already been ignored.


## Possible solutions & open questions
To facilitate network debugging, there are at least 3 options:
a. log each failure [as they happen](/python/cpython/blob/6f0eb93183519024cb360162bdd81b9faec97ba6/Lib/socket.py#L717), but that would get the final failure twice: once as a (warning?) message, and once as an exception .
b. collect all failures and log them only when connection fails to collect the errors, 
   but that might miss important infos to the user;
c. collect and return all failures in list attached to the raised exception.

A question for cases (a) & (b) is what logging "means" to use: the `warnings` or `logging` module?
And if `logging` is chosen, log them in `'DEBUG'` or `'WARNING'` level?

Case (c) sidesteps the above questions.
msg289312 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-03-09 17:28
> Case (c) sidesteps the above questions.

I like the (c) option. I don't think we should use logging/warnings here, as they can produce unwanted noise that few know how to silence.

When the list of errors is passed as a second argument to the exception, how is it rendered?  Would it make sense to concatenate all error messages:


  if errs:
      errors_msg = '  \n'.join(str(e) for e in errs)
      raise error(f"connection failed:\n{errors_msg}", errs)

Or maybe we should track which addr caused which exception?

  for res in getaddrinfo(host, port, 0, SOCK_STREAM):
      af, socktype, proto, canonname, sa = res
      try:
          ...
      except error as e:
          errs.append((canonname, e))
          ...

  if errs:
      error_msg = '  \n'.join(f'{addr}: {e}' for addr, e in errs)
      raise error('connection failed:\n{error_msg}', errs)
msg289313 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-03-09 17:28
This is a new feature, so we can only push it to 3.7.
msg289771 - (view) Author: Kostis Anagnostopoulos (ankostis) * Date: 2017-03-17 17:15
> When the list of errors is passed as a second argument to the exception, how is it rendered?  

This is how my latest ec887c0c3 looks on Linux:

    >>> import socket
    >>> socket.create_connection(('localhost', 12345))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.5/socket.py", line 714, in create_connection
        raise error("no connection possible due to %d errors" % nerr, errors)
    OSError: [Errno no connection possible due to 2 errors] [ConnectionRefusedError(111, 'Connection refused'), ConnectionRefusedError(111, 'Connection refused')]

And this is on Windows:

    >>> socket.create_connection(('localhost', 12345), 1)
    Traceback (most recent call last):
      File "D:\Apps\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "<ipython-input-13-758ac642f3d5>", line 1, in <module>
        socket.create_connection(('localhost', 12345), 1)
      File "D:\Apps\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\socket.py", line 714, in create_connection
        raise error("no connection possible due to %d errors" % nerr, errors)
    OSError: [Errno no connection possible due to 2 errors] [timeout('timed out',), timeout('timed out',)]

    
> Would it make sense to concatenate all error messages:

But then the user will not receive a list of errors to inspect, but just a big string.
The biggest problem in my latest ec887c0c3 is that I'm abusing the 1st arg to OSError() constructor, instead of being an `errno` it is a string.
But I got that from the existing code.[1]

And still, this PR is not yer finished because there is no feedback on any intermediate errors in the case of success.
As suggested on the OP, this may happen (in my order of preference):

1. with a new argument for the user to provide the list to collect the errors (changes the API backward-compatiblly);
2. with logging logs;
3. with warnings;
4. do nothing.

I prefer logging over warnings because they are more configurable.

[1] https://github.com/python/cpython/blob/master/Lib/socket.py#L724
msg290009 - (view) Author: Kostis Anagnostopoulos (ankostis) * Date: 2017-03-22 19:20
> This is a new feature, so we can only push it to 3.7.

As it stands now(b39d4b1c6) it hardly contains any change - just in the case of multiple intermediate errors AND final failure, the exception raised is a bit different.

AFAICS it would be a "change" for 3.7 if my one of the 3 options of my last comment gets implemented.

Any ideas which one to implement?
msg292929 - (view) Author: Kostis Anagnostopoulos (ankostis) * Date: 2017-05-03 17:32
Just to remind that as it stands now(b39d4b1c6) it contains no API changes at all, so it should be considered for merge into 3.6.x line.
msg302089 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-13 16:22
I'm not excited to keep multiple exception objects alive because the Exception.__traceback__ causes many hairy reference cycle issues.

I just found an old bug in socket.create_connection():
https://github.com/python/cpython/pull/3546

See https://bugs.python.org/issue29639#msg302087 for my story of the bug.
msg335930 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-19 11:45
I propose the close the issue as WONTFIX. I don't see how the current behavior is an issue.

I expect a very large traceback just because a server is down/reject connections. I'm not sure that I want that by default.

> ## The problem
> So currently the loop in `socket.create_connection()` ignores all intermediate errors and reports only the last connection failure, 
which  might be irrelevant.  For instance, when both IPv4 & IPv6 networks are supported, usually the last address is a IPv6 address and it frequently fails with an irrelevant error - the actual cause have already been ignored.

It's not so hard to reimplement create_connection(): it's a loop on getaddrinfo(). So you can decide how to deal with errors.

--

I recall that Yury Selivanov, Nathnaniel Smith and Guido van Rossum discussed the idea of "MultiError": an object which contains multiple exceptions, object which would render properly all exceptions when a traceback is displayed.
msg336193 - (view) Author: Kostis Anagnostopoulos (ankostis) * Date: 2019-02-21 09:41
The problem of not fixing this (and just add a suggestion in the docs saying that the user may re-implement this method) is that frequently this call is hidden at the core of many networking libraries.  So monkey-patching is needed instead, which is not always nice or easy.


> I just found an old bug in socket.create_connection():
> https://github.com/python/cpython/pull/3546

Regarding #3546, i'm surprised that a function-local variable keeps its contents alive, long after the frame of that function has gone.  I thought that local-variables were deterministically (ref-countering) destructed.  
What is happening?
msg336206 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-21 11:21
> I thought that local-variables were deterministically (ref-countering) destructed. What is happening?

IT's a reference cycle. Exception => traceback => frame => exception. The 'err' variable kept the frame alive which kept the exception alive which kept everything alive. The GC is supposed to be able to break ref cycles, but GC collections are irregular and sometimes the GC fails to break complex cycles.
History
Date User Action Args
2022-04-11 14:58:44adminsetgithub: 73943
2021-05-22 17:19:34iritkatrielsetsuperseder: OSError: multiple exceptions should preserve the exception type if it is common
resolution: duplicate
2020-01-26 01:33:42cheryl.sabellasetversions: + Python 3.9, - Python 3.7, Python 3.8
2019-02-21 11:21:02vstinnersetmessages: + msg336206
2019-02-21 09:41:32ankostissetmessages: + msg336193
2019-02-20 20:17:35brett.cannonsetnosy: - brett.cannon
2019-02-19 11:45:06vstinnersetmessages: + msg335930
2019-02-16 23:18:23cheryl.sabellasetstage: patch review
type: enhancement
versions: + Python 3.8
2017-09-13 16:22:19vstinnersetmessages: + msg302089
2017-05-03 17:32:39ankostissetmessages: + msg292929
2017-03-22 19:20:32ankostissetmessages: + msg290009
2017-03-17 17:15:36ankostissetmessages: + msg289771
2017-03-09 17:28:53yselivanovsetnosy: + brett.cannon, vstinner

messages: + msg289313
versions: - Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6
2017-03-09 17:28:08yselivanovsetnosy: + yselivanov
messages: + msg289312
2017-03-08 14:45:17ankostiscreate