This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: socket.sendall broken when a socket has a timeout
Type: behavior Stage: patch review
Components: Documentation, IO, Library (Lib) Versions: Python 3.6, Python 3.4, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: docs@python, jstasiak, martin.panter, vstinner
Priority: normal Keywords: patch

Created on 2015-11-08 21:47 by jstasiak, last changed 2022-04-11 14:58 by admin.

Files
File name Uploaded Description Edit
socket-docs.patch jstasiak, 2015-11-10 09:25 socket.sendall documentation extension
Messages (13)
msg254357 - (view) Author: Jakub Stasiak (jstasiak) * Date: 2015-11-08 21:47
It is my understanding that socket.sendall effectively calls the underlying socket.send's implementation in a retry loop, possibly multiple times.

It is also my understanding that each one of those low level send calls can timeout on its own if a socket timeout is set.

Considering the above I believe it's undesired to ever call socket.sendall with a socket that has a timeout set because if:

1. At least one send call succeeds
2. A send call after that times out

then a socket timeout error will be raised and the information about the sent data will be lost.

Granted, the documentation says that "On error, an exception is raised, and there is no way to determine how much data, if any, was successfully sent". I believe, however, that such API is very easy to misuse (I've seen it used with sockets with timeout set, because of small payload sizes and other circumstances it would appear to work fine most of the time and only fail every N hours or days).

Possible improvements I see:

1. Explicitly mention interaction between socket's timeout and sendall's behavior in the documentation
2. Make sendall raise an error when the socket it's called with has a timeout set (I can see the backwards compatibility being a concern here but I can't think of any reason to remain compatible with behavior I believe to be broken anyway)
3. Both of the above

I'm happy to procure an appropriate patch for any of the above. Please correct me if my understanding of this is off.
msg254359 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-11-08 22:06
FYI the behaviour of socket.socket.sendall() regarding timeout has been modified in Python 3.5:
https://docs.python.org/dev/library/socket.html#socket.socket.sendall

"Changed in version 3.5: The socket timeout is no more reset each time data is sent successfuly. The socket timeout is now the maximum total duration to send all data."
msg254360 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-08 22:33
I don’t like the sound of improvement 2. I think it would break existing use cases, e.g. HTTPConnection(timeout=...), urlopen(timeout=...). If you want a HTTP request to abort if it takes a long time, how is that behaviour broken?

Regarding improvement 1, I think Victor’s 3.5 documentation explains how the timeout works fairly well. Perhaps you meant for this bug to be about pre-3.5 versions?

Also, see Issue 7322 for discussion of how to handle exceptions when reading from sockets and other streams. E.g. what should readline() do when it has read half a line and then gets an exception, and should you be allowed to read again after handling an exception? In my mind the readline() and sendall() cases are somewhat complementary.
msg254363 - (view) Author: Jakub Stasiak (jstasiak) * Date: 2015-11-09 01:19
Martin: While I'd consider timeout in HTTPConnection(timeout=...) or urlopen(timeout=...) to be the timeout for the entire operation, just just for the data sending part and HTTPConnection/urlopen can achieve the timeout behavior using just send I concede there may be valid use cases for "either sendall succeeds or we don't care about what we've sent anyway" and in this light my second suggestion is problematic.

Victor: The behavior change in 3.5 does't affect my concern from what I see.  The concern is sendall timing out after some data has already been sent which can create some subtle issues. I've seen code like this:

def x(data, sock):
    while True:
        # some code here    
        try:
            sock.sendall(data)
            return
        except timeout:
            pass


Now I'll agree the code is at fault for ever attempting to retry sendall but I also think the API is easy to misuse like this. And it many cases it'll just work most of the time because sendall won't timeout.

Maybe explicitly mentioning sendall's behavior concerning sockets with timeouts could improve this? I'm honestly not sure anymore, technically "On error, an exception is raised, and there is no way to determine how much data, if any, was successfully sent." should be enough.
msg254364 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-09 01:45
Maybe it would be reasonable to expand on that “On error” sentence. I imagine the main errors that would cause this situation are a timeout as you say, and also a blocking exception. Also, you could point out that if you have lost record of how much has already been sent, it may not make sense to send any more data with the same socket.
msg254375 - (view) Author: Jakub Stasiak (jstasiak) * Date: 2015-11-09 09:17
This is exactly what I'm thinking. Do you think it's sensible to move that sentence + some additional information (following your suggestion) into a "Warning" block?
msg254411 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-09 21:16
Personally I would avoid big red warning boxes in 90% of the cases people suggest them, including this one. But maybe we can see what others think.
msg254413 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-11-09 21:24
I don't really understand why do even you consider the behaviour has a bug or a trap. On timeout, the most common behaviour is to give up on the whole client. I don't know code trying to resend the same data in case of timeout.

Describing the behaviour on the documentation helps, but no warning is need.

When you *read* data, it's different. It can be interesting to get the partial read.

--

For example, the asyncio.StreamReader.readexactly() coroutine raises an asyncio.IncompleteReadError exception which contains the partial data:

https://docs.python.org/dev/library/asyncio-stream.html#asyncio.StreamReader.readexactly

The HTTP client has a similar exception.

I proposed to add similar method to socket.socket which also raises a similar exception to return partial data: issue #1103213.
msg254437 - (view) Author: Jakub Stasiak (jstasiak) * Date: 2015-11-10 09:25
That's fair and thanks for the links.

Please find a quick patch attached, feel free to use that or any modification of it. While I believe the documentation is technically correct right now it won't hurt to clarify this I think.
msg254470 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-11 02:26
That was kind of what I had in mind. The only change I would make is to restore the comma to “On error (including socket timeout), an exception . . .”. I’ll try to commit this in a few days if nobody has anything else to say.
msg254521 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-11-12 08:09
I'm not a big fan of "may" in documentation. I would prefer to rephrase it as an advice. Example:

"Considering the loss of information, it's better to not retry sending data to the socket anymore."
msg254522 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-11-12 08:10
Note: socket-docs.patch file is strange, it looks like you removed the first line starting with "diff", and so we don't get the [Review] button.
msg254570 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-12 22:53
For what it’s worth, there could be obsure cases where sending more data might be okay. I prefer the original version (but can settle for either). Here’s a third alternative:

Considering this loss of information, it is generally best not to send any more data to the socket.
History
Date User Action Args
2022-04-11 14:58:23adminsetgithub: 69772
2015-11-12 22:53:32martin.pantersetmessages: + msg254570
2015-11-12 08:10:17vstinnersetmessages: + msg254522
2015-11-12 08:09:36vstinnersetmessages: + msg254521
2015-11-11 02:26:43martin.pantersetmessages: + msg254470
stage: patch review
2015-11-10 09:25:53jstasiaksetfiles: + socket-docs.patch
keywords: + patch
messages: + msg254437
2015-11-09 21:24:08vstinnersetmessages: + msg254413
2015-11-09 21:16:44martin.pantersetmessages: + msg254411
2015-11-09 09:17:11jstasiaksetmessages: + msg254375
2015-11-09 01:45:25martin.pantersetmessages: + msg254364
versions: - Python 3.2, Python 3.3
2015-11-09 01:19:15jstasiaksetmessages: + msg254363
2015-11-08 22:33:20martin.pantersetnosy: + martin.panter
messages: + msg254360
2015-11-08 22:06:36vstinnersetnosy: + vstinner
messages: + msg254359
2015-11-08 21:47:35jstasiakcreate