msg254357 - (view) |
Author: Jakub Stasiak (jstasiak) * |
Date: 2015-11-08 21:47 |
It is my understanding that socket.sendall effectively calls the underlying socket.send's implementation in a retry loop, possibly multiple times.
It is also my understanding that each one of those low level send calls can timeout on its own if a socket timeout is set.
Considering the above I believe it's undesired to ever call socket.sendall with a socket that has a timeout set because if:
1. At least one send call succeeds
2. A send call after that times out
then a socket timeout error will be raised and the information about the sent data will be lost.
Granted, the documentation says that "On error, an exception is raised, and there is no way to determine how much data, if any, was successfully sent". I believe, however, that such API is very easy to misuse (I've seen it used with sockets with timeout set, because of small payload sizes and other circumstances it would appear to work fine most of the time and only fail every N hours or days).
Possible improvements I see:
1. Explicitly mention interaction between socket's timeout and sendall's behavior in the documentation
2. Make sendall raise an error when the socket it's called with has a timeout set (I can see the backwards compatibility being a concern here but I can't think of any reason to remain compatible with behavior I believe to be broken anyway)
3. Both of the above
I'm happy to procure an appropriate patch for any of the above. Please correct me if my understanding of this is off.
|
msg254359 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2015-11-08 22:06 |
FYI the behaviour of socket.socket.sendall() regarding timeout has been modified in Python 3.5:
https://docs.python.org/dev/library/socket.html#socket.socket.sendall
"Changed in version 3.5: The socket timeout is no more reset each time data is sent successfuly. The socket timeout is now the maximum total duration to send all data."
|
msg254360 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2015-11-08 22:33 |
I don’t like the sound of improvement 2. I think it would break existing use cases, e.g. HTTPConnection(timeout=...), urlopen(timeout=...). If you want a HTTP request to abort if it takes a long time, how is that behaviour broken?
Regarding improvement 1, I think Victor’s 3.5 documentation explains how the timeout works fairly well. Perhaps you meant for this bug to be about pre-3.5 versions?
Also, see Issue 7322 for discussion of how to handle exceptions when reading from sockets and other streams. E.g. what should readline() do when it has read half a line and then gets an exception, and should you be allowed to read again after handling an exception? In my mind the readline() and sendall() cases are somewhat complementary.
|
msg254363 - (view) |
Author: Jakub Stasiak (jstasiak) * |
Date: 2015-11-09 01:19 |
Martin: While I'd consider timeout in HTTPConnection(timeout=...) or urlopen(timeout=...) to be the timeout for the entire operation, just just for the data sending part and HTTPConnection/urlopen can achieve the timeout behavior using just send I concede there may be valid use cases for "either sendall succeeds or we don't care about what we've sent anyway" and in this light my second suggestion is problematic.
Victor: The behavior change in 3.5 does't affect my concern from what I see. The concern is sendall timing out after some data has already been sent which can create some subtle issues. I've seen code like this:
def x(data, sock):
while True:
# some code here
try:
sock.sendall(data)
return
except timeout:
pass
Now I'll agree the code is at fault for ever attempting to retry sendall but I also think the API is easy to misuse like this. And it many cases it'll just work most of the time because sendall won't timeout.
Maybe explicitly mentioning sendall's behavior concerning sockets with timeouts could improve this? I'm honestly not sure anymore, technically "On error, an exception is raised, and there is no way to determine how much data, if any, was successfully sent." should be enough.
|
msg254364 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2015-11-09 01:45 |
Maybe it would be reasonable to expand on that “On error” sentence. I imagine the main errors that would cause this situation are a timeout as you say, and also a blocking exception. Also, you could point out that if you have lost record of how much has already been sent, it may not make sense to send any more data with the same socket.
|
msg254375 - (view) |
Author: Jakub Stasiak (jstasiak) * |
Date: 2015-11-09 09:17 |
This is exactly what I'm thinking. Do you think it's sensible to move that sentence + some additional information (following your suggestion) into a "Warning" block?
|
msg254411 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2015-11-09 21:16 |
Personally I would avoid big red warning boxes in 90% of the cases people suggest them, including this one. But maybe we can see what others think.
|
msg254413 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2015-11-09 21:24 |
I don't really understand why do even you consider the behaviour has a bug or a trap. On timeout, the most common behaviour is to give up on the whole client. I don't know code trying to resend the same data in case of timeout.
Describing the behaviour on the documentation helps, but no warning is need.
When you *read* data, it's different. It can be interesting to get the partial read.
--
For example, the asyncio.StreamReader.readexactly() coroutine raises an asyncio.IncompleteReadError exception which contains the partial data:
https://docs.python.org/dev/library/asyncio-stream.html#asyncio.StreamReader.readexactly
The HTTP client has a similar exception.
I proposed to add similar method to socket.socket which also raises a similar exception to return partial data: issue #1103213.
|
msg254437 - (view) |
Author: Jakub Stasiak (jstasiak) * |
Date: 2015-11-10 09:25 |
That's fair and thanks for the links.
Please find a quick patch attached, feel free to use that or any modification of it. While I believe the documentation is technically correct right now it won't hurt to clarify this I think.
|
msg254470 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2015-11-11 02:26 |
That was kind of what I had in mind. The only change I would make is to restore the comma to “On error (including socket timeout), an exception . . .”. I’ll try to commit this in a few days if nobody has anything else to say.
|
msg254521 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2015-11-12 08:09 |
I'm not a big fan of "may" in documentation. I would prefer to rephrase it as an advice. Example:
"Considering the loss of information, it's better to not retry sending data to the socket anymore."
|
msg254522 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2015-11-12 08:10 |
Note: socket-docs.patch file is strange, it looks like you removed the first line starting with "diff", and so we don't get the [Review] button.
|
msg254570 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2015-11-12 22:53 |
For what it’s worth, there could be obsure cases where sending more data might be okay. I prefer the original version (but can settle for either). Here’s a third alternative:
Considering this loss of information, it is generally best not to send any more data to the socket.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:23 | admin | set | github: 69772 |
2015-11-12 22:53:32 | martin.panter | set | messages:
+ msg254570 |
2015-11-12 08:10:17 | vstinner | set | messages:
+ msg254522 |
2015-11-12 08:09:36 | vstinner | set | messages:
+ msg254521 |
2015-11-11 02:26:43 | martin.panter | set | messages:
+ msg254470 stage: patch review |
2015-11-10 09:25:53 | jstasiak | set | files:
+ socket-docs.patch keywords:
+ patch messages:
+ msg254437
|
2015-11-09 21:24:08 | vstinner | set | messages:
+ msg254413 |
2015-11-09 21:16:44 | martin.panter | set | messages:
+ msg254411 |
2015-11-09 09:17:11 | jstasiak | set | messages:
+ msg254375 |
2015-11-09 01:45:25 | martin.panter | set | messages:
+ msg254364 versions:
- Python 3.2, Python 3.3 |
2015-11-09 01:19:15 | jstasiak | set | messages:
+ msg254363 |
2015-11-08 22:33:20 | martin.panter | set | nosy:
+ martin.panter messages:
+ msg254360
|
2015-11-08 22:06:36 | vstinner | set | nosy:
+ vstinner messages:
+ msg254359
|
2015-11-08 21:47:35 | jstasiak | create | |