msg95245 - (view) |
Author: David M. Beazley (beazley) |
Date: 2009-11-14 17:16 |
Consider a socket that has had a file-like wrapper placed around it
using makefile()
# s is a socket created previously
f = s.makefile()
Now, suppose that this socket has had a timeout placed on it.
s.settimeout(15)
If you try to read data from f, but nothing is available. You'll
eventually get a timeout. For example:
f.readline() # Now, just wait
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File
"/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/socket.
py", line 406, in readline
data = self._sock.recv(self._rbufsize)
socket.timeout: timed out
However, now consider the case where you're reading a line of data, but
the receiver has only received a partial line and it's waiting for the
rest of the data to arrive. For example, type this:
f.readline()
Now, go to the other end of the socket connection and send a buffer with
no newline character. For example, send the message "Hello".
Since no newline character has been received, the readline() method will
eventually fail with a timeout as before. However, if you now retry
the read operation f.readline() and send more data such as the message
"World\n", you'll find that the "Hello" message gets lost. In other
words, the repeated readline() operation discards any buffers
corresponding to previously received line data and just returns the new
data.
Admittedly this is a corner case, but you probably don't want data to be
discarded on a TCP connection even if a timeout occurs.
Hope that makes some sense :-). (It helps to try it out).
|
msg121777 - (view) |
Author: Roy Smith (roysmith) |
Date: 2010-11-20 21:08 |
I'm looking into this now. In the meantime, I've opened a marginally-related bug, issue10473
|
msg121779 - (view) |
Author: Roy Smith (roysmith) |
Date: 2010-11-20 21:14 |
Ataching a test case which demonstrates the bug.
|
msg121833 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2010-11-21 02:05 |
This would seem to be an invalid test case. It is specifically documented that socket.makefile does not support this: "The socket must be in blocking mode (it can not have a timeout)".
http://docs.python.org/py3k/library/socket.html#socket.socket.makefile
I suppose socket.makefile could initially check the socket and throw an exception if the socket is non_blocking but the program could later change the socket to non_blocking and the same issue would presumably arise.
Recommend closing as invalid.
|
msg121839 - (view) |
Author: Roy Smith (roysmith) |
Date: 2010-11-21 02:36 |
This is kind of ugly. On the one hand, I'm all for adding a check in makefile() to catch it being called on a non-blocking socket.
On the other hand, you are correct that a user could change the mode leter. Even if we added checks for this in socket.setblocking(), there's plenty of ways to get around that; it's easy to grab a raw file descriptor and do whatever you want with it behind the library's back.
On the third hand, maybe a check could be added to SocketIO.readinto() to verify that the socket was in blocking mode each time it was called?
|
msg121885 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2010-11-21 08:45 |
I see Issue7995 also addresses the issue of accept sockets inheriting nonblocking status and provides a suggested patch.
|
msg124957 - (view) |
Author: Ross Lagerwall (rosslagerwall) |
Date: 2010-12-31 08:14 |
Attached is a patch which fixes the issue.
Instead of allowing the readline method to lose data, it adds a check to SocketIO.readinto() to ensure that the socket does not have a timeout and throws an IOError if it does. Also does the same for SocketIO.write().
I think this is a better approach - just failing immediately when a readline on a nonblocking socket occurs instead of failing sometimes and losing data.
|
msg124962 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2010-12-31 10:47 |
While this patch looks conformant to the documentation, it is very likely to break code in the wild. Even in the stdlib, there are uses of makefile() + socket timeouts (e.g. in http.client and urllib). It would be better to find a way to make readline() functional even with socket timeouts.
|
msg126163 - (view) |
Author: Ross Lagerwall (rosslagerwall) |
Date: 2011-01-13 09:27 |
How about this?
Instead of just losing the data that's been read so far in readline(), this patch adds the data as a new field to the exception that is thrown - this way the semantics remain exactly the same but the data is not discarded when a timeout occurs, it is still accessible via the exception.
|
msg126170 - (view) |
Author: David Beazley (dabeaz) |
Date: 2011-01-13 14:10 |
Have any other programming environments ever had a feature where a socket timeout returns an exception containing partial data? I'm not aware of one offhand and speaking as a systems programmer, something like this might be somewhat unexpected.
My concern is that in the presence of timeouts, the programmer will be forced to reassemble the message themselves from fragments returned in the exception. However, one reason for using readline() in the first place is precisely so that you don't have to do that sort of thing.
Is there any reason why the input buffer can't be preserved across calls? You've already got a file-like wrapper around the socket. Just keep the unconsumed buffer in that instance.
|
msg126171 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-01-13 14:24 |
This is an interesting approach. The problem is that AFAICT the issue is not limited to readline. If you call e.g. read(10000) and the socket times out after having returned the first 5000 bytes, then those 5000 bytes might get lost as well (depending on specifics e.g. buffer size in the IO stack).
Generally there is no guarantee that a buffered object works "properly" when the raw IO object raises some exception intermittently; perhaps this should be fixed in a systemic way, although this would complicate things quite a bit.
Also, I don't think people try to reuse a socket after a timeout (or even try to salvage whatever data could be read before the timeout); usually they would instead abort the connection and treat the remote resource as unavailable. IMO, that's the one obvious use case for socket timeouts.
|
msg126172 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-01-13 14:27 |
By the way, I recently fixed the makefile() documentation:
“The socket must be in blocking mode; it can have a timeout, but the file object’s internal buffer may end up in a inconsistent state if a timeout occurs.”
(in http://docs.python.org/dev/library/socket.html#socket.socket.makefile)
I also added a small section dedicated to socket timeouts:
http://docs.python.org/dev/library/socket.html#notes-on-socket-timeouts
|
msg126190 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2011-01-13 17:47 |
"""Generally there is no guarantee that a buffered object works "properly" when the raw IO object raises some exception intermittently"""
I disagree. EINTR is a classic case of this and is something that buffering IO layers deal with all the time. (readline is just one example of a buffering io layer)
if there is a timeout and we can't determine if there is enough data to return for readline, we should buffer it and not return.
maybe this means we need to disallow readline() with timeouts on unbuffered sockets since we can't determine if data will need to be buffered or not due to such a condition in advance.
The normal behavior for code calling readline() on a socket with a timeout is likely going to be to close it. Anything else does not make much sense. (someone may try, but really they're writing their I/O code wrong if they are using a socket timeout a poor form of task switching ;)
|
msg126192 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-01-13 18:06 |
> """Generally there is no guarantee that a buffered object works
> "properly" when the raw IO object raises some exception
> intermittently"""
>
> I disagree. EINTR is a classic case of this and is something that
> buffering IO layers deal with all the time. (readline is just one
> example of a buffering io layer)
EINTR is a different matter. To handle EINTR in Python, it is enough to
call the signal handlers and then retry the system call (that's what is
done in SocketIO.readinto, although FileIO doesn't have such logic).
Only if the signal handler raises an exception (which it probably
shouldn't do, since asynchronous exceptions are very bad) do you abort
the operation.
You can't apply the same logic to a socket timeout; the timeout is
really an error condition and you certainly shouldn't retry the system
call (that would defeat the point of using a timeout). So, to handle it
in an entirely correct way, you need to add some out-of-band buffering
logic where you store the pending raw reads which have been done but
could not be returned to the user. That complicates things quite a bit,
especially given that it has to be grafted on at least two layers of the
IO stack (the raw IO layer, and the buffered IO layer). Ross' patch does
it, but incompletely (it lets the user handle the out-of-band data) and
only for readline() (while buffered read() would probably need it too).
> The normal behavior for code calling readline() on a socket with a
> timeout is likely going to be to close it. Anything else does not
> make much sense. (someone may try, but really they're writing their
> I/O code wrong if they are using a socket timeout a poor form of task
> switching ;)
That's my opinion too. So, instead, of doing the above surgery inside
the IO stack, the SocketIO layer could detect the timeout and disallow
further access. What do you think?
|
msg126197 - (view) |
Author: Ross Lagerwall (rosslagerwall) |
Date: 2011-01-13 20:25 |
> That complicates things quite a bit,
> especially given that it has to be grafted on at least two layers of the
> IO stack (the raw IO layer, and the buffered IO layer).
Also the TextIO layer I think.
> That's my opinion too. So, instead, of doing the above surgery inside
> the IO stack, the SocketIO layer could detect the timeout and disallow
> further access. What do you think?
So after a timeout occurs the file-object basically becomes worthless? Would it make sense to automatically call the close method of the file-object after this occurs?
|
msg126198 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-01-13 20:34 |
> > That's my opinion too. So, instead, of doing the above surgery inside
> > the IO stack, the SocketIO layer could detect the timeout and disallow
> > further access. What do you think?
>
> So after a timeout occurs the file-object basically becomes worthless?
> Would it make sense to automatically call the close method of the
> file-object after this occurs?
Actually, we only need to forbid further reads (writes would always
work). I think we should still let the user call the close method
themselves.
|
msg126228 - (view) |
Author: Ross Lagerwall (rosslagerwall) |
Date: 2011-01-14 05:21 |
Attached patch disallows further reads after a timeout.
|
msg126296 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-01-14 19:51 |
Looks good and simple enough. I would probably shift the timeout test after the closed test, but that's almost a detail.
|
msg126297 - (view) |
Author: David Beazley (dabeaz) |
Date: 2011-01-14 20:00 |
Just wanted to say that I agree it's nonsense to continue reading on a socket that timed out (I'm not even sure what I might have been thinking when I first submitted this bug other than just experimenting with edge cases of the socket interface). It's still probably good to precisely specify what the behavior is in any case.
|
msg129468 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-02-25 23:15 |
Committed in r88622 (3.3) and r88623 (3.2). The 2.7 implementation is too different for the patch to apply, so if you want to fix it too, feel free to upload a patch. Thank you!
|
msg146025 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2011-10-20 16:27 |
This patch has caused a non-trivial regression between 3.2 and 3.2.1. The scenario in which I observed it is poplib. I create a POP3 connection with a timeout. At one point in its processing, poplib is reading lines until it gets a line '.\r\n', at which point the transaction is complete and it returns data to the caller. If the pop server fails to terminate the transaction, we get a timeout on the read. However, the POP server may still be alive, it may just have failed to close the transaction (servers have been observed in the wild that do this[*]). Before this patch, one could catch the socket.timeout and recover from the failed transaction (loosing the transaction data, but that's OK because the transaction was incomplete...it would be better to get the partial transaction, but that's a poplib issue, not a socket issue). One could then continue processing, sending new transactions to the POP server and getting responses. After the patch, once the socket error is raised there is no way to continue poplib processing short of tearing down the connection and rebuilding it, and restarting the POP processing from the beginning.
Now, this is clearly an abnormal situation (a POP server randomly not completing its transactions), but it was observed in the wild, and does represent a regression. I think that Antoine's idea of making readline functional despite timeouts was the better approach.
Also note that Antoine's change to the makefile documentation is wrong with this patch in place, since a timeout invalidates the makefile rather than just "leaving the internal buffers in an inconsistent state".
Backing out this patch would probably be better than leaving it in place, if a better fix can't be found.
[*] The regression was detected testing against a test POP server designed to exhibit defective behaviors that have been observed over the years by the maintainers of the test server. I can't point to specific existing servers that exhibit the broken behavior, but it did happen in the past and no doubt someone will write a buggy POP server that has the same broken behavior some time in the future as well.
|
msg146038 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-10-20 17:38 |
> One could then continue processing, sending new transactions to the POP
> server and getting responses.
That's optimistic. You don't know how much data has been lost in readline(). Sure, again your test server, it happens to work :) But again other kinds of failing servers, your protocol session would end up confused.
So the only robust option is the following:
> tearing down the
> connection and rebuilding it, and restarting the POP processing from
> the beginning
|
msg146040 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2011-10-20 18:01 |
I don't think it is optimistic. The poplib transaction pattern is: send a command, get a response. If the response is not properly terminated, throw it away. Send a new command, get a response. There's no ambiguity there. In addition, this is a common tcp client-server model, so I think it applies more widely than just poplib.
Please note that the timeout is *not* because the socket data transmission has timed out and data was lost in transit. There are no partially filled readline buffers in this scenario. The timeout is because the client is waiting for a *line* of data that the server never sends. Again, this is likely to be a common failure mode in tcp client/server applications, and to my mind is exactly what the timeout parameter to the constructor is most useful for.
|
msg146048 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2011-10-20 20:33 |
If the server failed to close a transaction the protocol stream is over
unless you mime relying on hope and luck. Poplib has a nasty set of server
implementation bugs to work around here.
Readline as defined today no longer suits its needs but I still strongly
believe the behavior of shutting reading down after a timeout is a good one.
One thing that would solve your common case: don't shut down reading on a
readline timeout if zero data was received into the internal line buffer.
Readline could indicate this by modifying the timeout exception being raised
to indicate if it can recover or not. The flag should only be set if it was
unrecoverable.
On Oct 20, 2011 11:01 AM, "R. David Murray" <report@bugs.python.org> wrote:
>
> R. David Murray <rdmurray@bitdance.com> added the comment:
>
> I don't think it is optimistic. The poplib transaction pattern is: send a
> command, get a response. If the response is not properly terminated, throw
> it away. Send a new command, get a response. There's no ambiguity there.
> In addition, this is a common tcp client-server model, so I think it
> applies more widely than just poplib.
>
> Please note that the timeout is *not* because the socket data transmission
> has timed out and data was lost in transit. There are no partially filled
> readline buffers in this scenario. The timeout is because the client is
> waiting for a *line* of data that the server never sends. Again, this is
> likely to be a common failure mode in tcp client/server applications, and to
> my mind is exactly what the timeout parameter to the constructor is most
> useful for.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue7322>
> _______________________________________
>
|
msg146049 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2011-10-20 20:40 |
Your suggestion sounds good to me.
I still think that it is a common failure mode in a client server transaction for the server to fail to send a (complete) line that the client is expecting, and vice versa, requiring a timeout, but not necessarily a "restart from scratch". Often the client/server protocol has a useful checkpoint to restart from short of start from scratch. In the case of many protocols, that would be "client issues a new command".
|
msg253437 - (view) |
Author: David Beazley (dabeaz) |
Date: 2015-10-25 21:28 |
This bug is still present in Python 3.5, but it occurs if you attempt to do a readline() on a socket that's in non-blocking mode. In that case, you probably DO want to retry at a later time (unlike the timeout case).
|
msg253530 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2015-10-27 11:52 |
IMO it might make sense in some cases to disallow subsequent reading from a buffered socket reader (or probably any BufferedReader) that raises an exception (not just a timeout). But the restriction seems unnecessary for unbuffered raw readers, and it also seems to go against the “consenting adults” philosophy for David Murray’s test server case.
David Beazley: For non-blocking sockets, the documentation currently says “the socket must be in blocking mode”. I’m not sure why that restriction is necessary; maybe it could be lifted at least for raw unbuffered streams.
Maybe you could make an argument for caching the partial data in the BufferedReader if a timeout (or no more non-blocking data, or other exception) occurs. The biggest problem is that it could mean storing more than the normal buffer size. I would think this would be a new feature (for 3.6+) rather than a behavioural bug fix though.
And see also Issue 13322 about inconsistencies with buffered reading non-blocking streams in general.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:54 | admin | set | github: 51571 |
2015-10-27 11:52:45 | martin.panter | set | nosy:
+ martin.panter messages:
+ msg253530
|
2015-10-25 21:28:19 | dabeaz | set | messages:
+ msg253437 |
2011-10-20 20:40:43 | r.david.murray | set | messages:
+ msg146049 |
2011-10-20 20:33:31 | gregory.p.smith | set | messages:
+ msg146048 |
2011-10-20 20:03:47 | flox | set | nosy:
+ flox
|
2011-10-20 18:01:14 | r.david.murray | set | messages:
+ msg146040 |
2011-10-20 17:38:34 | pitrou | set | messages:
+ msg146038 |
2011-10-20 16:27:47 | r.david.murray | set | status: closed -> open
nosy:
+ r.david.murray messages:
+ msg146025
resolution: fixed -> stage: resolved -> |
2011-02-25 23:15:19 | pitrou | set | status: open -> closed versions:
+ Python 3.2 nosy:
loewis, beazley, gregory.p.smith, amaury.forgeotdarc, roysmith, pitrou, ned.deily, dabeaz, rosslagerwall messages:
+ msg129468
resolution: fixed stage: patch review -> resolved |
2011-01-14 20:00:50 | dabeaz | set | nosy:
loewis, beazley, gregory.p.smith, amaury.forgeotdarc, roysmith, pitrou, ned.deily, dabeaz, rosslagerwall messages:
+ msg126297 |
2011-01-14 19:51:18 | pitrou | set | nosy:
loewis, beazley, gregory.p.smith, amaury.forgeotdarc, roysmith, pitrou, ned.deily, dabeaz, rosslagerwall messages:
+ msg126296 |
2011-01-14 05:21:00 | rosslagerwall | set | files:
+ i7322.patch nosy:
loewis, beazley, gregory.p.smith, amaury.forgeotdarc, roysmith, pitrou, ned.deily, dabeaz, rosslagerwall messages:
+ msg126228
|
2011-01-13 20:34:58 | pitrou | set | nosy:
loewis, beazley, gregory.p.smith, amaury.forgeotdarc, roysmith, pitrou, ned.deily, dabeaz, rosslagerwall messages:
+ msg126198 |
2011-01-13 20:25:02 | rosslagerwall | set | nosy:
loewis, beazley, gregory.p.smith, amaury.forgeotdarc, roysmith, pitrou, ned.deily, dabeaz, rosslagerwall messages:
+ msg126197 |
2011-01-13 18:06:31 | pitrou | set | nosy:
loewis, beazley, gregory.p.smith, amaury.forgeotdarc, roysmith, pitrou, ned.deily, dabeaz, rosslagerwall messages:
+ msg126192 |
2011-01-13 17:47:14 | gregory.p.smith | set | nosy:
loewis, beazley, gregory.p.smith, amaury.forgeotdarc, roysmith, pitrou, ned.deily, dabeaz, rosslagerwall messages:
+ msg126190 |
2011-01-13 14:27:50 | pitrou | set | nosy:
loewis, beazley, gregory.p.smith, amaury.forgeotdarc, roysmith, pitrou, ned.deily, dabeaz, rosslagerwall messages:
+ msg126172 |
2011-01-13 14:24:09 | pitrou | set | versions:
+ Python 3.3, - Python 3.1, Python 2.7, Python 3.2 nosy:
+ amaury.forgeotdarc
messages:
+ msg126171
stage: needs patch -> patch review |
2011-01-13 14:10:29 | dabeaz | set | nosy:
+ dabeaz messages:
+ msg126170
|
2011-01-13 09:27:59 | rosslagerwall | set | files:
+ issue7322_new.patch nosy:
loewis, beazley, gregory.p.smith, roysmith, pitrou, ned.deily, rosslagerwall messages:
+ msg126163
|
2010-12-31 10:47:38 | pitrou | set | assignee: gregory.p.smith -> messages:
+ msg124962 nosy:
loewis, beazley, gregory.p.smith, roysmith, pitrou, ned.deily, rosslagerwall |
2010-12-31 08:26:53 | rosslagerwall | set | nosy:
+ pitrou, loewis
|
2010-12-31 08:15:00 | rosslagerwall | set | files:
+ 7322_v1.patch
nosy:
+ rosslagerwall messages:
+ msg124957
keywords:
+ patch |
2010-11-21 08:45:24 | ned.deily | set | messages:
+ msg121885 |
2010-11-21 02:36:03 | roysmith | set | messages:
+ msg121839 |
2010-11-21 02:05:01 | ned.deily | set | nosy:
+ ned.deily messages:
+ msg121833
|
2010-11-20 21:14:11 | roysmith | set | files:
+ test-issue7322.py
messages:
+ msg121779 |
2010-11-20 21:08:57 | roysmith | set | nosy:
+ roysmith messages:
+ msg121777
|
2010-07-11 10:25:15 | BreamoreBoy | set | stage: needs patch versions:
+ Python 3.1, Python 2.7, Python 3.2, - Python 2.6 |
2009-11-14 20:16:10 | gregory.p.smith | set | priority: normal assignee: gregory.p.smith
nosy:
+ gregory.p.smith |
2009-11-14 17:16:31 | beazley | create | |