classification
Title: BaseTransport.close() does not trigger connection_lost()
Type: behavior
Stage: resolved
Components: asyncio
Versions: Python 3.4
process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To:
Nosy List: Sümer.Cip, anthonypjshaw, gvanrossum, vstinner, yselivanov
Priority: normal
Keywords:

Created on 2016-02-03 15:57 by Sümer.Cip, last changed 2019-11-19 21:09 by vstinner. This issue is now closed.

Messages (5)
msg259487 - Author: Sümer Cip (Sümer.Cip) Date: 2016-02-03 15:57
Hi all,

We have implemented a TCP server based on asyncio, and while running regression tests we intermittently see the following error:

1) The client connects to the server.
2) The client is closed ungracefully (without sending a FIN, e.g. by unplugging the cable).
3) We have a custom PING handler that sends a PING and waits for a PONG message.
4) After a while, the PING times out and we call close() on the Transport object.
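
For readers who want to experiment with this setup, here is a minimal sketch of a PING/PONG protocol of the kind described in the steps above. It is not the reporter's code: the class name `PingProtocol`, the `b"PING\n"` / `b"PONG"` wire format, the `interval` and `timeout` parameters, and the `lost` future are all assumptions made for illustration. The demo uses `socket.socketpair()` with a peer that never answers, which is only a rough stand-in for a client that vanished without sending a FIN.

```python
import asyncio
import socket


class PingProtocol(asyncio.Protocol):
    """Hypothetical server-side protocol: send PING, close if no PONG arrives."""

    def __init__(self, interval=10.0, timeout=5.0):
        # The factory is called from inside the running loop.
        self._loop = asyncio.get_running_loop()
        self.interval = interval
        self.timeout = timeout
        self.transport = None
        self.lost = self._loop.create_future()  # resolved in connection_lost()
        self._ping_handle = None
        self._timeout_handle = None

    def connection_made(self, transport):
        self.transport = transport
        self._ping_handle = self._loop.call_later(self.interval, self._send_ping)

    def _send_ping(self):
        self.transport.write(b"PING\n")
        self._timeout_handle = self._loop.call_later(self.timeout, self._on_timeout)

    def data_received(self, data):
        # Assumed framing: any chunk containing b"PONG" answers the last PING.
        if b"PONG" in data and self._timeout_handle is not None:
            self._timeout_handle.cancel()
            self._timeout_handle = None
            self._ping_handle = self._loop.call_later(self.interval, self._send_ping)

    def _on_timeout(self):
        # Step 4: no PONG within the timeout, so close the transport.
        self.transport.close()

    def connection_lost(self, exc):
        for handle in (self._ping_handle, self._timeout_handle):
            if handle is not None:
                handle.cancel()
        if not self.lost.done():
            self.lost.set_result(exc)


async def demo():
    loop = asyncio.get_running_loop()
    server_sock, client_sock = socket.socketpair()
    transport, protocol = await loop.create_connection(
        lambda: PingProtocol(interval=0.05, timeout=0.05), sock=server_sock
    )
    try:
        # The peer stays silent, so the PING times out and close() is called.
        exc = await asyncio.wait_for(protocol.lost, timeout=2.0)
        return exc, transport.is_closing()
    finally:
        client_sock.close()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the write buffer is empty when close() runs, so connection_lost() is expected to fire promptly; it shows the normal path, not the failure reported here.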

Most of the time this works fine, but occasionally connection_lost() NEVER gets called even though we call close() on the transport. Because the issue occurs so randomly, I don't have any asyncio logs for it. Can you think of any scenario that might lead to this?

It seems that there is outgoing data in the TCP buffer when this happens, which would explain why close() does not call connection_lost() immediately, but why it is never called is a mystery to me. Could it be the following?

1) We call close(); is_closing is set to True, and because there is outgoing data we return.
2) A subsequent write then raises ConnectionResetError, which calls _force_close(); but because is_closing was already set to True, connection_lost() does not get called.
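
The suspected sequence can be written down as a toy model. This is purely an illustration of the hypothesis in the two steps above: `ToyTransport` is a hypothetical class, it shares no code with asyncio, and it makes no claim about how asyncio's transports actually behave.

```python
class ToyTransport:
    """Hypothetical model of the suspected race; NOT asyncio's implementation."""

    def __init__(self, on_lost):
        self._buffer = bytearray()
        self._closing = False
        self._on_lost = on_lost  # stands in for protocol.connection_lost

    def write(self, data):
        self._buffer += data

    def close(self):
        if self._closing:
            return
        self._closing = True
        # Step 1: with pending outgoing data, return without notifying.
        if not self._buffer:
            self._on_lost(None)

    def force_close(self, exc):
        # Step 2 (the suspected flaw): the closing flag short-circuits the
        # forced close, so the protocol is never told the connection is gone.
        if self._closing:
            return
        self._closing = True
        self._buffer.clear()
        self._on_lost(exc)


def run_suspected_sequence():
    events = []
    transport = ToyTransport(events.append)
    transport.write(b"pending")              # outgoing data still buffered
    transport.close()                        # step 1
    transport.force_close(ConnectionResetError())  # step 2
    return events


if __name__ == "__main__":
    print(run_suspected_sequence())
```

In this model the callback is never invoked once data is buffered before close(), which is the behavior the hypothesis would predict.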

The above is just a rough idea and is probably not what is happening; I have not spent much time on the code.

Thanks,
msg341606 - Author: anthony shaw (anthonypjshaw) (Python triager) Date: 2019-05-06 19:02
This issue was never responded to. Are you still having this issue? Which version of CPython are you using, and can you please provide steps to reproduce the problem?
msg341690 - Author: Sümer Cip (Sümer.Cip) Date: 2019-05-07 07:22
I do not know whether I still have the issue, since I have circumvented the problem. I was using Python 3.4, which I think had one of the earliest asyncio implementations. It can be reproduced as follows:
1) There are many active TCP connections to the asyncio server (300-400 of them).
2) One of the clients closes its connection ungracefully (without sending a FIN).
3) We have a PING/PONG mechanism in the server (similar to HTTP keep-alive); we call transport.close() if a PONG is not received within an interval.
The connection_lost() event never gets called for that connection. This does not happen every time; it is a random issue. The key is to disconnect the client without sending a FIN while there is data in the outgoing buffer for that client.
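
One defensive pattern for this situation is to follow close() with abort() if connection_lost() has not been seen after a grace period. The sketch below is an assumption-laden illustration, not the workaround the reporter used (which is not described in this thread): `close_with_deadline`, `is_lost`, and `grace` are hypothetical names. It relies only on the documented `transport.close()`, `transport.abort()`, and `loop.call_later()` calls. Whether it would have helped on Python 3.4 is unknown.

```python
import asyncio


def close_with_deadline(transport, is_lost, grace=5.0):
    """Close gracefully, then abort if connection_lost() has not fired.

    transport: an asyncio write transport.
    is_lost:   zero-argument callable returning True once the protocol's
               connection_lost() has run (hypothetical hook).
    grace:     seconds to wait before falling back to abort().
    Must be called from code running inside the event loop.
    """
    loop = asyncio.get_running_loop()
    transport.close()  # graceful: buffered data is flushed first

    def _abort_if_stuck():
        if not is_lost():
            # abort() discards buffered data and closes immediately.
            transport.abort()

    # Return the timer handle so the caller can cancel it if desired.
    return loop.call_later(grace, _abort_if_stuck)
```

Here `is_lost` can be any zero-argument callable, for example a lambda that checks a flag the protocol sets in its connection_lost() method. Note that abort() drops any data still waiting to be sent, which is acceptable only when the peer is already presumed dead.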

That is all I have.

Thanks!
msg356982 - Author: Sümer Cip (Sümer.Cip) Date: 2019-11-19 17:45
Closing the issue seems like a good idea, as nobody else seems to have spotted a similar issue and I have only been able to reproduce it in Python 3.4.

Just for future reference: the uncommon thing is that the server I was using is a TCP game server holding long-running connections, as opposed to short-lived HTTP connections. Maybe there is a rare issue at the core, but since it happens so randomly it is hard to pin down.
msg356990 - Author: STINNER Victor (vstinner) (Python committer) Date: 2019-11-19 21:09
If a bug cannot be reproduced, it cannot be fixed. So I close the issue ;-)
History
Date                 User           Action  Args
2019-11-19 21:09:04  vstinner       set     status: open -> closed; resolution: out of date; messages: + msg356990; stage: resolved
2019-11-19 17:45:37  Sümer.Cip      set     messages: + msg356982
2019-05-07 07:22:29  Sümer.Cip      set     messages: + msg341690
2019-05-06 19:02:39  anthonypjshaw  set     nosy: + anthonypjshaw; messages: + msg341606
2016-02-03 15:57:36  Sümer.Cip      create