This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: ftplib doesn't check close status after sending file
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: ethan.furman, exarkun, giampaolo.rodola, gregory.p.smith, loewis, mikecmcleod, nagle, pitrou
Priority: normal Keywords: patch

Created on 2010-10-26 19:04 by nagle, last changed 2022-04-11 14:57 by admin.

Pull Requests
URL Status Linked Edit
PR 30747 closed mikecmcleod, 2022-01-21 13:23
Messages (10)
msg119638 - (view) Author: John Nagle (nagle) Date: 2010-10-26 19:04
"ftplib" doesn't check the status on socket close after writing.  This can lead to silently truncated files when sending files with "ftplib".

A report of truncated files on comp.lang.python led me to check the source code. 

The "ftplib" module does sending by calling sock_sendall in "socketmodule.c". That does an OS-level "send", and once everything has been sent, returns.

But an OS-level socket send returns when the data is queued for sending, not when it is delivered.  Only when the socket is closed, and the close status checked, do you know if the data was delivered. There's a final TCP close handshake that occurs when close has been called at both ends, and only when it completes successfully do you know that the data has been delivered.

At the socket level, this is performed by "shutdown" (which closes the connection and returns the proper network status information), or by "close".

Look at sock_close in "socketmodule.c".  Note that it ignores the return status on close, always returns None, and never raises an exception.  As the Linux manual page for "close" says:
"Not checking the return value of close() is a common but nevertheless serious programming error. It is quite possible that errors on a previous write(2) operation are first reported at the final close(). Not checking the return value when closing the file may lead to silent loss of data."

"ftplib", in "storlines" and "storbinary", calls "close" without checking the status or calling "shutdown" first.  So if the other end disconnects after all data has been queued locally but not sent, received and acknowledged, the sender will never know.
msg119652 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-26 21:59
It sounds more like an issue in socket.close(), right?
msg119658 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-10-26 23:18
I don't quite understand the problem. How exactly do you manage to lose data? The ftp server should send a 226 status code to indicate success over the control connection, and presumably it will do so only after receiving the proper TCP shutdown from its operating system. So regardless of any error handling that ftplib or .close() may fail to perform, I doubt that the problem can actually arise.

As for error checking in close(): I think this is a minor issue for this discussion. Usually, close() will succeed *without* sending all data to the remote system. So looking at the close() status will *not* tell you whether all data has been delivered. You would have to use the SO_LINGER socket option for that.

Even with SO_LINGER turned on, it's pretty pointless to check the close status. While we would be able to tell whether the data has arrived at the remote system, we can't know whether the ftp server has actually read them out of the socket, and written them to disk. So we absolutely need the ftp protocol status to determine the outcome of a STOR (say).
msg119669 - (view) Author: John Nagle (nagle) Date: 2010-10-27 01:39
Proper behavior for ftplib when sending is to send all desired data, then call "sock.shutdown(socket.SHUT_RDWR)".  This indicates that no more data will be sent, and blocks until the receiver has acknowledged all their data. 

"socketmodule.c" handles this right.  "shutdown" is called on the socket, and the return value is checked.  If the return value is negative, an error handler is returned.  Compare the handling in "close".  

FTP send is one of the few situations where this matters, because FTP uses the close of the data connection to indicate EOF.
msg119685 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2010-10-27 10:26
> Proper behavior for ftplib when sending is to send all desired data, 
> then call "sock.shutdown(socket.SHUT_RDWR)".  This indicates that no 
> more data will be sent, and blocks until the receiver has acknowledged 
> all their data. 

I'm not sure about this. Such a requirement should be respected by both peers and AFAICR there's no FTP-related RFC which explicitly states that shutdown() should be used.

http://www.rfc-archive.org/getrfc.php?rfc=4217 chapter 12.4 should reflect the same use case we're talking about: a client sending a file (STOR).
In the example only close() is used, which perhaps should lead this discussion to question whether socket's close() method is implemented properly, as your linux man quote suggests.
msg119720 - (view) Author: John Nagle (nagle) Date: 2010-10-27 17:04
Re: "Such a requirement should be respected by both peers and AFAICR there's no FTP-related RFC which explicitly states that shutdown() should be used.".

That's because the FTP spec (RFC 959) talks about TCP in reference to the TCP specification, not the Berkeley Sockets implementation.  "shutdown" is a Berkeley Sockets interface feature, and isn't formally standardized.  Which is why it isn't mentioned in RFCs. 

The issue here is that an FTP data connection send is one of the few places where 1) the sender is indicating an EOF by closing the connection, and 2) the sender cares about complete delivery to the receiver.  (HTTP servers don't care if the client disconnects before transfer is complete.)

The TCP spec (the original RFC 793) says "A TCP will reliably deliver all buffers SENT before the connection was CLOSED so a user who expects no data in return need only wait to hear the connection was CLOSED successfully to know that all his data was received at the destination TCP."  That's the way it was supposed to work when the FTP spec was written.  In other words, "close" is supposed to wait for completion of the TCP close handshake.

Today, it's common to close connections to abandon them, in which case there's no need to wait for the close handshake to complete.  This resulted in a split at the socket level - there's "close", "SO_LINGER" to affect the wait for the close handshake, and an optional timeout. Then there's "shutdown" for when you explicitly care about what's happening during close.  

Python's socket implementation is written with "close" as a "don't care" operation.  Since many Python closes happen implicitly, when a reference count goes to 0, this is appropriate; raising an exception during deallocation generally surprises the user. So it's implicit in the Python interface that, if you're using connection close to signal EOF and care about delivery, you should use "shutdown".

So I'd put a shutdown call in "ftplib" at the end of each data connection send.  That will block at the end of the transfer and raise an exception (I think that's the way the socket library is coded) if the close handshake doesn't finish cleanly.
msg119788 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-10-28 14:14
> Proper behavior for ftplib when sending is to send all desired data,
> then call "sock.shutdown(socket.SHUT_RDWR)".  This indicates that no
> more data will be sent, and blocks until the receiver has
> acknowledged all their data.

I think you misunderstand. Calling shutdown does *not* block
until the receiver has acknowledged all data. It just put a
FIN packet into the send queue.

> FTP send is one of the few situations where this matters, because FTP
> uses the close of the data connection to indicate EOF.

Not only. It also uses the server response on the control connection
to indicate that all data has been received. Relying on successful
shutdown is both insufficient and unnecessary.
msg406207 - (view) Author: mike mcleod (mikecmcleod) * Date: 2021-11-12 12:43
I would like to help on this issue. I understand the arguments here but it has been a lone time since this was raised and there does not seem to be any further issues discussed or support for this issue.
msg409157 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-12-24 20:52
[pasting in my investigation that I replied with on a mailing list:]

The possible problem does appear real but it likely not frequently observed and is something most people reading the code wouldn't see as it's rather esoteric:

https://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable
via https://stackoverflow.com/questions/8874021/close-socket-directly-after-send-unsafe describes the scenario.

So does this apply in Python?

Apparently.  Our socket module close() is simply calling close as expected.
 https://github.com/python/cpython/blob/main/Modules/socketmodule.c#L3132 ->
 https://github.com/python/cpython/blob/main/Modules/socketmodule.c#L443

Our ftplib code that implicitly calls close on the data connection when exiting the
 `        with self.transfercmd(cmd, rest) as conn: ` context manager
https://github.com/python/cpython/blob/main/Lib/ftplib.py#L498`

Triggers a close() without a prior shutdown(socket.SHUT_RDWR) + read() -> EOF or timeout dance.

In most protocols this isn't a big deal. ftp being so old as to rely solely on the TCP connection itself is the outlier.

How often does this happen in practice?  I don't know.  I haven't heard of it happening, but how many people honestly use ftplib to send data between operating systems where a socket close triggering a TCP RST rather than FIN on the sender might be more likely in the past 20 years?  I suspect not many.  

The code can be improved to prevent it.  But I don't believe it'll be feasible to construct a real world not-mock-filled regression test that would fail without it.

Potential solution:
  Do the shutdown+recv EOF dance as the final step inside of the storbinary and storlines `with self.transfercmd as conn` blocks.

Technically speaking socket.socket.shutdown() is conditionally compiled but I don't know if any platforms lacking the socket shutdown API exist anymore (my guess is that conditional compilation for shutdown is a leftover from the 1990s). If so, a "if hasattr(conn, 'shutdown'):" conditional before the added logic would suffice.

Looking at various ftp client source code (netkit ftp & netbsd ftp - both bsd derivatives) as well as a modern gonuts golang one I do not see them explicitly doing the shutdown dance.  (I didn't dive in to figure out if it is hidden within their flclose() or conn.Close() implementations as that'd be surprising)
msg410644 - (view) Author: mike mcleod (mikecmcleod) * Date: 2022-01-15 12:47
Working.. should be able to create pull request soon. Note part of suggestions include using SIOCOUTQ, but this does not have an equivalent for windows. And as Murphy's law goes this is likely to be where the problem is!
History
Date User Action Args
2022-04-11 14:57:07adminsetgithub: 54411
2022-01-21 13:23:44mikecmcleodsetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request28934
2022-01-15 12:47:23mikecmcleodsetmessages: + msg410644
2021-12-24 20:53:29gregory.p.smithsetstage: test needed -> needs patch
2021-12-24 20:53:20gregory.p.smithsetversions: + Python 3.9, Python 3.10, Python 3.11, - Python 3.1, Python 2.7, Python 3.2
2021-12-24 20:52:45gregory.p.smithsetnosy: + gregory.p.smith
messages: + msg409157
2021-11-19 17:05:35ethan.furmansetnosy: + ethan.furman
2021-11-12 12:43:44mikecmcleodsetnosy: + mikecmcleod
messages: + msg406207
2010-10-28 14:14:32loewissetmessages: + msg119788
2010-10-27 17:04:14naglesetmessages: + msg119720
2010-10-27 10:26:27giampaolo.rodolasetmessages: + msg119685
2010-10-27 01:39:00naglesetmessages: + msg119669
2010-10-26 23:18:09loewissetmessages: + msg119658
2010-10-26 21:59:15pitrousetnosy: + exarkun, loewis, pitrou
messages: + msg119652
2010-10-26 20:36:04r.david.murraysetnosy: + giampaolo.rodola
stage: test needed

versions: - Python 2.6, Python 2.5, Python 3.3
2010-10-26 19:04:48naglecreate