New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change os.fsync() to support physical backing store syncs #56086
Comments
bpo-11277 revealed that Mac OS X fsync() does not always |
Oh, it remembers a long story around ext3/ext4 and write barrier, with a specific problem in Firefox with SQLite. http://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man2/fsync.2.html http://shaver.off.net/diary/2008/05/25/fsyncers-and-curveballs/ http://lwn.net/Articles/283161/ http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2008-September/001356.html http://lists.mindrot.org/pipermail/portawiki-discuss/2005-November/000002.html |
Traditionally the functions in os are thin wrappers around OS functions, this patch changes that for os.fsync. Two nits wrt. the patch itself:
I'm -1 on merging this patch though, as the fsync function on OSX behaves just like the one on Linux, the fnctl option adds a stronger guarantee in that it forces a flush of the buffers on the storage device as well. The manpage for fsync on a linux host says: In case the hard disk has write cache enabled, the data may not really be on permanent storage when fsync/fdatasync return. Adding the F_FULLSYNC option to os.fsync would be fine though (if it isn't already implemented) The linux |
I know that POSIX makes no guarantee regarding durable writes, but IMHO that's definitely wrong, in the sense that when one calls fsync, he expects the data to be committed to disk and be durable. |
I think so, too.
If a platform offers the opportunity to actually implement the (And the single question on apple is simply what to do otherwise |
See <http://www.kernel.org/doc/man-pages/online/pages/man2/fsync.2.html\> for linux' behavior, in particular: linux doesn't guarantee that data gets writting to the disk when you call fsync, only that the data gets pushed to the storage device. This is the same behavior as fsync on OSX, and OSX also has a second API that provides stronger guarantees. With your patch it is no longer possible to call the C function fsync on OSX, even though it is good enough for a lot of use cases. As os.fcntl already supports F_FULLSYNC I see no good reason to change the implementation of os.fsync on OSX. BTW. The need to use F_FULLSYNC in bpo-11277 might be a bug in OSX itself, have you filed an issue in Apple's tracker? |
A different approach for this issue is to document fcntl.fcntl(fd, fcntl.F_FULLFSYNC) in fsync() documentation. |
Agreed with Ronald. If Apple thinks fsync() shouldn't flush the hardware cache, there's no reason for us to override that. |
Barriers are now enable by default in EXT4, and Theodore Tso has been favourable to that for quite some time now: As for OS-X, this is definitely a bug (I mean, having to call fsync before mmap is a huge bug in itself). |
On Wed, Apr 20, 2011 at 12:16:39PM +0000, Ronald Oussoren wrote:
I don't see it that way for multiple reasons. One of it is that Python states that it "is a beginner language", (Just think of an african kid with a sponsored laptop and possibly Then Python is a high-level language. It's true that there are And then - at least half a dozen of programmers with altogether Apple? No, i'm looking forward to return to a private BSD/Linux |
1 similar comment
On Wed, Apr 20, 2011 at 12:16:39PM +0000, Ronald Oussoren wrote:
I don't see it that way for multiple reasons. One of it is that Python states that it "is a beginner language", (Just think of an african kid with a sponsored laptop and possibly Then Python is a high-level language. It's true that there are And then - at least half a dozen of programmers with altogether Apple? No, i'm looking forward to return to a private BSD/Linux |
8-} On Wed, Apr 20, 2011 at 12:16:39PM +0000, Antoine Pitrou wrote:
Apple thinks (KernelProgramming.pdf, BPFileSystem.pdf):
POSIX says:
It seems someone needs a hand. |
Ronald Oussoren wrote:
To show you that i'm not unteachable i'll attach a patch which (The first time that i use PyArg_xy plus plus - please review. |
I'm convinced. And i've rewritten the patch.
|
P.S.: dito Linux and NetBSD. I've reused preprocessor tests |
11877.2.diff: On Mac OS X, fsync(fd, full_sync=True); fsync(fd) do a full sync twice. You should restore the old file flags at fsync() exit. Or this surprising behaviour should be documented. |
Ok, 11877.3.diff uses either-or. |
I'm -10 on sync_file_range on Linux:
|
Charles-Francois Natali wrote:
I just looked at I've also "search"ed for the called filemap_write_and_wait_range() I will wait before i update the patch though, just in case some |
I just double-checked, and indeed, fsync does flush the disk cache |
I'll attach 11877.4.diff:
|
A couple comments: static PyObject *
posix_fsync(PyObject *self, PyObject *args, PyObject *kwargs)
{
PyObject *retval = NULL;
auto PyObject *fdobj;
auto int full_fsync = 1; Why are you using the "auto" storage class specifier? I know that "explicit is better than implicit", but I really can't see a good reason for using it here (and anywhere, see http://c-faq.com/decl/auto.html). # ifdef __APPLE__
res = fcntl(fd, F_FULLFSYNC);
# endif
# ifdef __NetBSD__
res = fsync_range(fd, FFILESYNC|FDISKSYNC, 0, 0);
# endif Since __APPLE__ and __NetBSD__ are exclusive, you could use something like Do you really need to use goto for such a simple code? |
On Sat, 7 May 2011 14:20:51 +0200, Charles-François Natali wrote:
You're right.
Yup. I'm doing what is happening for real in (x86) assembler.
Yup. Ok, this is more complicated. The reason is that my funs ret
jleave: [maybe only, if large: possibly "predict-false" code blocks] [possible error jumps here We're prowd of that. N(ot)Y(et)D(ead) is actually pretty cool, But of course i can change this (in C) to simply use return, this Thanks for looking at this, by the way. :) |
On Sat, 7 May 2011 14:20:51 +0200, Charles-François Natali wrote:
Yes, you're right, i'll update the patch accordingly. �-- |
11877.5.diff incorporates all changes suggested by I'm dropping the old stuff. And i think this is the final version So, thanks to Ronald, i detected that also NetBSD introduced |
Steffen, I don't understand your comment about "auto". Declaring variables as "auto" is not necessary in C code and not used anywhere else in Python's source code. |
Ronald Oussoren wrote (2011-05-08 10:33+0200):
Well, as long as i can keep my underwear all is fine. |
Steffen, you changed the default to doing a "full sync" in your last patch: while I was favoring that initially, I now agree with Ronald and Antoine, and think that we shouldn't change the default behaviour. The reason being that Apple and NetBSD folks should know what they're doing, and that there might be some subtle side effects (see for example my comment on sync_file_range on Linux). Also, given how many bugs you seem to encouter on OS-X, it's probably better to stick to a syscall known to be safe instead of silently replacing it by a maybe-not-so-tested syscall. Finally, depending on the workload, it could have a significant performance impact. People requiring write durability will probably manage to find this full_sync parameter. |
I don't agree with you and i don't believe it is implemented like
Changed.
Changed all through.
This sounds logical. I've changed the doc in os.rst so that it
:-) |
Ouch, ouch, ouch!! Dropping .5 and .6 - and sorry for the noise. |
Just adding more notes on that by reactivating one of haypo's
(Don't ask me why i'm adding this note though. |
Calling fsync on a file descriptor referring to a tty doesn't make much sense.
On Linux, this fails with EINVAL:
$ python -c 'import os; os.fsync(1)'
Traceback (most recent call last):
File "<string>", line 1, in <module>
OSError: [Errno 22] Invalid argument So if the full sync fails on ttys, it shouldn't be a problem: since By the way, it's "appropriate", not "approbiate". You made the same |
[.]
Sorry, i didn't know that. Mac OS X (2.5 and 2.6 Apple shipped): 21:43 ~/tmp $ python2.5 -c 'import os; os.fsync(1)'; echo $? |
Strong stuff. F_FULLFSYNC Does the same thing as fsync(2) then asks the drive to and i thought
But who knows if that fcntl will fail on some non-noted fsys?
8~I That was not a typo. Thanks. |
Why?
Why not, since that's what the kernel returns? |
Oh yes (first replaces os.fsync() with an in-python wrapper): 18:12 ~/tmp $ ll mail real 0m4.638s real 0m1.228s (I'm using the first one.) |
(I'm not sure Rietveld sent the message so I post it here, sorry in case of duplicate). Steffen, I've made a quick review of your patch, in case you're interested. Concerning your benchmark: # time find /lib -type f -exec cat {} \; > /dev/null # time find /lib -type f -exec cat {} \; > /dev/null # echo 3 > /proc/sys/vm/drop_caches # time find /lib -type f -exec cat {} \; > /dev/null |
Thank you, thank you, thank you. Well and about dropping the fsync() in case the fcntl() fails with NOTES But Apple is *soooo*speeeeciaaaal* again. Happy birthday. |
I've dropped wet-her!
I was able to add two tests as an extension to what is yet tested
Yes..
..but i've used copy+paste here.
Ok, i've renamed full_fsync to full_sync. |
I'm not sure I'm always understanding your messages well (I'm not a |
I don't like the API because it gives a different behaviour depending on the OS and I don't see how to check that the function does really a full sync. I would prefer a new option os.fullsync() function which is like your os.fsync(fd, full_sync=False), except that the function doesn't exist if the OS doesn't implement it. |
Excusing myself seems to be the only "probates Mittel". |
See also issue bpo-12102. |
diff --git a/Doc/library/os.rst b/Doc/library/os.rst +.. function:: fullfsync(fd)
diff --git a/Lib/test/test_os.py b/Lib/test/test_os.py class TestInvalidFD(unittest.TestCase):
singles = ["fchdir", "dup", "fdopen", "fdatasync", "fstat",
- "fstatvfs", "fsync", "tcgetpgrp", "ttyname"]
+ "fstatvfs", "fsync", "fullfsync", "tcgetpgrp", "ttyname"]
#singles.append("close")
- #We omit close because it doesn'r raise an exception on some platforms
+ # We omit close because it doesn't raise an exception on some platforms
def get_single(f):
def helper(self):
- if hasattr(os, f):
+ if hasattr(os, f):
self.check(getattr(os, f))
return helper
for f in singles:
diff --git a/Modules/posixmodule.c b/Modules/posixmodule.c
--- a/Modules/posixmodule.c
+++ b/Modules/posixmodule.c
@@ -174,6 +174,11 @@
#endif /* ! __IBMC__ */ #ifndef _MSC_VER
+ /* os.fullfsync()? */
+# if (defined HAVE_FSYNC && ((defined __APPLE__ && defined F_FULLFSYNC) || \
+ (defined __NetBSD__ && defined FDISKSYNC)))
+# define PROVIDE_FULLFSYNC
+# endif #if defined(__sgi)&&_COMPILER_VERSION>=700
/* declare ctermid_r if compiling with MIPSPro 7.x in ANSI C mode
@@ -2129,6 +2134,41 @@
{
return posix_fildes(fdobj, fsync);
}
+
+# ifdef PROVIDE_FULLFSYNC
+PyDoc_STRVAR(fullfsync__doc__,
+"fullfsync(fd)\n\n"
+"force write of file buffers to disk, and the flush of disk caches\n"
+"of the file given by file descriptor fd.");
+
+static PyObject *
+fullfsync(PyObject *self, PyObject *fdobj)
+{
+ /* See issue 11877 discussion */
+ int res, fd = PyObject_AsFileDescriptor(fdobj);
+ if (fd < 0)
+ return NULL;
+ if (!_PyVerify_fd(fd))
+ return posix_error();
+
+ Py_BEGIN_ALLOW_THREADS
+# if defined __APPLE__
+ /* F_FULLFSYNC is not supported for all types of FDs/FSYSs;
+ * be on the safe side and test for inappropriate ioctl errors */
+ res = fcntl(fd, F_FULLFSYNC);
+ if (res < 0 && errno == ENOTTY)
+ res = fsync(fd);
+# elif defined __NetBSD__
+ res = fsync_range(fd, FFILESYNC | FDISKSYNC, 0, 0);
+# endif
+ Py_END_ALLOW_THREADS
+
+ if (res < 0)
+ return posix_error();
+ Py_INCREF(Py_None);
+ return Py_None;
+}
+# endif /* PROVIDE_FULLFSYNC */
#endif /* HAVE_FSYNC */
#ifdef HAVE_SYNC
@@ -9473,6 +9513,9 @@
#endif
#ifdef HAVE_FSYNC
{"fsync", posix_fsync, METH_O, posix_fsync__doc__},
+# ifdef PROVIDE_FULLFSYNC
+ {"fullfsync", fullfsync, METH_O, fullfsync__doc__},
+# endif
#endif
#ifdef HAVE_SYNC
{"sync", posix_sync, METH_NOARGS, posix_sync__doc__}, |
(This was an attachment to an empty mail message.) |
Trying to revive this issue, here's a comment I left on Rietveld: """
I think I wasn't clear enough, so let me rephrase: the call to fcntl is guarded Anyway, I think this patch can be useful (see for example issue bpo-8604 as a typical use case). I personally don't care since I use Linux, but OS-X and *BSD folks might find some interest in this. |
Here is something unsorted and loose:
--Steffen |
I don't understand. If the syscall supposed to flush the disk's buffer
Are you willing to update your patch accordingly?
I agree.
Yes, but since it's a fairly reasonable and useful feature, I think it
Well, actually, some hard disks lie about this too :-) |
I'm with you theoretically - of course errors should be propagated -- >8 --
?0%0[steffen@sherwood tmp]$ cat t.c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main(void) {
int r = fcntl(2, F_FULLFSYNC);
fprintf(stderr, "1. %d: %d, %s\n", r, errno, strerror(errno));
errno = 0;
r = fsync(2);
fprintf(stderr, "2. %d: %d, %s\n", r, errno, strerror(errno));
return 0;
}
?0%0[steffen@sherwood tmp]$ gcc -o t t.c && ./t
1. -1: 25, Inappropriate ioctl for device
2. 0: 0, Unknown error: 0
?0%0[steffen@sherwood tmp]$ grep -F 25 /usr/include/sys/errno.h
#define ENOTTY 25 /* Inappropriate ioctl for device */
-- >8 -- So in fact the patches do what is necessary to make the changed
Both patches still apply onto the tip of friday noon: Again: i personally would favour os.fsync(fd, fullsync=True), because So what do you mean? Shall i rewrite 11877-standalone.1.diff to
Yeah. But hey: --Steffen |
I'm a vain rooster! I've used fullfsync() in 11877-standalone.1.diff! 11877.fullsync-1.diff uses fullsync() and will instead always be I really think this is the cleanest solution, because like this --Steffen |
What is the status of this issue? This issue is interesting in the implementation of bpo-8604 (add shutil.atomic_write()). |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: