msg159728 - (view) |
Author: Andrew McNabb (amcnabb) |
Date: 2012-04-30 19:14 |
When a os.makedirs is used under an autofs directory, it crashes. For example, on my machine, `os.makedirs('/net/prodigy/tmp')` crashes with the following traceback:
Traceback (most recent call last):
...
File "/usr/lib64/python2.7/os.py", line 157, in makedirs
mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/net/prodigy/tmp'
In this case, '/net' is an autofs directory that automatically mounts the "prodigy" directory by connecting to a host called "prodigy" using NFS. The problem seems to be related to the fact that the "/net/prodigy" directory does not actually exist until it is first accessed.
I tried running `mkdir -p /net/prodigy/tmp`, and it succeeds even though the "/net/prodigy" directory did not exist before the "mkdir" command was run.
I'm not sure exactly how `mkdir -p` is implemented, but one potential workaround for Python's makedirs would be to add the following at the top of the function:
os.stat(name)
This stat call really only needs to be run the first time makedirs is called (it does not need to be used for each recursive call).
|
msg159730 - (view) |
Author: Hynek Schlawack (hynek) *  |
Date: 2012-04-30 19:32 |
As makedirs in 3.x doesn’t handle EPERM and is otherwise the same, I presume the error is there as well.
I also presume, that after the failed makedirs(), the directory is mounted? I'd just handle the error just we handle EEXIST in 3.x now.
|
msg159977 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2012-05-05 01:45 |
Hynek: it doesn't seem like that would work, since legitimate EPERM errors are much more likely.
|
msg159980 - (view) |
Author: Hynek Schlawack (hynek) *  |
Date: 2012-05-05 08:23 |
David: What do you mean? I'd just catch EPERM and look whether the directory "appeared" in the meantime. If it exists, continue. If not: re-raise. Am I missing something? I may have a look at GNU mkdir if it doesn't help.
|
msg159983 - (view) |
Author: Charles-François Natali (neologix) *  |
Date: 2012-05-05 10:05 |
To me, this doesn't look like a os.makedirs() bug, but rather an autofs one (didn't know it was still in use).
I don't like the idea of adding such a kludge (i.e. catching EPERM and checking whether the directory appeared in between).
Could you please provide an strace output of both "os.makedirs()" and "mkdir -p"?
Thanks.
|
msg159988 - (view) |
Author: Hynek Schlawack (hynek) *  |
Date: 2012-05-05 11:37 |
Charles, I don't think you can blame autofs here. The problem at hand is that makedirs() never checks whether the directory exists (that would trigger the mount too I presume). Instead, it tries a mkdir and looks if it gets an EEXIST.
If you try that approach in this case where /net is non-writable and /net/prodigy appears only on demand, it fails with an EPERM instead.
Have a look at http://hg.python.org/cpython/file/8215aaccc9dd/Lib/os.py#l136 especially line 151.
|
msg159990 - (view) |
Author: Charles-François Natali (neologix) *  |
Date: 2012-05-05 13:27 |
> Charles, I don't think you can blame autofs here. The problem at hand is that makedirs() never checks whether the directory exists (that would trigger the mount too I presume).
Yes, it does. Have a look at line 148:
"""
if head and tail and not path.exists(head):
"""
> Instead, it tries a mkdir and looks if it gets an EEXIST.
Actually, EEXIST is just caught to cope with race conditions (i.e. the
directory got created in between, TOCTTOU race).
> If you try that approach in this case where /net is non-writable and /net/prodigy appears only on demand, it fails with an EPERM instead.
Actually, no.
makedirs() does a recursive depth-first traversal:
makedirs('/net/prodigy/foo') will actually do something like:
"""
stat('/net/prodigy/foo') == ENOENT
stat('/net/prodigy') == ENONENT
mkdir('/net/prodigy') == EPERM
"""
The NFS mount should appear upon the first - or second - stat() call,
before mkdir().
But I'd like to be sure about that, that's why I think an strace()
output would be useful ;-)
|
msg159991 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2012-05-05 13:30 |
Hynek: you said "just like EEXIST", which doesn't check to see if the directory exists before continuing, thus my confusion.
If mkdir -p does a stat first, then changing the makedirs algorithm to match is probably not a bad idea for OS compatibility reasons, with autofs just being an example of why.
|
msg159993 - (view) |
Author: Hynek Schlawack (hynek) *  |
Date: 2012-05-05 14:08 |
>> Charles, I don't think you can blame autofs here. The problem at hand is that makedirs() never checks whether the directory exists (that would trigger the mount too I presume).
>
> Yes, it does. Have a look at line 148:
> """
> if head and tail and not path.exists(head):
> """
Now that's embarrassing. I take everything back and claim the opposite.
>> If you try that approach in this case where /net is non-writable and /net/prodigy appears only on demand, it fails with an EPERM instead.
>
> Actually, no.
> makedirs() does a recursive depth-first traversal:
> makedirs('/net/prodigy/foo') will actually do something like:
> """
> stat('/net/prodigy/foo') == ENOENT
> stat('/net/prodigy') == ENONENT
> mkdir('/net/prodigy') == EPERM
> """
> The NFS mount should appear upon the first - or second - stat() call,
> before mkdir().
>
> But I'd like to be sure about that, that's why I think an strace()
> output would be useful ;-)
I glanced over coreutil's mkdir code and it seems that it changes the directory while creating the directories. I suspect it's related to that. At least it does no fstab parsing. ;) The code is wayyy too intricate to be grokable at glancing though. mkdir -p seems to be quite a can of worms.
Andrew, please send straces. :)
|
msg160850 - (view) |
Author: Hynek Schlawack (hynek) *  |
Date: 2012-05-16 12:40 |
Andrew, are you still with us?
|
msg160892 - (view) |
Author: Andrew McNabb (amcnabb) |
Date: 2012-05-16 17:31 |
> Andrew, are you still with us?
I'm here, but it's been a busy few weeks. I'll see if I can spend some time on this today.
|
msg160899 - (view) |
Author: Andrew McNabb (amcnabb) |
Date: 2012-05-16 17:59 |
Some interesting information. If I do `os.mkdir('/net/prodigy/tmp')`, it gives "OSError: [Errno 13] Permission denied: '/net/prodigy/tmp'". However, if I instead do `os.mkdir('/net/prodigy/tmp/hi')`, it succeeds. (Note that I'm being careful to start from a fresh unmounted state in both cases).
I will post straces in a few minutes.
|
msg160920 - (view) |
Author: Hynek Schlawack (hynek) *  |
Date: 2012-05-16 19:21 |
I guess this is the magic in mkdir -p:
mkdir("expert", 0755) = -1 EACCES (Permission denied)
chdir("expert") = 0
mkdir("tmp", 0755) = -1 EEXIST (File exists)
I'm not sure how I feel about that. I think we have more or less a policy in os & shutil not to change directories unless really necessary (like make_archive).
|
msg160979 - (view) |
Author: Charles-François Natali (neologix) *  |
Date: 2012-05-17 16:35 |
> I guess this is the magic in mkdir -p:
>
> mkdir("expert", 0755) = -1 EACCES (Permission denied)
> chdir("expert") = 0
> mkdir("tmp", 0755) = -1 EEXIST (File exists)
>
> I'm not sure how I feel about that. I think we have more or less a policy in os & shutil not to change directories unless really necessary (like make_archive).
Calling chdir() should be avoided, as it is not thread safe.
Actually, the real problem is that stat() alone doesn't trigger
automount, to avoid mount storms e.g. when using find, see:
https://bugzilla.redhat.com/show_bug.cgi?id=497830
http://lkml.indiana.edu/hypermail/linux/kernel/1109.1/00100.html
Since calling chdir() is not an option, if we want to fix this, I
guess the way to go would be to open() the directory, and not use
path.exists().
But I find really uncomfortable with adding such a kludge, especially
since this problem is probably present in many different places
throughout the stdlib.
Furthermore, it seems that kernel from 2.6.38 do trigger automount
upon stat() (see
http://lkml.indiana.edu/hypermail/linux/kernel/1109.1/00210.html), so
I'd suggest to drop this as "won't fix".
|
msg160995 - (view) |
Author: Hynek Schlawack (hynek) *  |
Date: 2012-05-17 18:24 |
I agree, closing.
Thank you for your valuable assistance Andrew!
|
msg160997 - (view) |
Author: Andrew McNabb (amcnabb) |
Date: 2012-05-17 18:31 |
This isn't fixed. All of the examples I've given were with a 3.3.0 kernel. Doing a stat would be a fix.
|
msg161003 - (view) |
Author: Hynek Schlawack (hynek) *  |
Date: 2012-05-17 18:59 |
If I read it correctly, we _do_ a stat:
stat("/net/prodigy", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
mkdir("/net/prodigy/tmp", 0777) = -1 EACCES (Permission denied)
ISTM that this is a different problem altogether, or am I missing something?
If mkdir'ing of deeper directories works, it sounds like a race condition to me.
|
msg161006 - (view) |
Author: Hynek Schlawack (hynek) *  |
Date: 2012-05-17 19:07 |
My suspicion re the race condition is that a chdir waits for autofs to mount but mkdir doesn't. You could check that yourself if a chdir/sleep would help.
|
msg161007 - (view) |
Author: Andrew McNabb (amcnabb) |
Date: 2012-05-17 19:22 |
Hmm. Maybe there's a difference between doing stat('/net/prodigy') vs. stat('/net/prodigy/tmp'). Just a guess, but maybe the former can succeed before the mount completes, but the latter has to wait for the mount to complete.
|
msg161008 - (view) |
Author: Andrew McNabb (amcnabb) |
Date: 2012-05-17 19:24 |
By the way, if my hunch about the difference in stat of '/net/prodigy' vs. '/net/prodigy/tmp' is correct, then this would explain why makedirs on deeper directories work. Specifically, one of the shallower stat calls would force the mount to complete before proceeding to the next deeper level.
|
msg161018 - (view) |
Author: Charles-François Natali (neologix) *  |
Date: 2012-05-17 21:34 |
Yes, creating the directories in a bottom-up way (i.e. '/net', '/net/prodigy', '/net/prodigy/foo') could maybe avoid this problem.
But this is definietely an autofs bug, and there are probably many other places where such code might break.
|
msg161024 - (view) |
Author: Hynek Schlawack (hynek) *  |
Date: 2012-05-17 22:52 |
I sense a very easy fix: just do a stat() on the final path before we do anything (and quickly return if present). Moving current makedirs into a private _makedirs that gets called if the stat fails.
I'm still feeling uneasy about that though. It _is_ a kludge.
|
msg161028 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2012-05-18 01:19 |
I agree that the proposed solutions are hacks, and still propose to close this as "won't fix".
I think this is a bug in Linux. mkdir("/net/prodigy/tmp") ought to trigger the mount, then fail with EEXIST.
|
msg161038 - (view) |
Author: Charles-François Natali (neologix) *  |
Date: 2012-05-18 06:46 |
Alright, closing for good then.
Andrew, if you want to get this fixed, you should report this to the autofs folks, because it's definitely not a Python bug.
|
msg161055 - (view) |
Author: Andrew McNabb (amcnabb) |
Date: 2012-05-18 15:33 |
I see no evidence that this is a bug in Linux, and I think it's ridiculous to close it when a trivial one-line fix is available. I won't reopen it because it's obvious no one wants to address this. :(
|
msg161057 - (view) |
Author: Andrew McNabb (amcnabb) |
Date: 2012-05-18 15:45 |
I posted a bug report with the kernel here:
https://bugzilla.kernel.org/show_bug.cgi?id=43262
|
msg161116 - (view) |
Author: Charles-François Natali (neologix) *  |
Date: 2012-05-19 13:57 |
> I see no evidence that this is a bug in Linux,
"""
stat("/net/prodigy", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
mkdir("/net/prodigy/tmp", 0777) = -1 EACCES (Permission denied)
"""
As you can see, a stat() is already done on /net/prodigy.
Event if it wasn't done, calling mkdir() ought to trigger the mount.
So this is *definitely* as bug in autofs.
> and I think it's ridiculous to close it when a trivial one-line fix is available.
Which one-line fix do you propose?
> I won't reopen it because it's obvious no one wants to address this. :(
It's not that we don't want to address this, but rather that we want
to avoid introducing hacks to work around OS bugs.
|
msg161118 - (view) |
Author: Hynek Schlawack (hynek) *  |
Date: 2012-05-19 14:11 |
> """
> stat("/net/prodigy", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> mkdir("/net/prodigy/tmp", 0777) = -1 EACCES (Permission denied)
> """
>
> As you can see, a stat() is already done on /net/prodigy.
To be fair, that shouldn’t trigger a mount. Otherwise a `ls -l` on /net
would mount all volumes.
> Event if it wasn't done, calling mkdir() ought to trigger the mount.
I’m not sure if/where the behavior is defined – let’s what the Linux
people say.
|
msg161120 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2012-05-19 14:17 |
>> """
>> stat("/net/prodigy", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
>> mkdir("/net/prodigy/tmp", 0777) = -1 EACCES (Permission denied)
>> """
>>
>> As you can see, a stat() is already done on /net/prodigy.
>
> To be fair, that shouldn’t trigger a mount. Otherwise a `ls -l` on /net
> would mount all volumes.
Not sure what "that" is: my view is that mkdir should trigger the mount,
/net/prodigy is already there and available. ls -l doesn't invoke mkdir(2),
so you wouldn't get a mount storm when it is mkdir that triggers the mount.
|
msg161121 - (view) |
Author: Hynek Schlawack (hynek) *  |
Date: 2012-05-19 14:20 |
>>> """
>>> stat("/net/prodigy", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
>>> mkdir("/net/prodigy/tmp", 0777) = -1 EACCES (Permission denied)
>>> """
>>>
>>> As you can see, a stat() is already done on /net/prodigy.
>>
>> To be fair, that shouldn’t trigger a mount. Otherwise a `ls -l` on /net
>> would mount all volumes.
>
> Not sure what "that" is: my view is that mkdir should trigger the mount,
> /net/prodigy is already there and available. ls -l doesn't invoke mkdir(2),
> so you wouldn't get a mount storm when it is mkdir that triggers the mount.
Sure, I was refering (as I hoped that my quoting would indicate) to the
stat on /net/prodigy, not the mkdir. I commented on the mkdir in the
next paragraph.
|
msg161280 - (view) |
Author: Andrew McNabb (amcnabb) |
Date: 2012-05-21 15:35 |
> Which one-line fix do you propose?
Doing a stat("/net/prodigy/tmp") before mkdir("/net/prodigy/tmp") is an extremely simple workaround.
Of course, I would love to see clear documentation of how the kernel is defined to behave in this situation. It would certainly be intuitive for mkdir("/net/prodigy/tmp") to force a mount of "/net/prodigy", but defined behavior isn't always intuitive. :)
|
msg161284 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2012-05-21 17:48 |
> Of course, I would love to see clear documentation of how the kernel
> is defined to behave in this situation. It would certainly be
> intuitive for mkdir("/net/prodigy/tmp") to force a mount of
> "/net/prodigy", but defined behavior isn't always intuitive. :)
Maybe I'm confused, but the presence of "/net/prodigy" is *not*
the issue here, and what gets mounted is *not* "/net/prodigy",
but "/net/prodigy/tmp" (do "mount" to confirm or dispute).
|
msg161285 - (view) |
Author: Andrew McNabb (amcnabb) |
Date: 2012-05-21 17:58 |
> Maybe I'm confused, but the presence of "/net/prodigy" is *not*
> the issue here, and what gets mounted is *not* "/net/prodigy",
> but "/net/prodigy/tmp" (do "mount" to confirm or dispute).
No, /net/prodigy is the mountpoint in this case:
amcnabb@sage:~ :) mount |grep prodigy
prodigy:/ on /net/prodigy type nfs4 ([redacted])
|
msg163997 - (view) |
Author: Charles-François Natali (neologix) *  |
Date: 2012-06-25 17:37 |
I still don't like the idea of adding such a kludge to work around OS
bugs, so I'd suggest closing as won't fix.
|
msg163998 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-06-25 17:46 |
+1 for closing.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:29 | admin | set | github: 58907 |
2012-06-25 17:46:33 | pitrou | set | nosy:
+ pitrou messages:
+ msg163998
|
2012-06-25 17:37:10 | neologix | set | messages:
+ msg163997 |
2012-05-21 17:58:18 | amcnabb | set | messages:
+ msg161285 |
2012-05-21 17:48:24 | loewis | set | messages:
+ msg161284 |
2012-05-21 15:35:12 | amcnabb | set | messages:
+ msg161280 |
2012-05-19 14:20:32 | hynek | set | messages:
+ msg161121 |
2012-05-19 14:17:19 | loewis | set | messages:
+ msg161120 |
2012-05-19 14:11:27 | hynek | set | messages:
+ msg161118 |
2012-05-19 13:57:34 | neologix | set | messages:
+ msg161116 |
2012-05-18 15:45:34 | amcnabb | set | messages:
+ msg161057 |
2012-05-18 15:33:40 | amcnabb | set | messages:
+ msg161055 |
2012-05-18 06:46:23 | neologix | set | status: open -> closed
messages:
+ msg161038 stage: resolved |
2012-05-18 01:19:30 | loewis | set | nosy:
+ loewis messages:
+ msg161028
|
2012-05-17 22:53:00 | hynek | set | messages:
+ msg161024 |
2012-05-17 21:34:31 | neologix | set | messages:
+ msg161018 |
2012-05-17 19:24:54 | amcnabb | set | messages:
+ msg161008 |
2012-05-17 19:22:47 | amcnabb | set | messages:
+ msg161007 |
2012-05-17 19:07:06 | hynek | set | messages:
+ msg161006 |
2012-05-17 18:59:02 | hynek | set | messages:
+ msg161003 |
2012-05-17 18:31:07 | amcnabb | set | status: closed -> open
messages:
+ msg160997 |
2012-05-17 18:24:40 | hynek | set | status: open -> closed resolution: wont fix messages:
+ msg160995
stage: needs patch -> (no value) |
2012-05-17 16:35:20 | neologix | set | messages:
+ msg160979 |
2012-05-16 19:21:43 | hynek | set | messages:
+ msg160920 |
2012-05-16 18:05:09 | amcnabb | set | files:
+ makedirs.out |
2012-05-16 18:03:07 | amcnabb | set | files:
+ mkdir-p.out |
2012-05-16 17:59:46 | amcnabb | set | messages:
+ msg160899 |
2012-05-16 17:31:47 | amcnabb | set | messages:
+ msg160892 |
2012-05-16 12:40:41 | hynek | set | messages:
+ msg160850 |
2012-05-05 14:08:57 | hynek | set | keywords:
- easy
messages:
+ msg159993 |
2012-05-05 13:30:32 | r.david.murray | set | messages:
+ msg159991 |
2012-05-05 13:27:29 | neologix | set | messages:
+ msg159990 |
2012-05-05 11:37:24 | hynek | set | messages:
+ msg159988 |
2012-05-05 10:05:13 | neologix | set | nosy:
+ neologix messages:
+ msg159983
|
2012-05-05 08:23:50 | hynek | set | messages:
+ msg159980 |
2012-05-05 01:45:53 | r.david.murray | set | nosy:
+ r.david.murray messages:
+ msg159977
|
2012-04-30 20:02:11 | hynek | set | keywords:
+ easy |
2012-04-30 19:32:16 | hynek | set | versions:
+ Python 3.2, Python 3.3 nosy:
+ hynek
messages:
+ msg159730
type: behavior stage: needs patch |
2012-04-30 19:14:17 | amcnabb | create | |