Issue513572
Created on 2002-02-06 02:07 by herron, last changed 2022-04-10 16:04 by admin. This issue is now closed.
Messages (17) | |||
---|---|---|---|
msg9126 - (view) | Author: Gary Herron (herron) | Date: 2002-02-06 02:07 | |
It has been documented for earlier versions of Python on Windows that os.path.isdir returns true for a UNC directory only if there is an extra backslash at the end of the argument. In Python 2.2 (at least on Windows 2000) it appears that *TWO* extra backslashes are needed. Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> >>> import os >>> os.path.isdir('\\\\trainer\\island') 0 >>> os.path.isdir('\\\\trainer\\island\\') 0 >>> os.path.isdir('\\\\trainer\\island\\\\') 1 >>> In a perfect world, the first call should return 1, but it never has. In older versions of Python, the second returned 1, but it no longer does. In limited tests, appending 2 or more backslashes to the end of any pathname returns the correct answer from both isfile and isdir. |
|||
msg9127 - (view) | Author: Guido van Rossum (gvanrossum) | Date: 2002-02-08 22:05 | |
Logged In: YES user_id=6380 Tim, I hate to do this to you, but you're the only person I trust with researching this. (My laptop is currently off the net again. :-( ) |
|||
msg9128 - (view) | Author: Tim Peters (tim.peters) | Date: 2002-02-08 23:17 | |
Logged In: YES user_id=31435 Here's the implementation of Windows isdir(): def isdir(path): . """Test whether a path is a directory""" . try: . st = os.stat(path) . except os.error: . return 0 . return stat.S_ISDIR(st[stat.ST_MODE]) That is, we return whatever Microsoft's stat() tells us, and our code is the same in 2.2 as in 2.1. I don't have Win2K here, and my Win98 box isn't on a Windows network so I can't even try real UNC paths here. Reassigning to MarkH in case he can do better on either count. |
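For readers on a current interpreter, the quoted implementation can be restated roughly as follows. This is a sketch in Python 3 spelling, not the 2.2 source: `os.error` is written as `OSError`, the mode is read from `st.st_mode` rather than `st[stat.ST_MODE]`, and the result is a bool instead of 0 or 1.

```python
import os
import stat


def isdir(path):
    """Test whether a path is a directory (Python 3 restatement of the quoted code)."""
    try:
        st = os.stat(path)
    except OSError:
        # Any stat() failure is reported as "not a directory".
        return False
    return stat.S_ISDIR(st.st_mode)
```

The point of the quote still holds in this form: the answer is whatever the platform's stat() says about the string it is handed.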
|||
msg9129 - (view) | Author: Tim Peters (tim.peters) | Date: 2002-02-08 23:33 | |
Logged In: YES user_id=31435 BTW, it occurs to me that this *may* be a consequence of whatever was done in 2.2 to encode/decode filename strings for system calls on Windows. I didn't follow that, and Mark may be the only one who fully understands the details. |
|||
msg9130 - (view) | Author: Tim Peters (tim.peters) | Date: 2002-02-10 18:57 | |
Logged In: YES user_id=31435 Gary, exactly what do you mean by "older versions of Python"? That is, specifically which versions? The Microsoft stat() function is extremely picky about trailing (back)slashes. For example, if you have a directory c:/python, and pass "c:/python/" to the MS stat(), it claims no such thing exists. This isn't documented by MS, but that's how it works: a trailing (back)slash is required if and only if the path passed in "is a root". So MS stat() doesn't understand "/python/", and doesn't understand "d:" either. The former doesn't tolerate a (back)slash, while the latter requires one. This is impossible for people to keep straight, so after 1.5.2 Python started removing (back)slashes on its own to make MS stat() happy. The code currently leaves a trailing (back)slash alone if and only if one exists, and in addition one of these obtains: 1) The (back)slash is the only character in the path. or 2) The path has 3 characters, and the middle one is a colon. UNC roots don't fit either of those, so do get one (back)slash chopped off. However, just as for any other roots, the MS stat() refuses to recognize them as valid unless they do have a trailing (back)slash. Indeed, the last time I applied a contributed patch to this code, I added a /* XXX UNC root drives should also be exempted? */ comment there. However, this explanation doesn't make sense unless by "older versions of Python" you mean nothing more recent than 1.5.2. If I'm understanding the source of the problem, it should exist in all Pythons after 1.5.2. So if you don't see the same problem in 1.6, 2.0 or 2.1, I'm on the wrong track. |
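The stripping rule described in this message can be modelled in a few lines of Python. This is only an illustration of the rule as stated here; the real logic lives in C, and the function name `strip_trailing_slash` is made up for the example.

```python
def strip_trailing_slash(path):
    """Model of the post-1.5.2 rule as described in the message (hypothetical name)."""
    if not path or path[-1] not in ("/", "\\"):
        return path                  # no trailing (back)slash: nothing to do
    if len(path) == 1:
        return path                  # case 1: the (back)slash is the whole path
    if len(path) == 3 and path[1] == ":":
        return path                  # case 2: three characters, middle one a colon
    return path[:-1]                 # everything else loses exactly one (back)slash
```

Under this model a UNC root such as `\\trainer\island\` matches neither exemption, so it loses its only trailing backslash before stat() ever sees it, which is the situation the `XXX` comment refers to.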
|||
msg9131 - (view) | Author: Gary Herron (herron) | Date: 2002-02-11 08:03 | |
Logged In: YES user_id=395736 Sorry, but I don't have much of an idea which versions I was referring to. I picked up the idea of an extra backslash in a FAQ from a web site, the search for which I can't seem to reproduce. It claimed one backslash was enough, but did not specify a Python version. It *might* have been old enough to be pre 1.5.2. The two versions I can test are 1.5.1 (where one backslash is enough) and 2.2 (where two are required). This seems to me to support (or at least not contradict) Tim's hypothesis. Gary |
|||
msg9132 - (view) | Author: Tim Peters (tim.peters) * ![]() |
Date: 2002-02-11 08:28 | |
Logged In: YES user_id=31435 Mark, what do you think about a different approach here? 1. Leave the string alone and *try* stat. If it succeeds, great, we're done. 2. Else if the string doesn't have a trailing (back)slash, append one and try again. Win or lose, that's the end. 3. Else the string does have a trailing (back)slash. If the string has more than one character, strip a trailing (back)slash and try again. Win or lose, that's the end. 4. Else the string is a single (back)slash, yet stat() failed. This shouldn't be possible. It doubles the number of stats in cases where the file path doesn't correspond to anything that exists. OTOH, MS's (back)slash rules are undocumented and incomprehensible (read their implementation of stat() for the whole truth -- we're not out-thinking lots of it now, and the gimmick added after 1.5.2 to out-think part of it is at least breaking Gary's thoroughly sensible use). |
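The four steps proposed here can be sketched in Python as below. This is a sketch of the idea only, not code from any patch; `stat_with_retry` and its `stat` and `sep` parameters are names invented for the example, with `stat` injectable so the retry order can be checked without a Windows network.

```python
import os


def stat_with_retry(path, stat=os.stat, sep="\\"):
    """Try stat() as given, then once more with a trailing slash toggled."""
    try:
        return stat(path)                # step 1: leave the string alone
    except OSError as exc:
        error = exc
    if not path:
        raise error                      # nothing sensible to retry
    if path[-1] not in ("/", "\\"):
        return stat(path + sep)          # step 2: append one and try again
    if len(path) > 1:
        return stat(path[:-1])           # step 3: strip one and try again
    raise error                          # step 4: a lone slash failed
```

The cost mentioned in the message is visible in the structure: a path that exists in neither spelling always pays for two stat() calls.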
|||
msg9133 - (view) | Author: Gordon B. McMillan (gmcm) | Date: 2002-03-07 15:31 | |
Logged In: YES user_id=4923 Data point: run on a win2k box, where \\ME is an NT box Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit (Intel)] on win32 >>> os.path.isdir(r"\\ME\E\java") 1 >>> os.path.isdir(r"\\ME\E\java\\") 0 >>> os.path.isdir("\\\\ME\\E\\java\\") 1 >>> os.path.isdir("\\\\ME\\E\\java\\\\") 0 |
|||
msg9134 - (view) | Author: Tim Peters (tim.peters) | Date: 2002-03-10 09:03 | |
Logged In: YES user_id=31435 Gordon, none of those are UNC roots -- they follow the rules exactly as stated for non-UNC paths: MS stat() recognizes \\ME\E\java if and only if there's no trailing backslash. That's why your first example succeeds. The complication is that Python removes one trailing backslash "by magic" unless the path "looks like a root", and none of these do. That's why your third example works. Your second and fourth examples fail because you specified two trailing backslashes in those, and Python only removes one of them by magic. An example of "a UNC root" would be \\ME\E. The MS stat() recognizes a root directory if and only if it *does* have a trailing backslash, and Python's magical backslash removal doesn't know UNC roots from a Euro symbol. So the only way to get Python's isdir() (etc) to recognize \\ME\E is to follow it with two backslashes, one because Python strips one away (due to not realizing "it looks like a root"), and another else MS stat() refuses to recognize it. Anyway, I'm unassigning this now, cuz MarkH isn't paying any attention. If someone wants to write a pile of tedious code to "recognize a UNC root when it sees one", I'd accept the patch. I doubt I'll get to it myself in this lifetime. |
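The two rules in this message (Python strips one trailing backslash unless the path looks like a root; MS stat() wants a trailing backslash exactly when the path names a root) can be combined into a small model that reproduces the results reported in this thread. It assumes the directory in question exists, and all three function names are hypothetical.

```python
def python_strips(path):
    """Python's side: drop one trailing (back)slash unless the path looks like a root."""
    looks_like_root = len(path) == 1 or (len(path) == 3 and path[1] == ":")
    if path[-1:] in ("/", "\\") and not looks_like_root:
        return path[:-1]
    return path


def ms_stat_accepts(path, names_root):
    """MS stat()'s side, as described: a root needs a trailing slash, nothing else may have one."""
    has_slash = path[-1:] in ("/", "\\")
    return has_slash if names_root else not has_slash


def predicted_isdir(path, names_root=False):
    """Predicted isdir() result for an existing directory under both rules."""
    return ms_stat_accepts(python_strips(path), names_root)


# Gordon's four non-root examples: 1, 0, 1, 0
print([predicted_isdir(p) for p in (
    r"\\ME\E\java", r"\\ME\E\java\\", "\\\\ME\\E\\java\\", "\\\\ME\\E\\java\\\\")])
# Gary's UNC root with zero, one and two trailing backslashes: 0, 0, 1
print([predicted_isdir(p, names_root=True) for p in (
    "\\\\trainer\\island", "\\\\trainer\\island\\", "\\\\trainer\\island\\\\")])
```

Note that `r"\\ME\E\java\\"` is a raw string, so it really does end in two backslashes, while `"\\\\ME\\E\\java\\"` ends in one.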
|||
msg9135 - (view) | Author: Trent Mick (tmick) | Date: 2002-04-04 18:08 | |
Logged In: YES user_id=34892 I have struggled with this too. Currently I tend to use this _isdir(). Hopefully this is helpful. def _isdir(dirname): """os.path.isdir() doesn't work for UNC mount points. Fake it. # For an existing mount point # (want: _isdir() == 1) os.path.ismount(r"\\crimper\apps") -> 1 os.path.exists(r"\\crimper\apps") -> 0 os.path.isdir(r"\\crimper\apps") -> 0 os.listdir(r"\\crimper\apps") -> [...contents...] # For a nonexistent mount point # (want: _isdir() == 0) os.path.ismount(r"\\crimper\foo") -> 1 os.path.exists(r"\\crimper\foo") -> 0 os.path.isdir(r"\\crimper\foo") -> 0 os.listdir(r"\\crimper\foo") -> WindowsError # For an existing dir under a mount point # (want: _isdir() == 1) os.path.ismount(r"\\crimper\apps\Komodo") -> 0 os.path.exists(r"\\crimper\apps\Komodo") -> 1 os.path.isdir(r"\\crimper\apps\Komodo") -> 1 os.listdir(r"\\crimper\apps\Komodo") -> [...contents...] # For a nonexistent dir/file under a mount point # (want: _isdir() == 0) os.path.ismount(r"\\crimper\apps\foo") -> 0 os.path.exists(r"\\crimper\apps\foo") -> 0 os.path.isdir(r"\\crimper\apps\foo") -> 0 os.listdir(r"\\crimper\apps\foo") -> [] # as if empty contents # For an existing file under a mount point # (want: _isdir() == 0) os.path.ismount(r"\\crimper\apps\Komodo\exists.txt") -> 0 os.path.exists(r"\\crimper\apps\Komodo\exists.txt") -> 1 os.path.isdir(r"\\crimper\apps\Komodo\exists.txt") -> 0 os.listdir(r"\\crimper\apps\Komodo\exists.txt") -> WindowsError """ if sys.platform[:3] == 'win' and dirname[:2] == r'\\': if os.path.exists(dirname): return os.path.isdir(dirname) try: os.listdir(dirname) except WindowsError: return 0 else: return os.path.ismount(dirname) else: return os.path.isdir(dirname) |
|||
msg9136 - (view) | Author: Mark Hammond (mhammond) | Date: 2002-04-05 00:46 | |
Logged In: YES user_id=14198 Sorry - I missed this bug. It is not that I wasn't paying attention, but rather that SF's Tracker didn't get my attention :( Have I mentioned how much I hate SF and love Bugzilla yet? :) I quite like Tim's algorithm. One extra stat in that case is OK IMO. I can't imagine too many speed sensitive bits of code that continuously check for a non-existent directory. Everyone still OK with that? |
|||
msg9137 - (view) | Author: Tim Peters (tim.peters) | Date: 2002-04-05 01:51 | |
Logged In: YES user_id=31435 Nice to see you, Mark! If you want to pursue this, the caution I had about my idea, but forgot to write down, is that Python does lots of stats during imports, and especially stats on things that usually don't exist (is it there with a .pyd suffix? a .dll suffix? a .py suffix? a .pyw suffix? a .pyc suffix?). If the idea has a bad effect on startup time, that may kill it; startup time is already a sore point for some. OTOH, on Windows we should really, say, be using FindFirstFile() with a wildcard extension for that purpose anyway. |
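The trade-off hinted at here, one stat() per candidate suffix versus a single directory enumeration, can be illustrated in portable Python. This is an analogy only: `os.scandir()` stands in for a wildcard `FindFirstFile()` call, the suffix list is the one from the message, and both function names are invented. It is not how the import machinery is implemented.

```python
import os

# Suffixes taken from the message; the order here is arbitrary.
SUFFIXES = (".pyd", ".dll", ".py", ".pyw", ".pyc")


def candidates_by_stat(directory, name):
    """One stat() per suffix; most of these fail for a typical module."""
    found = []
    for suffix in SUFFIXES:
        candidate = os.path.join(directory, name + suffix)
        try:
            os.stat(candidate)
        except OSError:
            continue
        found.append(candidate)
    return found


def candidates_by_listing(directory, name):
    """One directory enumeration, then pure string comparisons."""
    with os.scandir(directory) as entries:
        present = {entry.name for entry in entries}
    return [os.path.join(directory, name + suffix)
            for suffix in SUFFIXES if name + suffix in present]
```

One caveat of the sketch: the listing variant compares names case-sensitively, whereas stat() on a case-insensitive file system does not, so the two can disagree on Windows when the case differs.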
|||
msg9138 - (view) | Author: Mark Hammond (mhammond) | Date: 2002-04-17 14:30 | |
Logged In: YES user_id=14198 I have done a little analysis of how we use stat and how it performs by instrumenting posixmodule.c. It seems that Tim's concern about Python startup/import is largely unfounded. While Python does call stat() repeatedly at startup, it does so from C rather than os.stat(). Thus, starting and stopping Python yields the following (with my instrumentation): Success: 9 in 1.47592ms, avg 0.164 Failure: 2 in 0.334504ms, avg 0.1673 (ie, os.stat() is called with a valid file 9 times, and with an invalid file twice. Average time for stat() is 0.16ms per call.) python -c "import os, string, httplib, urllib" shows the same results (ie, no extra stats for imports) However, this is not the typical case. The Python test suite (which takes ~110 seconds wall time on my PC) yields the following: Success: 383 in 84.3571ms, avg 0.2203 Failure: 1253 in 3805.52ms, avg 3.037 egads - 4 seconds spent in failed stat calls, averaging 3ms each!! Further instrumentation shows that stat() can be very slow on directories with many files. In this case, os.stat() in the %TEMP% directory for tempfiles() occasionally took extremely long. OK - so assuming this tempfile behaviour is also not "typical", I tried the COM test suite: Success: 972 in 303.856ms, avg 0.3126 Failure: 16 in 2.60549ms, avg 0.1628 (also with some extremely long times on files that did exist in a directory with many files) So - all this implies to me that: * stat() can be quite slow in some cases, success or failure * We probably shouldn't make this twice as long in every case that fails! So, I am moving back to trying to outguess the stat() implementation. Looking at it shows that indeed UNC roots are treated specially along with the root directory case already handled by Python (courtesy of Tim). Adding an extra check for a UNC root shouldn't be too hard, and can't possibly be as expensive as an extra stat() :) |
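The measurements above came from instrumenting posixmodule.c in C. A rough Python-level equivalent, useful for repeating the experiment on another machine, might look like this; the `StatTimer` class is hypothetical and only wraps whatever `stat` callable it is given.

```python
import os
import time


class StatTimer:
    """Count and time successful and failed stat() calls (hypothetical helper)."""

    def __init__(self, stat=os.stat):
        self._stat = stat
        self.counts = {"Success": 0, "Failure": 0}
        self.total_ms = {"Success": 0.0, "Failure": 0.0}

    def __call__(self, path):
        start = time.perf_counter()
        outcome = "Failure"
        try:
            result = self._stat(path)
            outcome = "Success"
            return result
        finally:
            # Runs on both the return path and the exception path.
            self.counts[outcome] += 1
            self.total_ms[outcome] += (time.perf_counter() - start) * 1000.0

    def report(self):
        lines = []
        for outcome in ("Success", "Failure"):
            n, ms = self.counts[outcome], self.total_ms[outcome]
            avg = ms / n if n else 0.0
            lines.append("%s: %d in %gms, avg %.4g" % (outcome, n, ms, avg))
        return "\n".join(lines)
```

A caller would create `timer = StatTimer()`, route its stat calls through `timer(path)`, and print `timer.report()` at exit. Unlike the C instrumentation, this only sees calls that go through the wrapper.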
|||
msg9139 - (view) | Author: Tim Peters (tim.peters) | Date: 2002-04-18 01:10 | |
Logged In: YES user_id=31435 Sounds good to me! I agree it shouldn't be all that hard to special-case UNC roots too -- what I wonder about is how many other forms of "root" syntax MS will make up out of thin air next year <wink>. |
|||
msg9140 - (view) | Author: Greg Chapman (glchapman) | Date: 2004-04-20 18:21 | |
Logged In: YES user_id=86307 I just ran into this bug. I checked the CVS and it appears that no patch has yet been committed for it. Does a patch exist? Am I correct that the suggested change is essentially: if (IsRootUNCName(path)) EnsureTrailingSlash(path); else if (!IsRootDir(path)) NukeTrailingSlashIfPresent(path); stat(path, st); |
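The pseudocode in this message can be spelled out as a Python sketch. It follows the three branches as written and is not the patch that was eventually committed; `is_unc_root`, `is_root_dir` and `prepare_for_stat` are hypothetical stand-ins for `IsRootUNCName`, `IsRootDir` and the surrounding logic, and the UNC test assumes a root has the shape `\\server\share` with at most one trailing slash.

```python
def is_unc_root(path):
    """True for \\\\server\\share, optionally followed by one trailing slash."""
    p = path.replace("/", "\\")
    if not p.startswith("\\\\") or p.startswith("\\\\\\"):
        return False
    if p.endswith("\\"):
        p = p[:-1]                       # tolerate a single trailing slash
    parts = p[2:].split("\\")
    return len(parts) == 2 and all(parts)


def is_root_dir(path):
    """The two roots Python already exempted: a lone slash, or 'X:' plus a slash."""
    if path in ("/", "\\"):
        return True
    return len(path) == 3 and path[1] == ":" and path[2] in ("/", "\\")


def prepare_for_stat(path):
    """Return the string that would be handed to MS stat()."""
    if is_unc_root(path):
        # EnsureTrailingSlash
        return path if path[-1] in ("/", "\\") else path + "\\"
    if not is_root_dir(path) and len(path) > 1 and path[-1] in ("/", "\\"):
        # NukeTrailingSlashIfPresent
        return path[:-1]
    return path
```

With this shape, `\\ME\E` and `\\ME\E\` both reach stat() as `\\ME\E\`, while `\\ME\E\java\` is still reduced to `\\ME\E\java`.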
|||
msg9141 - (view) | Author: Greg Chapman (glchapman) | Date: 2004-05-14 18:02 | |
Logged In: YES user_id=86307 I took a stab at fixing this, see: www.python.org/sf/954115 |
|||
msg9142 - (view) | Author: Martin v. Löwis (loewis) | Date: 2004-06-02 10:05 | |
Logged In: YES user_id=21627 This is fixed with Greg's patch. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:04:57 | admin | set | github: 36033 |
2002-02-06 02:07:40 | herron | create |