classification
Title: os.stat() on windows doesn't consider relative symlink
Type: behavior Stage: resolved
Components: Windows Versions: Python 3.3, Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: brian.curtin Nosy List: brian.curtin, eric.smith, georg.brandl, haypo, jason.coombs, loewis, nadeem.vawda, ocean-city, python-dev, santa4nt, tim.golden
Priority: release blocker Keywords: 3.2regression, needs review, patch

Created on 2011-05-16 02:53 by ocean-city, last changed 2011-06-14 15:33 by brian.curtin. This issue is now closed.

Files
File name Uploaded Description Edit
test.diff brian.curtin, 2011-05-24 02:01
patch_v2.patch ocean-city, 2011-05-29 23:17 just expanded from patches_v2.tar.gz
patch_v2.txt ocean-city, 2011-05-29 23:21 just expanded from patches_v2.tar.gz
issue12084.diff brian.curtin, 2011-06-02 23:15
issue12084_v2.diff brian.curtin, 2011-06-07 15:38 review
issue12084_v3.diff brian.curtin, 2011-06-13 16:41 3.2 patch review
smime.p7s jason.coombs, 2011-06-13 21:18
issue12084_XP.diff brian.curtin, 2011-06-13 23:48
quick1.patch ocean-city, 2011-06-14 08:34 review
quick2.patch ocean-city, 2011-06-14 08:34 review
quick3.patch ocean-city, 2011-06-14 08:35 review
Messages (37)
msg136062 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2011-05-16 02:53
Hello. I noticed os.stat() on windows may traverse wrong path on relative symbolic when current working directory != the directory where symbolic link is in. This is because the relative path DeviceIoControl() returns is just passed to win32_xstat without converting to absolute path.

I'm sorry because I implemented this function, and it's hard for me to debug this because I don't have Vista/7. This patch uses GetFinalPathNameByHandle like original code does, plus should handle symlink to system locked file.

Can anyone test and commit this patch?
msg136069 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-05-16 06:02
It's quite a large patch... :)

I now own a new laptop that had Windows 7 preinstalled, so I'll try to get set up VC++ Express, and then test the patch. Still I'm nosying Martin as well.

For the future, it's much easier to just attach both files individually, this saves the need to download and unpack the archive.
msg136109 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-05-16 16:08
Hirokazu contacted me directly with these patches a few days ago but I haven't been able to email him because his host's DNS is apparently down.

The tests in this patch do not end up testing anything, so we'll need to start with a proper test. In the brief look I had the other day, adding an except clause to the outer try/finally block catches something about a file not being available IIRC. I'll take a look this afternoon/evening.
msg136209 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-05-18 07:55
Brian, do you think you'll be able to finish this for 3.2.1?

If we do fix it, we'd need a second rc (not a problem for me).
msg136244 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-05-18 13:40
I'm hoping to. I have time to work on it tonight and tomorrow night US/Chicago time and will keep you posted.
msg136253 - (view) Author: Jason R. Coombs (jason.coombs) * (Python committer) Date: 2011-05-18 16:49
Brian, I'm available to help with this tomorrow evening; let me know if you want to team up on it then.
msg136266 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-05-19 03:00
With the patch applied, the new test fails along with test.test_os.WalkTests.test_traversal and test.test_os.Win32SymlinkTests.test_directory_link.

Overall, I agree that this doesn't work correctly. The patch, which is pretty large, breaks more than it fixes.

I'll see if I can make time during the day tomorrow to look further into the code, and I'll try to coordinate with Jason if possible.


(for anyone else looking at this, remember to run your command prompt as administrator)
msg136711 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-05-24 02:01
Here's standalone patch which should cover this problem. The patch fails right now but succeeds if you apply it back to 652baf23c368 (the changeset before one of several changes around this code). I'll try to find the actual offending checkin and workout the differences to make a fix.
msg136716 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-05-24 02:40
Ok, so it's 893b098929e7 where that test stops working.
msg136750 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-05-24 14:16
Correction for msg136711 -- s/patch/test/g
msg137005 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-05-26 20:24
Ok, so it's actually 0a1baa619171 that broke it, not sure how I came up with the other revision. In any case, it's too hairy to try and piece everything together across the numerous bug fixes, feature adds, and refactorings in this area in order to get back to a working state (plus you'll want to jump out the window if you try, believe me). I'm going to have a go at basically rewriting the os.stat Windows path tonight and tomorrow and see what that does.
msg137103 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-05-28 01:07
It turns out DeviceIoControl/FSCTL_GET_REPARSE_POINT (in win32_read_link) will only work for us as long as the symlink was created with a full path. Starting at the top level of a source checkout, if I create `os.symlink("README", "README.lnk")` and then do `os.stat("..\\README.lnk")` from up a directory (or any other directory), DeviceIoControl can only find out that the symlink was created with "README", so the reparse tag it knows about is "README", which doesn't really help us in figuring out where that file is actually located. Everything is fine if I create the symlink with full paths.

I'm in the middle of refactoring this to work with GetFinalPathNameByHandle. I had thought about a quick-and-dirty solution of modifying os.symlink to convert all paths into fully qualified paths in order to give DeviceIoControl the info it needs for os.stat...but that doesn't help for any previously created links, or for any links created by Microsoft tools such as the "mklink" command line tool (it doesn't set the reparse tag with a fully qualified path either).
msg137127 - (view) Author: Jason R. Coombs (jason.coombs) * (Python committer) Date: 2011-05-28 13:04
To the extent that we can, we should try to support relative symlinks. Absolute symlinks aren't the right thing in some cases, where the symlinks should be movable with their targets. I use relative links extensively.

Is it worth considering changing the current working directory to the same directory as the symlink when calling DeviceIoControl?
msg137458 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-06-01 20:44
I have this working when you stat the symlink from the directory it was created or above...but oddly it does not work when you open a symlink below the directory it exists in.

DeviceIoControl isn't used for reparse tag handling anymore, and I'm using GetFinalPathNameByHandle similar to how it was used in previous versions of this code. There's still a case to handle and maybe some cleanup, but there's decent progress and hope that I can get it done very soon.
msg137499 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-06-02 23:15
Attached is a complete patch. All tests pass.

Lib/test/support.py
 * Handle AttributeError, which Hirokazu noticed on pre-XP machines

Lib/test/test_os.py
 * This sets up a three-deep directory tree and creates a symbolic link in the middle (second) directory.
 * os.stat calls are tested from below, equal to, and above the symbolic link to make sure they work. This is what was broken with the previous os.stat implementation on Windows, as it used a method which only provided correct information (the reparse tag) when the stat call was made from the directory where the link originates from.

Modules/posixmodule.c
 * win32_read_link, now called win32_get_reparse_tag, was changed to just get the reparse tag and that's it. The tag is now only used when setting the mode attribute based on if the tag shows that this is a symlink directory or file.

 * The GetFinalPathNameByHandle dynamic loading and preparing was moved up in the file since it is again used for stat calls.

 * I removed the major duplication of stat implementations for both narrow and wide paths. Now there is just one implementation, win32_xstat_impl, which accepts a wide path and does the same work.

 * win32_xstat_impl now takes an approach where if there's a symlink to be traversed, e.g., for an os.stat call, the path is then re-opened without the flag to open the reparse tag, effectively following the link path. Then we pass the newly opened file handle to GetFinalPathNameByHandle and we can get back the target path, which we then pass recursively. There's no more depth stuff - we either try to follow the link in the above fashion or not.

 * win32_lstat was never being called from anywhere so it was removed.

 * win32_stat, which takes a narrow path, is now converted early to a wide path and passed into the win32_xstat_impl rather than maintaining two of the same functions like before.
msg137796 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-06-07 09:46
Any Windows person going to review this one?
msg137800 - (view) Author: Tim Golden (tim.golden) (Python committer) Date: 2011-06-07 10:11
I'm just patching a clone now.
msg137801 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-07 10:59
issue12084.diff:
 - test_os pass with patched Python 3.3 on Windows 7 64 bits (and on Linux, Debian Sid)
 - in test_os: "finally: os.remove(file1)"  fails with file1 doesn't exist: a new try/finally should be used after "with open(file1, "w") as f:" block, or support.unlink() should be used instead
 - the patch doesn't apply cleany on the default Mercurial branch (fail on posixmodule.c), can you update your patch?
 - posixmodule.c still refer to win32_lstat() in a comment (a comment just before the removed win32_lstat() function)
 - win32_stat() must fail if the conversion to wide character failed. win32_stat() should use PyUnicode_DecodeFSDefault() instead of mbstowcs_s(). PyUnicode_DecodeFSDefault() allocates the memory and handles errors correctly: raise a nice Python error on decoding error (it uses the strict error handler for MBCS encoding, the ANSI code page).

win32_stat() decodes "manually" the filename from the ANSI code page and use the wide character API, instead of using the ANSI API. It is a little bit different than what is done for other functions of the posix module: other functions use the ANSI API, as specified by the PEP 277.
http://www.python.org/dev/peps/pep-0277/

Well, this PEP was written in 2002 when the default string type in Python (2) was the byte string. In 2011, with Python 3, the default string type is a character string and I can easily understand that you prefer to simplify the code by using a single string type. I also remember a discussion about deprecating byte filenames in Python 3.

I would prefer to discuss this point (decode byte string from the ANSI code page) on the mailing, and maybe also update the PEP 277.

The main question for me is how the ANSI API handles undecodable bytes: does it raise an error or ignore them? For stat(), ignoring undecodable bytes means that stat() will raise a "file not found error".

Most ANSI code pages never fail with a decoding error because they are 8 bits encoding and all bytes are mapped to characters. They are some multibyte code pages (like UTF-8), but I don't think that any Windows use such code page *by default*. I don't even know if it's possible to use a multibyte code page as the ANSI code page.

(I didn't check the symlink algorithm, I only cares about Unicode :-D)
msg137803 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-07 11:07
Win32SymlinkTests.test_rmdir_on_directory_link_to_missing_target() pass on my Windows 7 64 bits VM (with and without the patch), but is skipped:

    @unittest.skip("currently fails; consider for improvement")
    def test_rmdir_on_directory_link_to_missing_target(self):
        self._create_missing_dir_link()
        # consider allowing rmdir to remove directory links
        os.rmdir(self.missing_link)

On which OS does it fail? The skip may be changed to only skipped some Windows versions?
msg137805 - (view) Author: Tim Golden (tim.golden) (Python committer) Date: 2011-06-07 11:11
All expected tests pass on 3.2 branch (Win7 32-bit). The patch doesn't 
apply cleanly to trunk; not sure if it's expected to or not. The code
looks ok on paper. I'll leave Victor to quibble over the Unicode stuff...
msg137813 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-07 11:47
Oh oh. The situation is not a simple as expected. 3 functions only accept Unicode strings and 3 other functions decode "manually" byte strings from the ANSI code page.

--

chdir(), rmdir(), unlink(), access(), chmod(), link(), listdir(), _getfullpath(), mkdir(), utime(), open(), startfile(), unlink(), stat() and lstat() use the ANSI or the wide character API depending on the type of the input arguments.

rename(), symlink() and putenv() only use the wide character API. They use convert_to_unicode() to convert input arguments to Unicode. Byte strings are decoded from the file system encoding using the strict error error.

system(), readlink() and unsetenv() only accept Unicode strings.

--

Possible bugs.

unlink() uses DeleteFileA() for byte string and Py_DeleteFileW() for unicode. Py_DeleteFileW() has a special case for symbolic links:

/* override the default DeleteFileW behavior so that directory
symlinks can be removed with this function, the same as with
Unix symlinks */

unsetenv() encodes the variable name to UTF-8, which looks wrong to me.

startfile() encodes the second argument (operation) to UTF-8 and then decode it from ASCII to get a wchar_t* string. Why not using simply the "u" format to support more than ASCII characters?

It's surprising that unsetenv() only accept Unicode strings, because this Python function uses a C function with a bytes API (unsetenv).
msg137826 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-06-07 13:28
I should have specified - the patch is for 3.2. 2.7 code in this area is different but I'll get to that, and default/3.3 will also get this patch but it'll probably require some tweaking.

I guess I went overboard on the refactoring which is why Victor is seeing errors and has the issues with Unicode related stuff. I'll have to put back in the narrow version of the win32_xstat_impl function for now. I'm not comfortable messing with something new and fancy this late in the cycle so I'll just go back to a structure that worked and figure out something better later on.


I'd rather figure out the test_rmdir_on_directory_link_to_missing_target stuff on another issue - this issue has already held up the release long enough.
msg137841 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-06-07 15:38
Victor - does the new patch pass all tests for you on 3.2?
msg138256 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-13 15:18
issue12084_v2.diff doesn't patch os.lstat(bytes): os.lstat(bytes) should call win32_lstat() (which is removed by this patch) instead of stat().

test_os doesn't test os.stat()/os.lstat() with byte filenames. You can for example replace Win32SymlinkTests.check_stat() method by:

    def check_stat(self, link, target):
        self.assertEqual(os.stat(link), os.stat(target))
        self.assertNotEqual(os.lstat(link), os.stat(link))

        bytes_link = os.fsencode(link)
        self.assertEqual(os.stat(bytes_link), os.stat(target))
        self.assertNotEqual(os.lstat(bytes_link), os.stat(bytes_link))
msg138259 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-13 15:37
> os.lstat(bytes) should call win32_lstat()
> (which is removed by this patch) instead of stat()

Short history:
- 0a1baa619171: Fix #10027. st_nlink not set on Windows calls to os.stat/lstat
- 730b728e5aef: Implement #1578269. Patch by Jason R. Coombs.

730b728e5aef adds win32_lstat(), but it doesn't patch posix_lstat(). So your patch is not a regression, it's just that it was never supported.
msg138263 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-06-13 16:41
Here's a cleaned up patch which includes the test and lstat change Victor mentioned. I think this addresses everything we need to cover here. Can you run the tests once more with this new patch?
msg138274 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-13 19:45
> Here's a cleaned up patch which includes the test and lstat change
> Victor mentioned.

The test pass on Windows Seven 64 bits.
msg138277 - (view) Author: Roundup Robot (python-dev) Date: 2011-06-13 21:10
New changeset 1a3e8db28d49 by Brian Curtin in branch '3.2':
Fix #12084. os.stat on Windows wasn't working properly with relative symlinks.
http://hg.python.org/cpython/rev/1a3e8db28d49

New changeset c04c55afbf81 by Brian Curtin in branch 'default':
Merge from 3.2 for Issue #12084.
http://hg.python.org/cpython/rev/c04c55afbf81
msg138279 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-06-13 21:16
Well apparently that killed the XP build bots. Does anyone currently have access to XP that could test this?
msg138280 - (view) Author: Jason R. Coombs (jason.coombs) * (Python committer) Date: 2011-06-13 21:18
I'll take a look.
msg138281 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-06-13 21:22
It has something to do with the GetFinalPathNameByHandle dance.
msg138288 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-06-13 23:48
How about this patch?

We yield the GIL in posix_do_stat, so as Antoine pointed out in IRC, we were calling PyErr_SetString without having the GIL. I think this is the correct fix as I've stepped through the code in Visual Studio, forcing it to take this branch when on Win7, but I don't currently have the ability to run on XP.
msg138303 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2011-06-14 08:27
I tried issue12084_XP.diff, but os.stat()/os.lstat() always failed with following message because it raises exception on top of it when running on XP.

Python 3.2.1rc1+ (default, Jun 14 2011, 16:26:11) [MSC v.1200 32 bit (Intel)] on
 win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
[53981 refs]
>>> os.stat("src")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
WindowsError: [Error 127] 指定されたプロシージャが見つかりません。: 'src'
[54014 refs]
msg138304 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2011-06-14 08:34
I created several patches.
quick1.patch: os.stat() traverses junction on Vista/7, and raises error on XP.
quick2.patch: os.stat() never traverse junction on all windows.
quick3.patch: os.stat() should traverse junction os Vista/7, but doesn't on XP.

There are many patches because I don't know how to treat junction on os.stat().
msg138323 - (view) Author: Roundup Robot (python-dev) Date: 2011-06-14 15:06
New changeset 23e14af406df by Brian Curtin in branch 'default':
Merge 3.2 - update to the fix for #12084
http://hg.python.org/cpython/rev/23e14af406df
msg138324 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-06-14 15:08
I think quick3 is the way to go - checked in, we'll see how the buildbots react.

1524a60016d0 is the changeset for the 3.2 checkin (forgot to mention the issue# there)
msg138332 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-06-14 15:33
Just had a successful XP buildbot run: http://www.python.org/dev/buildbot/all/builders/x86%20XP-5%203.2/builds/304
History
Date User Action Args
2011-06-14 15:33:03brian.curtinsetstatus: open -> closed
resolution: fixed
messages: + msg138332

stage: patch review -> resolved
2011-06-14 15:08:46brian.curtinsetmessages: + msg138324
2011-06-14 15:06:51python-devsetmessages: + msg138323
2011-06-14 08:35:25ocean-citysetfiles: + quick3.patch
2011-06-14 08:34:46ocean-citysetfiles: + quick2.patch
2011-06-14 08:34:01ocean-citysetfiles: + quick1.patch

messages: + msg138304
2011-06-14 08:27:29ocean-citysetmessages: + msg138303
2011-06-13 23:48:48brian.curtinsetfiles: + issue12084_XP.diff

messages: + msg138288
2011-06-13 21:22:12brian.curtinsetmessages: + msg138281
2011-06-13 21:18:05jason.coombssetfiles: + smime.p7s

messages: + msg138280
2011-06-13 21:16:37brian.curtinsetmessages: + msg138279
2011-06-13 21:10:52python-devsetnosy: + python-dev
messages: + msg138277
2011-06-13 19:45:53hayposetmessages: + msg138274
2011-06-13 16:41:06brian.curtinsetfiles: + issue12084_v3.diff

messages: + msg138263
2011-06-13 15:37:45hayposetmessages: + msg138259
2011-06-13 15:18:09hayposetmessages: + msg138256
2011-06-07 15:38:43brian.curtinsetfiles: + issue12084_v2.diff

messages: + msg137841
2011-06-07 13:28:29brian.curtinsetmessages: + msg137826
2011-06-07 11:47:28hayposetmessages: + msg137813
2011-06-07 11:11:57tim.goldensetmessages: + msg137805
2011-06-07 11:07:22hayposetmessages: + msg137803
2011-06-07 10:59:55hayposetnosy: + haypo
messages: + msg137801
2011-06-07 10:11:16tim.goldensetnosy: + tim.golden
messages: + msg137800
2011-06-07 09:46:43georg.brandlsetmessages: + msg137796
2011-06-02 23:15:57brian.curtinsetkeywords: + needs review
files: + issue12084.diff
messages: + msg137499
2011-06-01 20:44:04brian.curtinsetmessages: + msg137458
2011-05-29 23:22:04ocean-citysetfiles: - patches_v2.tar.gz
2011-05-29 23:21:47ocean-citysetfiles: + patch_v2.txt
2011-05-29 23:17:59ocean-citysetfiles: + patch_v2.patch
2011-05-28 13:04:07jason.coombssetmessages: + msg137127
2011-05-28 01:07:51brian.curtinsetmessages: + msg137103
2011-05-26 20:24:40brian.curtinsetmessages: + msg137005
2011-05-24 14:16:42brian.curtinsetmessages: + msg136750
2011-05-24 02:40:35brian.curtinsetmessages: + msg136716
2011-05-24 02:01:39brian.curtinsetfiles: + test.diff
keywords: + patch
messages: + msg136711
2011-05-19 03:00:01brian.curtinsetmessages: + msg136266
2011-05-18 16:49:05jason.coombssetmessages: + msg136253
2011-05-18 13:40:45brian.curtinsetassignee: brian.curtin
messages: + msg136244
2011-05-18 07:55:52georg.brandlsetmessages: + msg136209
2011-05-17 03:12:06brian.curtinsetmessages: - msg136132
2011-05-17 00:57:58brian.curtinsetmessages: + msg136132
2011-05-16 19:45:53santa4ntsetnosy: + santa4nt
2011-05-16 16:08:48brian.curtinsetnosy: + brian.curtin
messages: + msg136109
2011-05-16 06:39:56eric.smithsetnosy: + jason.coombs, eric.smith
2011-05-16 06:02:36georg.brandlsetnosy: + loewis
messages: + msg136069
2011-05-16 03:26:58nadeem.vawdasetnosy: + nadeem.vawda
2011-05-16 02:58:51ocean-citylinkissue9927 superseder
2011-05-16 02:53:22ocean-citycreate