classification
Title: Implement os.link on Windows
Type: enhancement Stage: resolved
Components: Extension Modules, Windows Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: 7566 Superseder:
Assigned To: brian.curtin Nosy List: amaury.forgeotdarc, brian.curtin, giampaolo.rodola, jcea, ocean-city
Priority: normal Keywords: needs review, patch

Created on 2010-06-02 18:47 by brian.curtin, last changed 2011-01-04 04:13 by brian.curtin. This issue is now closed.

Files
File name Uploaded Description Edit
issue8879.diff brian.curtin, 2010-09-23 21:27 review
issue8879_v2.diff brian.curtin, 2010-09-29 18:51 review
py3k_quick_hack_for_issue8879.patch ocean-city, 2010-10-01 21:00 review
issue8879_unicode.diff brian.curtin, 2010-11-28 23:39
Messages (20)
msg106908 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-06-02 18:47
Add os.link support for Windows


(mostly a reminder to myself to finish the patch I have)
msg117236 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-09-23 21:27
Here's a patch of the functionality. The doc change is small and easy, I'll add it shortly.

One oddity is that os.path.samefile is False for a link and its source, but True for a symlink and its source. I find that to be quite odd, but stepping through the code I don't see where this is anything we are doing. I haven't been able to figure out if this is really expected behavior from Windows. When the files are open, os.path.sameopenfile shows that the link and its source are the same, which is expected.
msg117238 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-09-23 22:02
With following implementation, issamefile return True
for hard link. I heard GetFinalPathNameByHandle
returns different paths for hard links.

>>> import nt, os
>>> def issamefile(path1, path2):
...     fd1 = os.open(path1, os.O_RDONLY)
...     fd2 = os.open(path2, os.O_RDONLY)
...     fi1 = nt._getfileinformation(fd1)
...     fi2 = nt._getfileinformation(fd2)
...     os.close(fd1)
...     os.close(fd2)
...     return fi1 == fi2
...
>>> issamefile("src.txt", "lnk.txt")
True
msg117325 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-09-24 18:36
In a round-about way, that's what the tests do via sameopenfile.

Maybe the behavior of os.path.samefile for Windows hard links should be documented? Possibly just a small note explaining that os.path.sameopenfile is an alternate solution due to how GetFinalPathNameByHandle works.
msg117642 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-09-29 18:51
Now that I think of it, that behavior is expected. Hard links are *supposed* to be different directory entries, so they would come out of that function as-is, and the current tests are correctly implemented.

Uploaded to http://codereview.appspot.com/2290042
msg117828 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-10-01 18:58
test_tarfile fails with this patch.
msg117834 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-10-01 21:00
I think this is because os.stat and os.lstat doesn't set st_nlink.
It is only set via os.fstat. I'll attach quick hack patch. (Again,
this is just quick hack patch)

I confirmed this passes test_os and test_tarfile.
msg118013 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-10-05 14:39
I created #10027 for the st_nlink issue to handle it separately.
msg122305 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-11-24 20:26
Removing link to #10027. It's fixed for py3k but the issue should stay open for backport to other branches.
msg122306 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-11-24 20:26
Fixed in r86733.
msg122319 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-11-25 00:11
The patch uses the ANSI version, and converts the filename from unicode to bytes; this will fail for names that cannot be encoded with the "mbcs" codec.

All other functions in the posix module first try the Wide version of the win32 API, and use the ANSI version as a fallback, when the argument is a bytes string. PyUnicode_FSConverter should be avoided on Windows. See posix_mkdir() for a good example.

Here is an example that fails on a Western Windows (where ANSI=cp1252).  Note that even the file name is not encodable in mbcs, it is correctly displayed in the Explorer for example.

>>> name = "\u65e5\u672c"     # "Japan"
>>> open(name, "w").close()   # Appears correctly in the explorer
>>> import os
>>> os.link(name, name + "_1")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character
msg122341 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-11-25 07:58
r86733 introduces st_ino setup in attribute_data_to_stat(),
but Fild ID is not guaranteed to be same when file is not opened.
So I think it's meaningful only for os.fstat().

Please see
http://msdn.microsoft.com/en-us/library/aa363788%28v=VS.85%29.aspx
msg122363 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-11-25 14:52
I'll come up with a patch for Amaury's message.

Hirokazu - I didn't see that MSDN page, thanks. Without st_ino, I'll need to find a way around the block of lines 1941-1954 in Lib/tarfile.py. That's what was causing a test failure in the first place because it ends up assuming Windows files are always REGTYPE, which sets some things later on in that function differently than they should be for links, namely tarinfo.size.
msg122415 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-11-25 22:49
Umm, I'm not sure how to fix this yet, but
if we create the function like os._stat_with_open_handle()
which returns stat struct and open handle on windows,
I think we can set/use st_ino. (Because FileID won't change
while file is opened) I'm not sure if it is appropriate API
change or not, though. :-(
msg122748 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-11-28 23:39
Amaury -- how does issue8879_unicode.diff look? Made the suggested change and added a test.
msg122749 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-11-28 23:47
Looks good, except that win32_error("link", NULL) should be called instead of posix_error(). errno is not guaranteed to be correctly set by the win32 api, and GetLastError() ususally gives more accurate errors.
msg122750 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-11-29 00:01
Committed in r86854 with your win32_error suggestion. Thanks for your help and input.
msg122767 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2010-11-29 04:25
This patch seems to break quite a few buildbots.

for instance: http://www.python.org/dev/buildbot/all/builders/x86%20OpenIndiana%203.x/builds/81/steps/test/logs/stdio

"""
test_mbcs_name (test.test_os.LinkTests) ... test test_os failed -- Traceback (most recent call last):
  File "/export/home/buildbot/32bits/3.x.cea-indiana-x86/build/Lib/test/test_os.py", line 893, in test_mbcs_name
    self._test_link(self.file1, self.file2)
  File "/export/home/buildbot/32bits/3.x.cea-indiana-x86/build/Lib/test/test_os.py", line 876, in _test_link
    with open(file1, "w") as f1:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 15-16: ordinal not in range(128)
"""
msg122770 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-11-29 04:33
Maybe the test should be Windows-only?
I don't really know the answer...things tend to fall apart when I get involved with Unicode, encoding, codecs, etc. :/
msg122771 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2010-11-29 04:53
I am not familiar with this either, but better to restrict the test to platforms with actual mbcs encoding.
History
Date User Action Args
2011-01-04 04:13:06brian.curtinsetstatus: open -> closed
nosy: jcea, amaury.forgeotdarc, ocean-city, giampaolo.rodola, brian.curtin
stage: patch review -> resolved
2010-11-29 04:53:20jceasetmessages: + msg122771
2010-11-29 04:33:34brian.curtinsetmessages: + msg122770
2010-11-29 04:25:57jceasetnosy: + jcea
messages: + msg122767
2010-11-29 00:01:28brian.curtinsetmessages: + msg122750
2010-11-28 23:47:59amaury.forgeotdarcsetmessages: + msg122749
2010-11-28 23:39:08brian.curtinsetfiles: + issue8879_unicode.diff

messages: + msg122748
stage: resolved -> patch review
2010-11-26 00:42:25ocean-citysetmessages: - msg122421
2010-11-26 00:26:51ocean-citysetmessages: + msg122421
2010-11-25 22:49:40ocean-citysetmessages: + msg122415
2010-11-25 14:52:13brian.curtinsetmessages: + msg122363
2010-11-25 07:58:01ocean-citysetmessages: + msg122341
2010-11-25 00:11:31amaury.forgeotdarcsetstatus: closed -> open
nosy: + amaury.forgeotdarc
messages: + msg122319

2010-11-24 20:26:44brian.curtinsetstatus: open -> closed

messages: + msg122306
stage: patch review -> resolved
2010-11-24 20:26:23brian.curtinsetresolution: fixed
dependencies: - os.lstat/os.stat don't set st_nlink on Windows
messages: + msg122305
2010-10-05 14:39:44brian.curtinsetdependencies: + os.lstat/os.stat don't set st_nlink on Windows
messages: + msg118013
2010-10-01 21:00:36ocean-citysetfiles: + py3k_quick_hack_for_issue8879.patch

messages: + msg117834
2010-10-01 18:58:39brian.curtinsetmessages: + msg117828
2010-09-29 18:51:14brian.curtinsetfiles: + issue8879_v2.diff

messages: + msg117642
2010-09-24 18:36:10brian.curtinsetmessages: + msg117325
2010-09-23 22:02:24ocean-citysetnosy: + ocean-city
messages: + msg117238
2010-09-23 21:27:09brian.curtinsetkeywords: + patch, needs review
files: + issue8879.diff
messages: + msg117236

stage: needs patch -> patch review
2010-07-23 15:06:25brian.curtinsetdependencies: + Add ntpath.sameopenfile support for Windows
2010-06-02 21:47:22giampaolo.rodolasetnosy: + giampaolo.rodola
2010-06-02 18:47:36brian.curtincreate