os.listdir breaks with literal paths #57443
During the development of an application that needed to write paths longer than 260 chars, we opted to use the \\?\ prefix as per http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#maxpath. When working with literal paths, the os.listdir function raises the following traceback:
>>> import os
>>> test = r'\\?\C:\Python27'
>>> os.listdir(test)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: '\\\\?\\C:\\Python27/*.*'
The reason for this is that the implementation of listdir appends '/' to the path if it does not already end with os.path.sep, which FindFirstFile does not like. This is an inconsistency in the OS, but it can be easily fixed (see attached patch). |
I'd also like to point out that Unicode paths are handled correctly in both 2.7.x and 3.x:
Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.listdir(u'\\\\?\\D:\\Temp\\tempdir')
[u'sub1', u'test1.txt']
>>> os.listdir('\\\\?\\D:\\Temp\\tempdir')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: '\\\\?\\D:\\Temp\\tempdir/*.*'
Python 3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.listdir('\\\\?\\D:\\Temp\\tempdir')
['sub1', 'test1.txt']
>>> os.listdir(b'\\\\?\\D:\\Temp\\tempdir')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: '\\\\?\\D:\\Temp\\tempdir/*.*'
The problem lies only in the code handling narrow string paths: as the error message shows, it appends '/*.*'. To be consistent, we should use '\\'. |
There are also several other edge cases to be taken care of:
Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.listdir(r'\\?\C:\Python27/')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: '\\\\?\\C:\\Python27/*.*'
>>> os.listdir(r'\\?\C:/Python27\Lib')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
WindowsError: [Error 3] The system cannot find the path specified: '\\\\?\\C:/Python27\\Lib/*.*' |
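The second failure above is consistent with the documented behavior that Windows does not convert forward slashes to backslashes once a path carries the \\?\ prefix. A minimal sketch of normalizing such a path before handing it to the OS follows; the helper name `normalize_extended` is hypothetical and is not part of the patch discussed in this issue.

```python
# Hypothetical helper, not taken from the patch under discussion.
EXTENDED_PREFIX = '\\\\?\\'  # the four characters \\?\


def normalize_extended(path):
    """Replace forward slashes in an extended-length path with backslashes.

    Paths without the \\?\ prefix are returned unchanged, because the
    Windows API normalizes separators for them itself.
    """
    if path.startswith(EXTENDED_PREFIX):
        rest = path[len(EXTENDED_PREFIX):]
        return EXTENDED_PREFIX + rest.replace('/', '\\')
    return path


print(normalize_extended(r'\\?\C:/Python27\Lib'))
```

This only rewrites separators; it does not resolve `.` or `..` components, which are also taken literally in extended-length paths.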
Additionally, there might be issues in other APIs when handling extended path lengths:
D:\Temp\tempdir>dir
Directory of D:\Temp\tempdir
10/24/2011 04:22 PM <DIR> .
D:\Temp\tempdir>cd AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
D:\Temp\tempdir\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Directory of D:\Temp\tempdir\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
10/24/2011 04:28 PM <DIR> .
Python 3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> subdir = 'B'*13
>>> os.path.isdir(subdir)
False
>>> os.getcwd()
'D:\\Temp\\tempdir\\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAA'
>>> subdir_abs = os.path.join(os.getcwd(), subdir)
>>> os.path.isdir(subdir)
False
>>> subdir_ext = r'\\?\%s' % subdir_abs
>>> os.path.isdir(subdir_ext)
True
In the above example, perhaps raising ValueError('path too long') would be better than returning False? |
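The suggestion above could be sketched roughly as follows. This is an illustration only, not what os.path.isdir does: the function name `check_path_length` is hypothetical, the 260-character limit is the one cited at the top of this issue, and treating it as including a terminating NUL (so 259 usable characters) is an assumption. `ntpath` is used so the logic can be exercised on any platform.

```python
import ntpath

MAX_PATH = 260               # limit cited in the report; assumed to include the NUL
EXTENDED_PREFIX = '\\\\?\\'  # the four characters \\?\


def check_path_length(path, cwd):
    """Return the full path, or raise ValueError if it is too long.

    Hypothetical sketch: `cwd` is passed in explicitly instead of calling
    os.getcwd(), so the function stays a pure string computation.
    """
    full = ntpath.join(cwd, path)
    if not full.startswith(EXTENDED_PREFIX) and len(full) >= MAX_PATH:
        raise ValueError('path too long')
    return full
```

With a long current directory like the one in the session above, a relative name such as `'B' * 13` would raise, while the same location spelled with the \\?\ prefix would pass through.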
Indeed, in our code we had to write a number of wrappers around the os calls to be able to work with long paths on Windows. At the moment, working with long paths on Windows in Python is broken in a number of places and is a PITA. |
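For readers wondering what such a wrapper might look like, here is a minimal sketch. It is not the code referred to in the comment above; the name `to_extended_path` is hypothetical, and the handling of UNC paths via the \\?\UNC\ form follows the MSDN page linked in the original report. It uses `ntpath` so the string logic runs on any platform.

```python
import ntpath

EXTENDED_PREFIX = '\\\\?\\'  # the four characters \\?\
DEVICE_PREFIX = '\\\\.\\'    # the four characters \\.\


def to_extended_path(path):
    """Return `path` in extended-length form (hypothetical helper).

    Only absolute paths are accepted, because extended-length paths
    cannot be relative.
    """
    if path.startswith((EXTENDED_PREFIX, DEVICE_PREFIX)):
        return path  # already prefixed, or a device path: leave alone
    if not ntpath.isabs(path):
        raise ValueError('expected an absolute path: %r' % (path,))
    # normpath converts '/' to '\\' and collapses '.' and '..', which
    # Windows no longer does once the prefix is present.
    path = ntpath.normpath(path)
    if path.startswith('\\\\'):
        # UNC path: \\server\share\... becomes \\?\UNC\server\share\...
        return EXTENDED_PREFIX + 'UNC\\' + path[2:]
    return EXTENDED_PREFIX + path


print(to_extended_path(r'C:/Python27\Lib'))
```

On Windows, the result would then be passed to calls such as os.listdir or os.path.isdir in place of the plain path.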
Thanks for the patch. Is there a reason you don't use shutil.rmtree in tearDown()? |
In the case of my patch (I don't know about santa4nt's case), I did not use shutil.rmtree because it was not used in the other tests and I wanted to be consistent and not add a new import. Certainly, if there is no issue with that, we should use it. |
Even if we decide not to convert any forward slashes, listdir() appends u"\\*.*" when the input is unicode but "/*.*" when it is not, before passing the path to the Windows API. Hence the inconsistency and the problem Manuel saw. IMO, his patch shouldn't have special-cased paths starting with r"\\?\"; it should just consistently append "\\*.*", unicode or not. |
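To make the inconsistency concrete, the pattern construction can be modeled in a few lines of Python. This is a simplified model, not the C code in posixmodule.c: the function name `search_pattern` is hypothetical, and the rule that no separator is added after a trailing backslash, slash, or colon is an assumption inferred from the tracebacks in this issue.

```python
def search_pattern(path, sep):
    """Build the wildcard pattern handed to FindFirstFile (simplified model).

    `sep` is the separator appended when the path does not already end
    with one: '/' models the narrow-string branch as reported, '\\'
    models the proposed consistent behavior.
    """
    if path and path[-1] not in ('\\', '/', ':'):
        path += sep
    return path + '*.*'


# Narrow-string behavior as reported: rejected for \\?\ paths.
print(search_pattern(r'\\?\C:\Python27', '/'))   # \\?\C:\Python27/*.*
# Proposed behavior: same separator regardless of string type.
print(search_pattern(r'\\?\C:\Python27', '\\'))  # \\?\C:\Python27\*.*
```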
This issue is getting messy. I declare that this issue is *only* about the original problem reported in msg146031. When that is fixed, this issue will be closed, and any further issues need to be reported separately. As for the original problem, ISTM that the right fix is to replace
with
No further change should be necessary. |
Addressing patch comments. |
Fair enough. Simplifying. |
+ if (ch != '\\' && ch != '/' && ch != ':')
I don't understand this change in issue13234_py33_v4.patch (the change looks useless). |
It's pedantic correctness on my part. SEP and ALTSEP are defined as wide strings L'\\' and L'/' respectively. Their usage in the unicode conditional branch and the bytes conditional branch seems to have been reversed. |
I added minor comments in Rietveld. Santoso Wijaya, can you please submit a contributor form? http://python.org/psf/contrib/contrib-form/ |
Done. |
Santoso Wijaya: sorry for the delay. If you'd like to retarget your patch against the tip, I'm happy to apply. At this stage, 3.3 and 3.4 seem the appropriate branches. |
Here you go. |
New changeset 12aaa2943791 by Tim Golden in branch 'default': New changeset 5c187d6162c5 by Tim Golden in branch 'default': |
Applied. Thanks for the patch. |