msg146031 - (view) |
Author: Manuel de la Pena (mandel) |
Date: 2011-10-20 16:54 |
During the development of an application that needed to write paths longer than 260 chars we opted to use \\?\ as per http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#maxpath.
When working with literal paths the following the os.listdir funtion would return the following trace:
>>> import os
>>> test = r'\\?\C:\Python27'
>>> os.listdir(test)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: '\\\\?\\C:\\Python27/*.*'
The reason for this is that the implementation of listdir appends '/' at the end of the path if os.path.sep is not present at the end of it which FindFirstFile does not like. This is a inconsistency from the OS but it can be easily fixed (see attached patch).
|
msg146338 - (view) |
Author: Santoso Wijaya (santoso.wijaya) * |
Date: 2011-10-24 23:33 |
I'd also like to point out that Unicode path is handled correctly in both 2.7.x
and 3.x:
Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.listdir(u'\\\\?\\D:\\Temp\\tempdir')
[u'sub1', u'test1.txt']
>>> os.listdir('\\\\?\\D:\\Temp\\tempdir')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
WindowsError: [Error 123] The filename, directory name, or volume label syntax i
s incorrect: '\\\\?\\D:\\Temp\\tempdir/*.*'
Python 3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit (AMD64)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.listdir('\\\\?\\D:\\Temp\\tempdir')
['sub1', 'test1.txt']
>>> os.listdir(b'\\\\?\\D:\\Temp\\tempdir')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
WindowsError: [Error 123] The filename, directory name, or volume label syntax i
s incorrect: '\\\\?\\D:\\Temp\\tempdir/*.*'
The problem only lies in the code handling narrow string paths. If you look at
(py33) posixmodule.c:2545-6, the bare path is appended with L"\\*.*". Whereas
in the narrow string case, posixmodule.c:2625-6, the bare path is appended with
"/*.*", instead.
To be consistent, we should use '\\'.
|
msg146339 - (view) |
Author: Santoso Wijaya (santoso.wijaya) * |
Date: 2011-10-24 23:34 |
There are also several other edge cases to be taken care of:
Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.listdir(r'\\?\C:\Python27/')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
WindowsError: [Error 123] The filename, directory name, or volume label syntax i
s incorrect: '\\\\?\\C:\\Python27/*.*'
>>> os.listdir(r'\\?\C:/Python27\Lib')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
WindowsError: [Error 3] The system cannot find the path specified: '\\\\?\\C:/Py
thon27\\Lib/*.*'
|
msg146341 - (view) |
Author: Santoso Wijaya (santoso.wijaya) * |
Date: 2011-10-25 00:00 |
Additionally, there might be issues in other APIs when handling with extended path lengths:
D:\Temp\tempdir>dir
Volume in drive D is Data
Volume Serial Number is 7E3D-EC81
Directory of D:\Temp\tempdir
10/24/2011 04:22 PM <DIR> .
10/24/2011 04:22 PM <DIR> ..
10/24/2011 04:28 PM <DIR> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
10/24/2011 01:31 PM <DIR> sub1
10/24/2011 03:39 PM 262 test1.txt
1 File(s) 262 bytes
4 Dir(s) 244,483,321,856 bytes free
D:\Temp\tempdir>cd AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAA
D:\Temp\tempdir\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAA>dir
Volume in drive D is Data
Volume Serial Number is 7E3D-EC81
Directory of D:\Temp\tempdir\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAA
10/24/2011 04:28 PM <DIR> .
10/24/2011 04:28 PM <DIR> ..
10/24/2011 04:14 PM 0 1234567.txt
10/24/2011 04:28 PM <DIR> BBBBBBBBBBBBB
1 File(s) 0 bytes
3 Dir(s) 244,483,321,856 bytes free
Python 3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit (AMD64)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> subdir = 'B'*13
>>> os.path.isdir(subdir)
False
>>> os.getcwd()
'D:\\Temp\\tempdir\\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAA'
>>> subdir_abs = os.path.join(os.getcwd(), subdir)
>>> os.path.isdir(subdir)
False
>>> subdir_ext = r'\\?\%s' % subdir_abs
>>> os.path.isdir(subdir_ext)
True
In the above example, perhaps a ValueError('path too long') is better than returning False?
|
msg146346 - (view) |
Author: Manuel de la Pena (mandel) |
Date: 2011-10-25 08:54 |
Indeed, in our code we had to write a number of wrappers around the os calls to be able to work with long path on Windows. At the moment working with long paths on windows and python is broken in a number of places and is a PITA to work with.
|
msg146350 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-10-25 09:47 |
Thanks for the patch. Is there a reason you don't use shutil.rmtree in tearDown()?
I don't know if it's our business to convert forward slashes passed by the user. Also, general support for extended paths is another can of worms ;)
|
msg146353 - (view) |
Author: Manuel de la Pena (mandel) |
Date: 2011-10-25 10:12 |
In case of my patch (I don't know about santa4nt case) I did not use shutil.remove because it was not used in the other tests and I wanted to be consistent and not add a new import. Certainly if there is not an issue with that we should use it.
|
msg146375 - (view) |
Author: Santoso Wijaya (santoso.wijaya) * |
Date: 2011-10-25 15:48 |
Even if we decide not to convert any forward slash, listdir() adds u"\\*.*" when the input is unicode, but it adds "/*.*" when it is not, before passing it off to Windows API. Hence the inconsistency and the problem Manuel saw.
IMO, his patch shouldn't have differentiated if the path starts with r"\\?\" and just be consistent with adding "\\*.*", unicode or not.
|
msg146388 - (view) |
Author: Martin v. Löwis (loewis) * |
Date: 2011-10-25 18:33 |
This issue is getting messy. I declare that this issue is *only* about the original problem reported in msg146031. When that is fixed, this issue will be closed, and any further issues need to be reported separately.
As for the original problem, ISTM that the right fix is to replace
namebuf[len++] = '/';
with
namebuf[len++] = '\\';
No further change should be necessary.
|
msg146389 - (view) |
Author: Santoso Wijaya (santoso.wijaya) * |
Date: 2011-10-25 18:34 |
Addressing patch comments.
|
msg146391 - (view) |
Author: Santoso Wijaya (santoso.wijaya) * |
Date: 2011-10-25 18:42 |
Fair enough. Simplifying.
|
msg146398 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2011-10-25 20:19 |
- if (ch != SEP && ch != ALTSEP && ch != ':')
+ if (ch != '\\' && ch != '/' && ch != ':')
I don't understand this change in issue13234_py33_v4.patch (the change looks to be useless).
|
msg146400 - (view) |
Author: Santoso Wijaya (santoso.wijaya) * |
Date: 2011-10-25 20:51 |
> I don't understand this change in issue13234_py33_v4.patch (the change looks to be useless).
It's pedantic correctness on my part. SEP and ALTSEP are defined as wide strings L'\\' and L'/' respectively. Their usage in the unicode conditional branch and the bytes conditional branch seem to have been reversed.
|
msg181517 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2013-02-06 13:21 |
I added minor comments in Rietveld.
Santoso Wijaya, can you please submit a contributor form?
http://python.org/psf/contrib/contrib-form/
http://python.org/psf/contrib/
|
msg184822 - (view) |
Author: Santoso Wijaya (santoso.wijaya) * |
Date: 2013-03-21 01:08 |
Done.
|
msg201140 - (view) |
Author: Tim Golden (tim.golden) * |
Date: 2013-10-24 14:13 |
Santoso Wijaya: sorry for the delay. If you'd like to retarget your patch against the tip, I'm happy to apply. At this stage, 3.3 and 3.4 seem the appropriate branches.
|
msg201161 - (view) |
Author: Santoso Wijaya (santoso.wijaya) * |
Date: 2013-10-24 18:40 |
Here you go.
|
msg201288 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2013-10-25 20:26 |
New changeset 12aaa2943791 by Tim Golden in branch 'default':
Issue13234 Allow listdir to handle extended paths on Windows (Patch by Santoso Wijaya)
http://hg.python.org/cpython/rev/12aaa2943791
New changeset 5c187d6162c5 by Tim Golden in branch 'default':
Issue13234 Credit Santoso for the patch and add NEWS item
http://hg.python.org/cpython/rev/5c187d6162c5
|
msg201290 - (view) |
Author: Tim Golden (tim.golden) * |
Date: 2013-10-25 20:46 |
Applied. Thanks for the patch.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:22 | admin | set | github: 57443 |
2013-10-25 20:46:23 | tim.golden | set | status: open -> closed resolution: fixed messages:
+ msg201290
stage: patch review -> resolved |
2013-10-25 20:26:54 | python-dev | set | nosy:
+ python-dev messages:
+ msg201288
|
2013-10-24 18:40:07 | santoso.wijaya | set | files:
+ issue13234_tip_refresh.patch
messages:
+ msg201161 |
2013-10-24 14:13:43 | tim.golden | set | assignee: tim.golden messages:
+ msg201140 versions:
- Python 2.7, Python 3.2 |
2013-03-21 01:08:28 | santoso.wijaya | set | messages:
+ msg184822 |
2013-02-06 13:21:27 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka
messages:
+ msg181517 versions:
+ Python 3.4 |
2011-10-25 20:51:35 | santoso.wijaya | set | messages:
+ msg146400 |
2011-10-25 20:19:53 | vstinner | set | messages:
+ msg146398 |
2011-10-25 18:42:10 | santoso.wijaya | set | files:
+ issue13234_py33_v4.patch
messages:
+ msg146391 |
2011-10-25 18:34:13 | santoso.wijaya | set | files:
+ issue13234_py33_v3.patch
messages:
+ msg146389 |
2011-10-25 18:33:40 | loewis | set | messages:
+ msg146388 |
2011-10-25 15:48:46 | santoso.wijaya | set | messages:
+ msg146375 |
2011-10-25 10:12:34 | mandel | set | messages:
+ msg146353 |
2011-10-25 09:47:24 | pitrou | set | messages:
+ msg146350 |
2011-10-25 08:54:36 | mandel | set | messages:
+ msg146346 |
2011-10-25 00:00:40 | santoso.wijaya | set | messages:
+ msg146341 |
2011-10-24 23:34:18 | santoso.wijaya | set | files:
+ issue13234_py33_v2.patch
messages:
+ msg146339 |
2011-10-24 23:33:14 | santoso.wijaya | set | files:
+ issue13234_py33.patch
messages:
+ msg146338 |
2011-10-22 00:03:07 | santoso.wijaya | set | type: behavior components:
+ Windows |
2011-10-22 00:02:56 | santoso.wijaya | set | nosy:
+ santoso.wijaya
|
2011-10-21 22:16:34 | terry.reedy | set | nosy:
+ loewis
|
2011-10-20 17:43:59 | pitrou | set | nosy:
+ vstinner
|
2011-10-20 17:43:49 | pitrou | set | nosy:
+ pitrou, tim.golden, brian.curtin stage: patch review
versions:
+ Python 3.2, Python 3.3 |
2011-10-20 16:54:45 | mandel | create | |