classification
Title: os.listdir breaks with literal paths
Type: behavior Stage: resolved
Components: Library (Lib), Windows Versions: Python 3.3, Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: tim.golden Nosy List: brian.curtin, loewis, mandel, pitrou, python-dev, santoso.wijaya, serhiy.storchaka, tim.golden, vstinner
Priority: normal Keywords: patch

Created on 2011-10-20 16:54 by mandel, last changed 2013-10-25 20:46 by tim.golden. This issue is now closed.

Files
File name Uploaded Description Edit
listdir.patch mandel, 2011-10-20 16:54 patch that ensures that the correct os.sep is used when calling FindFirstFile review
issue13234_py33.patch santoso.wijaya, 2011-10-24 23:33 review
issue13234_py33_v2.patch santoso.wijaya, 2011-10-24 23:34 review
issue13234_py33_v3.patch santoso.wijaya, 2011-10-25 18:34 review
issue13234_py33_v4.patch santoso.wijaya, 2011-10-25 18:42 Simplified patch review
issue13234_tip_refresh.patch santoso.wijaya, 2013-10-24 18:40 tip refresh review
Messages (19)
msg146031 - (view) Author: Manuel de la Pena (mandel) Date: 2011-10-20 16:54
During the development of an application that needed to write paths longer than 260 chars we opted to use \\?\ as per http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#maxpath.

When working with literal paths the following the os.listdir funtion would return the following trace:

>>> import os
>>> test = r'\\?\C:\Python27'
>>> os.listdir(test)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: '\\\\?\\C:\\Python27/*.*'

The reason for this is that the implementation of listdir appends '/' at the end of the path if os.path.sep is not present at the end of it which FindFirstFile does not like. This is a inconsistency from the OS but it can be easily fixed (see attached patch).
msg146338 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2011-10-24 23:33
I'd also like to point out that Unicode path is handled correctly in both 2.7.x
and 3.x:

    Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win
    32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> os.listdir(u'\\\\?\\D:\\Temp\\tempdir')
    [u'sub1', u'test1.txt']
    >>> os.listdir('\\\\?\\D:\\Temp\\tempdir')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    WindowsError: [Error 123] The filename, directory name, or volume label syntax i
    s incorrect: '\\\\?\\D:\\Temp\\tempdir/*.*'

    Python 3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit (AMD64)] on win
    32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> os.listdir('\\\\?\\D:\\Temp\\tempdir')
    ['sub1', 'test1.txt']
    >>> os.listdir(b'\\\\?\\D:\\Temp\\tempdir')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    WindowsError: [Error 123] The filename, directory name, or volume label syntax i
    s incorrect: '\\\\?\\D:\\Temp\\tempdir/*.*'

The problem only lies in the code handling narrow string paths. If you look at
(py33) posixmodule.c:2545-6, the bare path is appended with L"\\*.*". Whereas
in the narrow string case, posixmodule.c:2625-6, the bare path is appended with
"/*.*", instead.

To be consistent, we should use '\\'.
msg146339 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2011-10-24 23:34
There are also several other edge cases to be taken care of:

    Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win
    32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> os.listdir(r'\\?\C:\Python27/')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    WindowsError: [Error 123] The filename, directory name, or volume label syntax i
    s incorrect: '\\\\?\\C:\\Python27/*.*'
    >>> os.listdir(r'\\?\C:/Python27\Lib')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    WindowsError: [Error 3] The system cannot find the path specified: '\\\\?\\C:/Py
    thon27\\Lib/*.*'
msg146341 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2011-10-25 00:00
Additionally, there might be issues in other APIs when handling with extended path lengths:

D:\Temp\tempdir>dir
 Volume in drive D is Data
 Volume Serial Number is 7E3D-EC81

 Directory of D:\Temp\tempdir

10/24/2011  04:22 PM    <DIR>          .
10/24/2011  04:22 PM    <DIR>          ..
10/24/2011  04:28 PM    <DIR>          AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
10/24/2011  01:31 PM    <DIR>          sub1
10/24/2011  03:39 PM               262 test1.txt
               1 File(s)            262 bytes
               4 Dir(s)  244,483,321,856 bytes free

D:\Temp\tempdir>cd AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAA

D:\Temp\tempdir\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAA>dir
 Volume in drive D is Data
 Volume Serial Number is 7E3D-EC81

 Directory of D:\Temp\tempdir\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAA

10/24/2011  04:28 PM    <DIR>          .
10/24/2011  04:28 PM    <DIR>          ..
10/24/2011  04:14 PM                 0 1234567.txt
10/24/2011  04:28 PM    <DIR>          BBBBBBBBBBBBB
               1 File(s)              0 bytes
               3 Dir(s)  244,483,321,856 bytes free

Python 3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit (AMD64)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> subdir = 'B'*13
>>> os.path.isdir(subdir)
False
>>> os.getcwd()
'D:\\Temp\\tempdir\\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAA'
>>> subdir_abs = os.path.join(os.getcwd(), subdir)
>>> os.path.isdir(subdir)
False
>>> subdir_ext = r'\\?\%s' % subdir_abs
>>> os.path.isdir(subdir_ext)
True

In the above example, perhaps a ValueError('path too long') is better than returning False?
msg146346 - (view) Author: Manuel de la Pena (mandel) Date: 2011-10-25 08:54
Indeed, in our code we had to write a number of wrappers around the os calls to be able to work with long path on Windows. At the moment working with long paths on windows and python is broken in a number of places and is a PITA to work with.
msg146350 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-25 09:47
Thanks for the patch. Is there a reason you don't use shutil.rmtree in tearDown()?
I don't know if it's our business to convert forward slashes passed by the user. Also, general support for extended paths is another can of worms ;)
msg146353 - (view) Author: Manuel de la Pena (mandel) Date: 2011-10-25 10:12
In case of my patch (I don't know about santa4nt case) I did not use shutil.remove because it was not used in the other tests and I wanted to be consistent and not add a new import. Certainly if there is not an issue with that we should use it.
msg146375 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2011-10-25 15:48
Even if we decide not to convert any forward slash, listdir() adds u"\\*.*" when the input is unicode, but it adds "/*.*" when it is not, before passing it off to Windows API. Hence the inconsistency and the problem Manuel saw.

IMO, his patch shouldn't have differentiated if the path starts with r"\\?\" and just be consistent with adding "\\*.*", unicode or not.
msg146388 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-10-25 18:33
This issue is getting messy. I declare that this issue is *only* about the original problem reported in msg146031. When that is fixed, this issue will be closed, and any further issues need to be reported separately.

As for the original problem, ISTM that the right fix is to replace

    namebuf[len++] = '/';

with

    namebuf[len++] = '\\';

No further change should be necessary.
msg146389 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2011-10-25 18:34
Addressing patch comments.
msg146391 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2011-10-25 18:42
Fair enough. Simplifying.
msg146398 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-10-25 20:19
-        if (ch != SEP && ch != ALTSEP && ch != ':')
+        if (ch != '\\' && ch != '/' && ch != ':')

I don't understand this change in issue13234_py33_v4.patch (the change looks to be useless).
msg146400 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2011-10-25 20:51
> I don't understand this change in issue13234_py33_v4.patch (the change looks to be useless).

It's pedantic correctness on my part. SEP and ALTSEP are defined as wide strings L'\\' and L'/' respectively. Their usage in the unicode conditional branch and the bytes conditional branch seem to have been reversed.
msg181517 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-02-06 13:21
I added minor comments in Rietveld.

Santoso Wijaya, can you please submit a contributor form?

http://python.org/psf/contrib/contrib-form/
http://python.org/psf/contrib/
msg184822 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2013-03-21 01:08
Done.
msg201140 - (view) Author: Tim Golden (tim.golden) * (Python committer) Date: 2013-10-24 14:13
Santoso Wijaya: sorry for the delay. If you'd like to retarget your patch against the tip, I'm happy to apply. At this stage, 3.3 and 3.4 seem the appropriate branches.
msg201161 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2013-10-24 18:40
Here you go.
msg201288 - (view) Author: Roundup Robot (python-dev) Date: 2013-10-25 20:26
New changeset 12aaa2943791 by Tim Golden in branch 'default':
Issue13234 Allow listdir to handle extended paths on Windows (Patch by Santoso Wijaya)
http://hg.python.org/cpython/rev/12aaa2943791

New changeset 5c187d6162c5 by Tim Golden in branch 'default':
Issue13234 Credit Santoso for the patch and add NEWS item
http://hg.python.org/cpython/rev/5c187d6162c5
msg201290 - (view) Author: Tim Golden (tim.golden) * (Python committer) Date: 2013-10-25 20:46
Applied. Thanks for the patch.
History
Date User Action Args
2013-10-25 20:46:23tim.goldensetstatus: open -> closed
resolution: fixed
messages: + msg201290

stage: patch review -> resolved
2013-10-25 20:26:54python-devsetnosy: + python-dev
messages: + msg201288
2013-10-24 18:40:07santoso.wijayasetfiles: + issue13234_tip_refresh.patch

messages: + msg201161
2013-10-24 14:13:43tim.goldensetassignee: tim.golden
messages: + msg201140
versions: - Python 2.7, Python 3.2
2013-03-21 01:08:28santoso.wijayasetmessages: + msg184822
2013-02-06 13:21:27serhiy.storchakasetnosy: + serhiy.storchaka

messages: + msg181517
versions: + Python 3.4
2011-10-25 20:51:35santoso.wijayasetmessages: + msg146400
2011-10-25 20:19:53vstinnersetmessages: + msg146398
2011-10-25 18:42:10santoso.wijayasetfiles: + issue13234_py33_v4.patch

messages: + msg146391
2011-10-25 18:34:13santoso.wijayasetfiles: + issue13234_py33_v3.patch

messages: + msg146389
2011-10-25 18:33:40loewissetmessages: + msg146388
2011-10-25 15:48:46santoso.wijayasetmessages: + msg146375
2011-10-25 10:12:34mandelsetmessages: + msg146353
2011-10-25 09:47:24pitrousetmessages: + msg146350
2011-10-25 08:54:36mandelsetmessages: + msg146346
2011-10-25 00:00:40santoso.wijayasetmessages: + msg146341
2011-10-24 23:34:18santoso.wijayasetfiles: + issue13234_py33_v2.patch

messages: + msg146339
2011-10-24 23:33:14santoso.wijayasetfiles: + issue13234_py33.patch

messages: + msg146338
2011-10-22 00:03:07santoso.wijayasettype: behavior
components: + Windows
2011-10-22 00:02:56santoso.wijayasetnosy: + santoso.wijaya
2011-10-21 22:16:34terry.reedysetnosy: + loewis
2011-10-20 17:43:59pitrousetnosy: + vstinner
2011-10-20 17:43:49pitrousetnosy: + pitrou, tim.golden, brian.curtin
stage: patch review

versions: + Python 3.2, Python 3.3
2011-10-20 16:54:45mandelcreate