classification
Title: Buffer overflow when listing deeply nested directory
Type: behavior Stage: test needed
Components: Windows Versions: Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: arno-cs, loewis, serhiy.storchaka, sonderblade
Priority: low Keywords:

Created on 2007-08-17 11:24 by sonderblade, last changed 2012-11-07 17:50 by serhiy.storchaka.

Messages (11)
msg32647 - Author: Björn Lindqvist (sonderblade) Date: 2007-08-17 11:24
This code:

import os
import os.path
TARGET='C:/code/python/foo'
base = TARGET
for x in range(200):
    subdirs = os.listdir(base)
    base = os.path.join(base, subdirs[0])
    print base

Produces a TypeError (buffer overflow) when run on a directory that is nested too deeply for Windows to handle:

.. more output here..
C:code/python/foo\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.p
ng\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png
Traceback (most recent call last):
  File "killdir.py", line 6, in <module>
    subdirs = os.listdir(base)
TypeError: listdir() argument 1 must be (buffer overflow), not str
msg32648 - Author: Skip Montanaro (skip.montanaro) * (Python committer) Date: 2007-08-18 11:38
Worked as expected for me on Mac OS X 10.4.10 running from
the trunk (you didn't mention what version you were using).
In ~/tmp/deep I created a maximally nested directory tree from the shell like so:

    cd /Users/skip/tmp/deep
    for i in `range 1000` ; do
        x=`printf %04d $i`
        echo $x
        mkdir $x
        cd $x
    done

where the range command is analogous to Python's range
builtin:

    % range 20
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

The for loop barfed after making directory 0205.

In Python I then executed these statements:

    import os.path
    base = "/Users/skip/tmp/deep"
    for x in range(210):
        subdirs = os.listdir(base)
        base = os.path.join(base, subdirs[0])
        print base

This went until it got to dir 0200 where it raised an
OSError:

    [Errno 63] File name too long: '/Users/skip/tmp/deep/0000/0001/.../0199/0200'

which stands to reason since base was 1025 characters long
at that point.  MAXPATHLEN is defined to be 1024 on my
system, so the OSError is to be expected.

Skip
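
The 1025-character figure can be checked with a little arithmetic: the starting path is 20 characters, and each of the 201 directories 0000 through 0200 adds a separator plus four digits. A minimal sketch, assuming the directory names are exactly the zero-padded numbers produced by the shell loop above (the helper name nested_path is made up for illustration):

```python
def nested_path(root, depth):
    """Build root/0000/0001/... with `depth` zero-padded components."""
    parts = [root] + ["%04d" % i for i in range(depth)]
    return "/".join(parts)


path = nested_path("/Users/skip/tmp/deep", 201)  # components 0000 through 0200
print(len(path))  # 20 + 201 * 5 = 1025, one past a MAXPATHLEN of 1024
```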
msg32649 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-08-21 08:18
To rephrase Skip's comment: Can you please report what operating system and Python version you are using?
msg32650 - Author: Björn Lindqvist (sonderblade) Date: 2007-08-21 08:49
Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32
MS Windows XP, Version 5.1, SP2
msg32651 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-08-22 05:56
Can you please explain what specifically you consider a bug here?

I can see that the error message is confusing, so it could be improved. However, there is nothing we can do to make the error go away. The Microsoft C library simply does not support file names longer than MAX_PATH; you have to use Unicode file names to go beyond this limit.
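
One way to take the Unicode route on Windows is the extended-length path prefix \\?\, which the wide-character Win32 file APIs accept for paths longer than MAX_PATH. The sketch below only performs the string conversion. to_extended_path is a hypothetical helper, not part of the os module, and it assumes the input is already an absolute path given as a Unicode string:

```python
import ntpath

# Extended-length prefix understood by the wide-character Win32 file APIs.
PREFIX = u"\\\\?\\"


def to_extended_path(path):
    """Return an absolute Windows path in extended-length (\\\\?\\) form.

    Assumption: `path` is absolute and is a Unicode string.
    """
    if path.startswith(PREFIX):
        return path  # already in extended-length form
    # The prefix disables normalization by Windows, so normalize first
    # (forward slashes become backslashes, "." and ".." are resolved).
    path = ntpath.normpath(path)
    if path.startswith(u"\\\\"):
        # UNC share: \\server\share\dir -> \\?\UNC\server\share\dir
        return PREFIX + u"UNC\\" + path[2:]
    return PREFIX + path


print(to_extended_path(u"C:/code/python/foo"))
```

On Windows, the result could then be passed to os.listdir() as a Unicode argument. Whether that avoids the limit in practice depends on the Python and Windows versions in use.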
msg32652 - Author: Björn Lindqvist (sonderblade) Date: 2007-08-22 10:26
Yes, the problem is the error message and the exception. First, it shouldn't raise TypeError (which indicates a programming error); it should raise IOError, OSError or WindowsError. Second, the exception message is wacky: "listdir() argument 1 must be (buffer overflow), not str". I realize that it is probably impossible to detect this specific error condition, but I still want something more explanatory than the current message.
msg175102 - Author: Arno Bakker (arno-cs) Date: 2012-11-07 13:28
Can somebody please look at this bug? It still appears in SCons 2.2.0 on Windows 7 when it tries to do an os.listdir on:

C:\Program Files\Microsoft Visual Studio 9.0\VC\ATLMFC\INCLUDE;C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE;C:\Program Files\Microsoft SDKs\Windows\v6.0A\include;\build\libevent-2.0.20-stable-debug\include;\build\libevent-2.0.20-stable-debug\WIN32-Code;\build\gtest-1.4.0\include;
msg175104 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-07 14:01
Can you please report what Python version you are using?
msg175105 - Author: Arno Bakker (arno-cs) Date: 2012-11-07 14:03
This is on Python 2.7.3 on Win7 32-bit, sorry.
msg175108 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-07 16:43
This issue is related to parsing of the "et#" format, which is used only in listdir() and _getfullpathname() under Windows. PyArg_ParseTuple() raises TypeError for several kinds of conversion errors (in this case, an overflow of a static buffer). There are several ways to solve this issue:

1. Do nothing, close the issue as "won't fix".  This is just the wrong exception in a very rare case, only on 2.7 and only under Windows.  The issue will go away with 2.7.

2. Use a dynamic buffer under Windows, as on other platforms.  This will require not only dynamic memory allocation, but also reallocation for appending "\*.*".

3. Do not use PyArg_ParseTuple().  Parse the single argument manually.

4. If PyArg_ParseTuple() fails then check if the raised exception is TypeError and the error message matches "(buffer overflow)".  In this case raise the right exception.

5. Rewrite PyArg_ParseTuple() so that it raises an appropriate type of exception (this will have to be done anyway, but maybe later, in another issue).  In this case it will be OverflowError.  Then we can catch this error and raise the right exception.
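
Option 4 amounts to translating one specific TypeError into an OS-level error. The real change would be in the C implementation of listdir(); purely as an illustration, the same idea at the Python level might look like the sketch below. listdir_checked and its _listdir parameter are hypothetical names, and the choice of OSError with ENAMETOOLONG is an assumption, not something decided in this issue:

```python
import errno
import os


def listdir_checked(path, _listdir=os.listdir):
    """Call listdir, turning the "(buffer overflow)" TypeError into OSError.

    `_listdir` exists only so the behavior can be exercised without Windows.
    """
    try:
        return _listdir(path)
    except TypeError as exc:
        # Match the message text reported in this issue.
        if "(buffer overflow)" in str(exc):
            raise OSError(errno.ENAMETOOLONG, "File name too long", path)
        raise  # any other TypeError is a genuine programming error
```

Matching on the message text is fragile, which is one argument for option 5 instead.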

Martin, what is your decision?
msg175114 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-07 17:50
See also issue4071.
History
Date                 User              Action  Args
2012-11-07 17:50:46  serhiy.storchaka  set     messages: + msg175114
2012-11-07 16:43:13  serhiy.storchaka  set     messages: + msg175108
                                               versions: + Python 2.7, - Python 2.6, Python 3.1
2012-11-07 14:03:28  arno-cs           set     messages: + msg175105
2012-11-07 14:01:41  serhiy.storchaka  set     nosy: + serhiy.storchaka
                                               messages: + msg175104
2012-11-07 13:28:43  arno-cs           set     nosy: + arno-cs
                                               messages: + msg175102
2010-05-20 20:28:56  skip.montanaro    set     nosy: - skip.montanaro
2009-04-07 04:04:14  ajaksu2           set     priority: normal -> low
                                               stage: test needed
                                               type: behavior
                                               versions: + Python 2.6, Python 3.1
2007-08-17 11:24:22  sonderblade       create