Title: Buffer overflow when listing deeply nested directory
Type: behavior
Stage: resolved
Components: Windows
Versions: Python 2.7
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: ZackerySpytz, arno-cs, benjamin.peterson, eryksun, serhiy.storchaka, sonderblade
Priority: low
Keywords:

Created on 2007-08-17 11:24 by sonderblade, last changed 2020-05-14 21:21 by benjamin.peterson. This issue is now closed.

Messages (15)
msg32647 - (view) Author: Björn Lindqvist (sonderblade) Date: 2007-08-17 11:24
This code:

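import os
import os.path

# TARGET is a placeholder for the root of the deeply nested directory tree.
base = TARGET
for x in range(200):
    # Descend into the first entry of the current directory.
    subdirs = os.listdir(base)
    base = os.path.join(base, subdirs[0])
    print base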

Produces a TypeError (buffer overflow) when run on a directory that is nested too deeply for Windows to handle:

.. more output here..
C:code/python/foo\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.p
ng\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png\foo bar.png
Traceback (most recent call last):
  File "", line 6, in <module>
    subdirs = os.listdir(base)
TypeError: listdir() argument 1 must be (buffer overflow), not str
msg32648 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2007-08-18 11:38
Worked as expected for me on Mac OS X 10.4.10 running from
the trunk (you didn't mention what version you were using).
In ~/tmp/deep I created a maximally nested directory tree from the shell like so:

    cd /Users/skip/tmp/deep
    for i in `range 1000` ; do
        x=`printf %04d $i`
        echo $x
        mkdir $x
        cd $x
    done

where the range command is analogous to Python's range

    % range 20
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

The for loop barfed after making directory 0205.

In Python I then executed these statements:

    import os.path
    base = "/Users/skip/tmp/deep"
    for x in range(210):
        subdirs = os.listdir(base)
        base = os.path.join(base, subdirs[0])
        print base

This went until it got to dir 0200 where it raised an

    [Errno 63] File name too long: '/Users/skip/tmp/deep/0000/0001/.../0199/0200'

which stands to reason since base was 1025 characters long
at that point.  MAXPATHLEN is defined to be 1024 on my
system, so the OSError is to be expected.
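
The behaviour described above can be reproduced with a small helper that reports the error instead of letting it propagate. This is only a sketch: the name descend and its return convention are made up for illustration, and it sorts the directory entries so that the result is deterministic.

```python
import os


def descend(base, max_depth=200):
    """Follow the first entry of each directory, at most max_depth levels.

    Returns (deepest_path, error), where error is the OSError that stopped
    the walk, or None if the walk ended without an error.
    """
    for _ in range(max_depth):
        try:
            entries = sorted(os.listdir(base))
        except OSError as exc:
            # A path longer than the platform limit is expected to end up
            # here (for example ENAMETOOLONG, as in the Mac OS X run above).
            return base, exc
        if not entries:
            # Reached an empty directory: nothing left to descend into.
            return base, None
        base = os.path.join(base, entries[0])
    return base, None
```

On a platform that reports the limit as an OSError, the second element of the result carries the errno. The Python 2 TypeError reported for Windows in this issue is not an OSError and would still propagate.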

msg32649 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-08-21 08:18
To rephrase Skip's comment: Can you please report what operating system and Python version you are using?
msg32650 - (view) Author: Björn Lindqvist (sonderblade) Date: 2007-08-21 08:49
Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32
MS Windows XP, Version 5.1, SP2
msg32651 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-08-22 05:56
Can you please explain what specifically you consider a bug here?

I can see that the error message is confusing, so it could be improved. However, there is nothing we can do to make the error go away. The Microsoft C library simply does not support file names longer than MAX_PATH; you have to use Unicode file names to go beyond this limit.
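
As a rough illustration of the Unicode route, the sketch below builds the extended-length form of an absolute Windows path. The helper name to_extended_path is hypothetical and not part of Python. It uses ntpath directly so that it can be tried on any platform, and it assumes the input is a Unicode string holding an absolute path.

```python
import ntpath

# Prefix that marks an extended-length Windows path.
EXTENDED_PREFIX = u"\\\\?\\"


def to_extended_path(path):
    """Return an absolute Windows path in extended-length form (illustrative)."""
    if path.startswith(EXTENDED_PREFIX):
        # Already in extended-length form; leave it untouched.
        return path
    if not ntpath.isabs(path):
        raise ValueError("extended-length paths must be absolute")
    # The prefix disables normalization by Windows, so convert forward
    # slashes and collapse redundant separators first.
    path = ntpath.normpath(path)
    if path.startswith(u"\\\\"):
        # UNC path: drop the leading two backslashes and add UNC\ instead.
        return EXTENDED_PREFIX + u"UNC\\" + path[2:]
    return EXTENDED_PREFIX + path
```

On Windows the result could then be passed to os.listdir() as a Unicode string. Whether that call succeeds for a given path still depends on the Windows version and the Python build in use.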
msg32652 - (view) Author: Björn Lindqvist (sonderblade) Date: 2007-08-22 10:26
Yes, it is the error message and the exception that are the problem. First, it shouldn't raise TypeError (which indicates a programming error); it should raise IOError, OSError or WindowsError. Second, the exception message is wacky: "listdir() argument 1 must be (buffer overflow), not str". I realize that it is probably impossible to detect this specific error condition, but I still want something more explanatory than the current message.
msg175102 - (view) Author: Arno Bakker (arno-cs) Date: 2012-11-07 13:28
Can somebody please look at this bug? It still appears in SCons 2.2.0 on Windows 7 when it tries to call os.listdir on

C:\Program Files\Microsoft Visual Studio 9.0\VC\ATLMFC\INCLUDE;C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE;C:\Program Files\Microsoft SDKs\Windows\v6.0A\include;\build\libevent-2.0.20-stable-debug\include;\build\libevent-2.0.20-stable-debug\WIN32-Code;\build\gtest-1.4.0\include;
msg175104 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-07 14:01
Can you please report what Python version you are using?
msg175105 - (view) Author: Arno Bakker (arno-cs) Date: 2012-11-07 14:03
This is on Python 2.7.3 on Win7 32-bit, sorry.
msg175108 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-07 16:43
This issue is related to the parsing of the "et#" format, which is used only in listdir() and _getfullpathname() under Windows. PyArg_ParseTuple() raises TypeError for several kinds of conversion error (in this case, an overflow of a static buffer). There are several ways to solve this issue:

1. Do nothing, close the issue as "wont fix".  This is just the wrong exception in a very rare case only on 2.7 and only under Windows.  The issue will go away with 2.7.

2. Use a dynamic buffer under Windows, as on other platforms.  This will require not only dynamic memory allocation, but also reallocation for appending "\*.*".

3. Do not use PyArg_ParseTuple().  Parse the singular argument manually.

4. If PyArg_ParseTuple() fails then check if the raised exception is TypeError and the error message matches "(buffer overflow)".  In this case raise the right exception.

5. Rewrite PyArg_ParseTuple() so that it raises an appropriate type of exception (this will have to be done anyway, but maybe later, in another issue).  In this case it will be OverflowError.  Then we can catch this error and raise the right exception.

Martin, what is your decision?
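
For illustration only, option 4 can be approximated at the Python level. The options above concern the C code in posixmodule, so this is an analogy rather than the proposed patch. The wrapper name listdir_checked and its _listdir parameter are hypothetical, and the choice of OSError with ENAMETOOLONG is an assumption.

```python
import errno
import os


def listdir_checked(path, _listdir=os.listdir):
    """Call listdir and translate the misleading TypeError into an OSError."""
    try:
        return _listdir(path)
    except TypeError as exc:
        # Only the specific "(buffer overflow)" message is translated;
        # every other TypeError is a genuine argument error.
        if "(buffer overflow)" in str(exc):
            raise OSError(errno.ENAMETOOLONG, "File name too long", path)
        raise
```

Matching on the message text is fragile, which is one reason option 5 (a distinct exception type from PyArg_ParseTuple()) would be the cleaner basis for such a translation.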
msg175114 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-07 17:50
See also issue4071.
msg224324 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014-07-30 16:42
I suggest we close this as "won't fix" since I don't see how we can justify spending time working around a known limitation of Windows.
msg340965 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-04-27 04:44
Benjamin, which of the proposed options do you prefer?
msg340998 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-04-27 16:26
In Windows 7, FindFirstFileA uses a per-thread static buffer to decode the input bytes path to Unicode. This buffer limits the length to 259 characters (MAX_PATH - 1), even if a "\\?\" device path is used. Windows 8+ uses a dynamic buffer, but I don't see the point of switching to a dynamic buffer on our side given that Windows 7 is still so widely used and the documentation still requires Unicode for long "\\?\" paths.

Ideally, I think 2.7 should raise the same exception as 3.5 does in this case [1]. For example:

    >>> os.listdir(long_bytes_path)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: listdir: path too long for Windows
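
A length check of that kind might look roughly like the sketch below. The helper name check_bytes_path is hypothetical, the limit is taken from the 259-character figure mentioned above, and the message is copied from the traceback; this is not the actual CPython implementation.

```python
MAX_PATH = 260  # Windows constant; the usable length is MAX_PATH - 1


def check_bytes_path(path):
    """Reject bytes paths that exceed the 259-character limit (illustrative)."""
    if isinstance(path, bytes) and len(path) > MAX_PATH - 1:
        raise ValueError("listdir: path too long for Windows")
    # Unicode paths are passed through without a length check.
    return path
```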

msg368868 - (view) Author: Zackery Spytz (ZackerySpytz) * (Python triager) Date: 2020-05-14 21:18
Python 2 is EOL.
Date                 User               Action  Args
2020-05-14 21:21:04  benjamin.peterson  set     status: open -> closed; resolution: wont fix; stage: test needed -> resolved
2020-05-14 21:18:57  ZackerySpytz       set     nosy: + ZackerySpytz; messages: + msg368868
2019-04-27 16:26:25  eryksun            set     nosy: + eryksun; messages: + msg340998
2019-04-27 04:44:50  serhiy.storchaka   set     nosy: + benjamin.peterson; messages: + msg340965
2019-04-26 20:14:07  BreamoreBoy        set     nosy: - BreamoreBoy
2014-07-30 16:55:20  loewis             set     nosy: - loewis
2014-07-30 16:42:08  BreamoreBoy        set     nosy: + BreamoreBoy; messages: + msg224324
2012-11-07 17:50:46  serhiy.storchaka   set     messages: + msg175114
2012-11-07 16:43:13  serhiy.storchaka   set     messages: + msg175108; versions: + Python 2.7, - Python 2.6, Python 3.1
2012-11-07 14:03:28  arno-cs            set     messages: + msg175105
2012-11-07 14:01:41  serhiy.storchaka   set     nosy: + serhiy.storchaka; messages: + msg175104
2012-11-07 13:28:43  arno-cs            set     nosy: + arno-cs; messages: + msg175102
2010-05-20 20:28:56  skip.montanaro     set     nosy: - skip.montanaro
2009-04-07 04:04:14  ajaksu2            set     priority: normal -> low; stage: test needed; type: behavior; versions: + Python 2.6, Python 3.1
2007-08-17 11:24:22  sonderblade        create