classification
Title: os.getcwd() hardcodes max path len
Type: behavior Stage: needs patch
Components: Extension Modules Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: boya, giampaolo.rodola, haypo, loewis, pitrou, python-dev, santoso.wijaya, serhiy.storchaka, terry.reedy, worr, zach.ware
Priority: high Keywords: patch

Created on 2010-07-13 12:10 by pitrou, last changed 2015-04-24 22:26 by haypo. This issue is now closed.

Files
File name Uploaded Description
bigpath.py pitrou, 2010-07-13 12:10
os_getcwd_buffer-2.patch haypo, 2010-07-28 02:05
os_getcwd_maxpathlen.patch haypo, 2011-06-17 13:55
bigpath2.py haypo, 2011-06-17 14:00
max_getcwd.patch worr, 2015-04-24 17:10
Messages (25)
msg110177 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-13 12:10
In 2.x, os.getcwd() uses a dynamic allocation scheme to accommodate whatever buffer size is needed to represent the current path.

In 3.x, the max path length is hardcoded to 1026 bytes or characters, and an error is raised if the current path length is larger than that. Even on systems where MAX_PATH is 1024 (a common value), it is still valid to create paths larger than that (using e.g. os.mkdir()).

The attached script shows that os.getcwd() works with a 1032-character path in 2.x, but fails in 3.x.
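The attached bigpath.py is not reproduced in this thread. A minimal sketch of a reproduction along the same lines might look like the following. The function name probe_getcwd, the 1100-character target and the "aaaaab" component name are assumptions chosen for illustration, not taken from the attachment.

```python
import os
import shutil
import tempfile


def probe_getcwd(target_len=1100, component="aaaaab"):
    """Descend into nested directories until os.getcwd() returns a path of
    at least target_len characters.

    Returns (succeeded, length_reached). Hypothetical helper, not the
    attached bigpath.py.
    """
    start = os.getcwd()
    base = tempfile.mkdtemp()
    reached = 0
    try:
        os.chdir(base)
        reached = len(os.getcwd())
        while reached < target_len:
            # Relative names keep each individual call far below any limit.
            os.mkdir(component)
            os.chdir(component)
            # On an affected build this call is expected to raise OSError
            # once the path outgrows the hardcoded buffer.
            reached = len(os.getcwd())
        return True, reached
    except OSError:
        return False, reached
    finally:
        # Always return to the starting directory before cleaning up.
        os.chdir(start)
        shutil.rmtree(base, ignore_errors=True)


if __name__ == "__main__":
    ok, length = probe_getcwd()
    print("getcwd ok:" if ok else "getcwd failed after:", length)
```

The sketch only reports whether os.getcwd() kept working; it does not distinguish a Python-level limit from one imposed by the operating system.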
msg110180 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-13 12:29
> Even on systems where MAX_PATH is 1024 (a common value), it is still
> valid to create paths larger than that (using e.g. os.mkdir()).

It seems I am mistaken on that. MAX_PATH is actually 4096 on the Linux system I am testing on. Calling getcwd() in a path longer than that fails with ENAMETOOLONG.

Still, 1026 shouldn't be the hard coded max length.
msg110192 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-07-13 13:43
Just as a reminder: In 2.x, posix_getcwdu() also uses a buffer of size
1026.
msg110194 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-13 13:56
> Just as a reminder: In 2.x, posix_getcwdu() also uses a buffer of size
> 1026.

I suppose the implementation was simply copied into py3k, then.
Still, it's not a very good idea and it will also be a regression when
porting scripts from 2.x to 3.x.
msg110949 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-07-20 19:00
Closed issue 6817 as a duplicate of this one. There are some patches in
that issue.
msg111389 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-07-23 21:33
On WinXP, 3.1, I get
...
mkdir: 242
Traceback (most recent call last):
  File "C:\Programs\Python31\misc\t1.py", line 14, in <module>
    os.mkdir(s)
WindowsError: [Error 206] The filename or extension is too long: 'C:\\Programs\\Python31\\misc\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab'
msg111391 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-07-23 21:43
Terry J. Reedy <report@bugs.python.org> wrote:
> mkdir: 242
> Traceback (most recent call last):
>   File "C:\Programs\Python31\misc\t1.py", line 14, in <module>
>     os.mkdir(s)
> WindowsError: [Error 206] The filename or extension is too long: 'C:\\Programs\\Python31\\misc\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab'

On Windows MAX_PATH seems to be 260 characters:

http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx
msg111413 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-07-24 02:05
Patch based on Python 2 source code, but raises a MemoryError (instead of an OSError) on memory allocation failure.

With my patch, bigpath.py ends with "cwd: 1028 ...aab/aaaaab" with Python 3.2. Same result with Python 2.6. 1028 is bigger than 1026 (previous hardcoded max length in bytes including nul byte).
msg111414 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-07-24 02:08
I'm not sure that PyMem_Realloc(NULL, size) is always equivalent to PyMem_Malloc(size). And I don't really know why I'm using PyMem_* instead of malloc() / free() :-) I suppose that Python has a faster memory allocator, or that it has better checks when compiled with pydebug?
msg111737 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-27 22:20
It's not ok to call PyMem_* functions when the GIL is released. You should only release the GIL around the call to the system getcwd().

> I suppose that Python has a faster memory allocator, or that it has
> better checks when compiled with pydebug?

In this case it doesn't really make a difference, since all allocations larger than 256 bytes are delegated to the system allocator.
msg111764 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-07-28 02:05
New version of the patch avoiding PyMem_*() functions to avoid a possible race condition (issue with the GIL).
msg111844 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-07-28 20:48
Antoine asked me why I don't use a buffer of MAX_PATH+1 (instead of a dynamic buffer size). I don't know, I just copied/pasted the code from Python 2. Extract from the getcwd() manpage:

   Note that on some systems, PATH_MAX may not be a compile-time
   constant; furthermore, its value may depend  on  the file system,
   see pathconf(3).

Maybe it's to support unusual OSes like Hurd :-) (Hurd has no hardcoded limits).

Most of the time, the first realloc() should be enough.
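The grow-and-retry strategy discussed here can be sketched in Python through ctypes, as a rough illustration only. It assumes a POSIX system whose C library exports getcwd() and reports ERANGE when the buffer is too small; the function name getcwd_dynamic and the doubling policy are assumptions, and this is not the code of any attached patch.

```python
import ctypes
import errno
import os


def getcwd_dynamic(initial_size=1024):
    """Call the C library's getcwd() with a buffer that doubles on ERANGE.

    Illustrative sketch for POSIX systems only.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.getcwd.restype = ctypes.c_void_p
    libc.getcwd.argtypes = [ctypes.c_char_p, ctypes.c_size_t]

    size = initial_size
    while True:
        buf = ctypes.create_string_buffer(size)
        if libc.getcwd(buf, size):
            # Success: the buffer holds a NUL-terminated path.
            return os.fsdecode(buf.value)
        err = ctypes.get_errno()
        if err != errno.ERANGE:
            # Any other error (ENAMETOOLONG, ENOENT, ...) is a real failure.
            raise OSError(err, os.strerror(err))
        # Buffer too small: grow and retry.
        size *= 2


if __name__ == "__main__":
    print(getcwd_dynamic(initial_size=4))
```

With a typical initial size the loop body runs once, which matches the remark that the first allocation is usually enough.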
msg111848 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-07-28 21:00
For 2.x, unlimited path lengths were apparently introduced in issue 2722.
This strategy does not work on Solaris and OpenBSD (issue 9185).

FreeBSD also seems to support arbitrarily long paths. I would be somewhat
surprised though if anyone used them in practice. APUE (second edition)
uses PATH_MAX if it's available in limits.h.
msg138508 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-17 13:55
Simpler patch replacing the 1026 constant with MAXPATHLEN. On my Linux box, MAXPATHLEN is 4096 and os.pathconf('/', 'PC_PATH_MAX') returns 4096. I am able to get a path of 4095 bytes using the patch.
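For reference, the limit mentioned above can be queried at run time rather than taken from a compile-time constant. A small sketch, where the helper name path_max is an assumption and os.pathconf is only available on Unix:

```python
import os


def path_max(path="/"):
    """Return the PC_PATH_MAX limit for the file system containing path,
    or None if the platform cannot report it (hypothetical helper)."""
    try:
        return os.pathconf(path, "PC_PATH_MAX")
    except (AttributeError, ValueError, OSError):
        # AttributeError: os.pathconf is missing (e.g. on Windows).
        # ValueError: the configuration name is unknown on this platform.
        # OSError: the underlying pathconf() call failed.
        return None


if __name__ == "__main__":
    print("PC_PATH_MAX for /:", path_max("/"))
```

The value may differ between file systems, which is one reason the getcwd() manpage quoted earlier warns against treating PATH_MAX as a constant.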
msg138509 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-17 13:58
You may use get_current_dir_name() which allocates the memory for us.

I can adapt os_getcwd_buffer-2.patch to support Solaris/OpenBSD, but do we need a dynamic buffer? (Do we need to support OSes without PATH_MAX?)
msg138510 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-17 14:00
bigpath2.py: script to check the maximum path length of os.getcwd().
msg138513 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-06-17 15:04
> I can adapt os_getcwd_buffer-2.patch to support Solaris/OpenBSD, but
> do we need a dynamic buffer? (do we need to support OS without
> PATH_MAX)

From a practicality point of view, we need to make no change at all:
nobody sane ever has a current working directory path of more than
1000 characters. Even if people have very long path names, they
don't make them the current working directory.

So if anything is changed, it's for purity only. Then, for purity,
we should get it right and support any path that the operating system
supports.
msg138557 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-06-17 22:57
> From a practicality point of view, we need to make no change at all:
> nobody sane ever has a current working directory path of more than
> 1000 characters. Even if people have very long path names, they
> don't make them the current working directory.

I don't see why that wouldn't be the case. They probably don't change to these directories *by hand*, but they can have scripts that cd into such a directory before doing stuff inside it. And these scripts can be written in Python.
msg185235 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-03-25 21:29
For the record, os.getcwd() of Python 2 was improved by 96adf96d861a (issue #2722) to use a dynamic buffer.
msg185238 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-03-25 21:31
test_posix.py of Python 3 contains the test, but the test is "disabled" (dead code):

def test_getcwd_long_pathnames(self):
    if hasattr(posix, 'getcwd'):
        dirname = 'getcwd-test-directory-0123456789abcdef-01234567890abcdef'
        curdir = os.getcwd()
        base_path = os.path.abspath(support.TESTFN) + '.getcwd'

        try:
            os.mkdir(base_path)
            os.chdir(base_path)
        except:
#           Just returning nothing instead of the SkipTest exception,
#           because the test results in Error in that case.
#           Is that ok?
#            raise unittest.SkipTest("cannot create directory for testing")
            return

            def _create_and_do_getcwd(dirname, current_path_length = 0):
                try:
                    os.mkdir(dirname)
                except:
                    raise unittest.SkipTest("mkdir cannot create directory sufficiently deep for getcwd test")

                os.chdir(dirname)
                try:
                    os.getcwd()
                    if current_path_length < 1027:
                        _create_and_do_getcwd(dirname, current_path_length + len(dirname) + 1)
                finally:
                    os.chdir('..')
                    os.rmdir(dirname)

            _create_and_do_getcwd(dirname)

        finally:
            os.chdir(curdir)
            support.rmtree(base_path)

See the issue #17516 for removal of dead code.
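In the quoted test, everything after the `return` inside the `except` block is unreachable, so os.getcwd() is never exercised on a long path. A live version might be restructured roughly as follows. This is only a sketch, not the change made in issue #17516: it uses tempfile instead of support.TESTFN so that it runs outside the CPython test suite, and it replaces the recursion with a loop.

```python
import os
import shutil
import tempfile
import unittest


class GetcwdLongPathTest(unittest.TestCase):
    def test_getcwd_long_pathnames(self):
        dirname = 'getcwd-test-directory-0123456789abcdef-01234567890abcdef'
        curdir = os.getcwd()
        base_path = tempfile.mkdtemp(suffix='.getcwd')
        # Cleanups run in reverse order: chdir back first, then remove.
        self.addCleanup(shutil.rmtree, base_path, ignore_errors=True)
        self.addCleanup(os.chdir, curdir)

        os.chdir(base_path)
        length = len(os.getcwd())
        while length < 1027:
            try:
                os.mkdir(dirname)
                os.chdir(dirname)
            except OSError:
                self.skipTest("cannot create a directory deep enough "
                              "for the getcwd test")
            # The call under test: it must keep working as the path grows.
            length = len(os.getcwd())
        self.assertGreaterEqual(length, 1027)


if __name__ == "__main__":
    unittest.main()
```

An OSError from os.getcwd() itself is deliberately not caught, so a build with the hardcoded limit would report an error rather than a skip.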
msg240987 - (view) Author: William Orr (worr) * Date: 2015-04-14 18:54
Revisiting this, I've updated python3 to calculate this and use gradual dynamic allocation like the python2 implementation.
msg241695 - (view) Author: William Orr (worr) * Date: 2015-04-21 02:38
I've incorporated some of the feedback from the reviews into this new patch. I used the PyMem_Raw* functions to do allocation to avoid having to acquire the GIL and also avoid complications from the builtin memory allocator, since I'm not using Python objects.

I have also fixed a memory leak in my original patch, as well as a case where OSes with a small MAX_PATH fail with ENAMETOOLONG.
msg241959 - (view) Author: William Orr (worr) * Date: 2015-04-24 17:10
I've updated the patch with the comments from the review
msg241984 - (view) Author: Roundup Robot (python-dev) Date: 2015-04-24 22:23
New changeset abf1f3ae4fa8 by Victor Stinner in branch '3.4':
Issue #9246: On POSIX, os.getcwd() now supports paths longer than 1025 bytes
https://hg.python.org/cpython/rev/abf1f3ae4fa8

New changeset b871ace5c58f by Victor Stinner in branch 'default':
(Merge 3.4) Issue #9246: On POSIX, os.getcwd() now supports paths longer than
https://hg.python.org/cpython/rev/b871ace5c58f
msg241985 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2015-04-24 22:26
> I've updated the patch with the comments from the review

Thanks William for your contribution, I committed your fix.

I just made a minor change to "if (cwd && use_bytes) {": you forgot to remove the now-useless test on cwd, and I dropped the { and } to make the code shorter and more readable (note: PEP 7 requires putting "}" and "else {" on two lines).

Ok, I just took 5 years to get Python 2 features in Python 3 :-)
History
Date User Action Args
2015-04-24 22:26:40 haypo set status: open -> closed; resolution: fixed; messages: + msg241985
2015-04-24 22:23:57 python-dev set nosy: + python-dev; messages: + msg241984
2015-04-24 17:10:35 worr set files: + max_getcwd.patch; messages: + msg241959
2015-04-24 17:04:50 worr set files: - max_getcwd.patch
2015-04-24 17:04:47 worr set files: - max_getcwd.patch
2015-04-21 02:38:15 worr set files: + max_getcwd.patch; messages: + msg241695
2015-04-14 18:54:49 worr set files: + max_getcwd.patch; versions: + Python 3.5, - Python 2.7, Python 3.2, Python 3.3; nosy: + worr; messages: + msg240987
2014-10-14 16:22:33 skrah set nosy: - skrah
2013-11-20 15:39:52 zach.ware set nosy: + zach.ware
2013-03-25 21:31:19 haypo set messages: + msg185238
2013-03-25 21:29:31 haypo set messages: + msg185235
2012-03-31 19:28:14 serhiy.storchaka set nosy: + serhiy.storchaka
2011-06-18 18:52:15 giampaolo.rodola set nosy: + giampaolo.rodola
2011-06-17 22:57:55 pitrou set messages: + msg138557
2011-06-17 18:55:00 santoso.wijaya set nosy: + santoso.wijaya
2011-06-17 15:04:05 loewis set messages: + msg138513
2011-06-17 14:00:06 haypo set files: + bigpath2.py; messages: + msg138510
2011-06-17 13:58:43 haypo set messages: + msg138509
2011-06-17 13:55:17 haypo set files: + os_getcwd_maxpathlen.patch; messages: + msg138508
2011-06-12 18:39:04 terry.reedy set versions: + Python 3.3, - Python 3.1
2010-07-28 21:00:49 skrah set messages: + msg111848
2010-07-28 20:48:10 haypo set messages: + msg111844
2010-07-28 02:05:26 haypo set files: - os_getcwd_buffer.patch
2010-07-28 02:05:18 haypo set files: + os_getcwd_buffer-2.patch; messages: + msg111764
2010-07-27 22:20:38 pitrou set messages: + msg111737
2010-07-24 02:08:59 haypo set messages: + msg111414
2010-07-24 02:05:54 haypo set files: + os_getcwd_buffer.patch; keywords: + patch; messages: + msg111413
2010-07-23 21:43:39 skrah set messages: + msg111391
2010-07-23 21:33:01 terry.reedy set nosy: + terry.reedy; messages: + msg111389
2010-07-20 19:00:33 skrah set nosy: + boya; messages: + msg110949
2010-07-20 18:58:57 skrah link issue6817 superseder
2010-07-13 14:08:04 r.david.murray set priority: normal -> high
2010-07-13 13:56:52 pitrou set messages: + msg110194
2010-07-13 13:43:52 skrah set messages: + msg110192; versions: + Python 2.7
2010-07-13 12:29:28 pitrou set messages: + msg110180
2010-07-13 12:10:58 pitrou create