Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os.getcwd() hardcodes max path len #53492

Closed
pitrou opened this issue Jul 13, 2010 · 25 comments
Closed

os.getcwd() hardcodes max path len #53492

pitrou opened this issue Jul 13, 2010 · 25 comments
Labels
extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error

Comments

@pitrou
Copy link
Member

pitrou commented Jul 13, 2010

BPO 9246
Nosy @loewis, @terryjreedy, @pitrou, @vstinner, @giampaolo, @zware, @serhiy-storchaka, @worr
Files
  • bigpath.py
  • os_getcwd_buffer-2.patch
  • os_getcwd_maxpathlen.patch
  • bigpath2.py
  • max_getcwd.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2015-04-24.22:26:40.134>
    created_at = <Date 2010-07-13.12:10:57.992>
    labels = ['extension-modules', 'type-bug']
    title = 'os.getcwd() hardcodes max path len'
    updated_at = <Date 2015-04-24.22:26:40.133>
    user = 'https://github.com/pitrou'

    bugs.python.org fields:

    activity = <Date 2015-04-24.22:26:40.133>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-04-24.22:26:40.134>
    closer = 'vstinner'
    components = ['Extension Modules']
    creation = <Date 2010-07-13.12:10:57.992>
    creator = 'pitrou'
    dependencies = []
    files = ['17983', '18225', '22394', '22395', '39199']
    hgrepos = []
    issue_num = 9246
    keywords = ['patch']
    message_count = 25.0
    messages = ['110177', '110180', '110192', '110194', '110949', '111389', '111391', '111413', '111414', '111737', '111764', '111844', '111848', '138508', '138509', '138510', '138513', '138557', '185235', '185238', '240987', '241695', '241959', '241984', '241985']
    nosy_count = 11.0
    nosy_names = ['loewis', 'terry.reedy', 'pitrou', 'vstinner', 'giampaolo.rodola', 'boya', 'santoso.wijaya', 'python-dev', 'zach.ware', 'serhiy.storchaka', 'worr']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = 'needs patch'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue9246'
    versions = ['Python 3.5']

    @pitrou
    Copy link
    Member Author

    pitrou commented Jul 13, 2010

    In 2.x, os.getcwd() uses a dynamic allocation scheme to accomodate whatever buffer size the current path needs to be represented.

    In 3.x, the max path length is hardcoded to 1026 bytes or characters, and an error is raised if the current path length is larger than that. Even on systems where MAX_PATH is 1024 (a common value), it is still valid to create paths larger than that (using e.g. os.mkdir()).

    The attached script shows that os.getcwd() works with a 1032-character path in 2.x, but fails in 3.x.

    @pitrou pitrou added extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error labels Jul 13, 2010
    @pitrou
    Copy link
    Member Author

    pitrou commented Jul 13, 2010

    Even on systems where MAX_PATH is 1024 (a common value), it is still
    valid to create paths larger than that (using e.g. os.mkdir()).

    It seems I am mistaken on that. MAX_PATH is actually 4096 on the Linux system I am testing on. Calling getcwd() in a path longer than that fails with ENAMETOOLONG.

    Still, 1026 shouldn't be the hard coded max length.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Jul 13, 2010

    Just as a reminder: In 2.x, posix_getcwdu() also uses a buffer of size
    1026.

    @pitrou
    Copy link
    Member Author

    pitrou commented Jul 13, 2010

    Just as a reminder: In 2.x, posix_getcwdu() also uses a buffer of size
    1026.

    I suppose the implementation was simply copied into py3k, then.
    Still, it's not a very good idea and it will also be a regression when
    porting scripts from 2.x to 3.x.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Jul 20, 2010

    Closed bpo-6817 as a duplicate of this one. There are some patches in
    that issue.

    @terryjreedy
    Copy link
    Member

    On WinXP, 3.1, I get
    ...
    mkdir: 242
    Traceback (most recent call last):
      File "C:\Programs\Python31\misc\t1.py", line 14, in <module>
        os.mkdir(s)
    WindowsError: [Error 206] The filename or extension is too long: 'C:\\Programs\\Python31\\misc\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab'

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Jul 23, 2010

    Terry J. Reedy <report@bugs.python.org> wrote:
    > mkdir: 242
    > Traceback (most recent call last):
    >   File "C:\Programs\Python31\misc\t1.py", line 14, in <module>
    >     os.mkdir(s)
    > WindowsError: [Error 206] The filename or extension is too long: 'C:\\Programs\\Python31\\misc\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab\\aaaaab'

    On Windows MAX_PATH seems to be 260 characters:

    http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx

    @vstinner
    Copy link
    Member

    Patch based on Python 2 source code, but raises a MemoryError (instead of an OSError) on memory allocation failure.

    With my patch, bigpath.py ends with "cwd: 1028 ...aab/aaaaab" with Python Python 3.2. Same result with Python 2.6. 1028 is bigger than 1026 (previous hardcoded max length in bytes including nul byte).

    @vstinner
    Copy link
    Member

    I'm not sure that PyMem_Realloc(NULL, size) is always equivalent to PyMem_Malloc(size). And I don't really know why I'm using PyMem_* instead of malloc() / free() :-) I suppose that Python has a faster memory allocator, or that it has better checks when compiled with pydebug?

    @pitrou
    Copy link
    Member Author

    pitrou commented Jul 27, 2010

    It's not ok to call PyMem_* functions when the GIL is released. You should only release the GIL around the call to the system getcwd().

    I suppose that Python has a faster memory allocator, or that it has
    better checks when compiled with pydebug?

    In this case it doesn't really make a difference, since all allocations larger than 256 bytes are delegated to the system allocator.

    @vstinner
    Copy link
    Member

    New version of the patch avoiding PyMem_*() functions to avoid a possible race condition (issue with the GIL).

    @vstinner
    Copy link
    Member

    Antoine asked me why not using a buffer of MAX_PATH+1 (instead of a dynamic buffer size). I don't know, I just copied/pasted the code from Python2. Extract of getcwd() manpage:

    Note that on some systems, PATH_MAX may not be a compile-time
    constant; furthermore, its value may depend on the file system,
    see pathconf(3).

    It's maybe to support strange OS like Hurd :-) (Hurd has no hardcoded limits).

    Most of the time, the first realloc() should be enough.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Jul 28, 2010

    For 2.x, unlimited path lengths were apparently introduced in bpo-2722.
    This strategy does not work on Solaris and OpenBSD (bpo-9185).

    FreeBSD also seems to support arbitrarily long paths. I would be somewhat
    surprised though if anyone used them in practice. APUE (second edition)
    uses PATH_MAX if it's available in limits.h.

    @vstinner
    Copy link
    Member

    Simpler patch replacing 1026 constant by MAXPATHLEN. On my Linux box, MAXPATHLEN is 4096 and os.pathconf('/', 'PC_PATH_MAX') returns 4096. I am able to get a path of 4095 bytes using the patch.

    @vstinner
    Copy link
    Member

    You may use get_current_dir_name() which allocates the memory for us.

    I can adapt os_getcwd_buffer-2.patch to support Solaris/OpenBSD, but do we need a dynamic buffer? (do we need to support OS without PATH_MAX)

    @vstinner
    Copy link
    Member

    bigpath2.py: script to check the maximum path length of os.getcwd().

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Jun 17, 2011

    I can adapt os_getcwd_buffer-2.patch to support Solaris/OpenBSD, but
    do we need a dynamic buffer? (do we need to support OS without
    PATH_MAX)

    From a practicality point of view, we need to make no change at all:
    nobody sane ever has a current working directory path of more than
    1000 characters. Even if people have very long path names, they
    don't make them the current working directory.

    So if anything is changed, it's for purity only. Then, for purity,
    we should get it right and support any path that the operating system
    supports.

    @pitrou
    Copy link
    Member Author

    pitrou commented Jun 17, 2011

    From a practicality point of view, we need to make no change at all:
    nobody sane ever has a current working directory path of more than
    1000 characters. Even if people have very long path names, they
    don't make them the current working directory.

    I don't see why that wouldn't be the case. They probably don't change to these directories *by hand*, but they can have scripts that cd into such a directory before doing stuff inside it. And these scripts can be written in Python.

    @vstinner
    Copy link
    Member

    For the record, os.getcwd() of Python 2 was improved by 96adf96d861a (issue bpo-2722) to use a dynamic buffer.

    @vstinner
    Copy link
    Member

    test_posix.py of Python 3 contains the test, but the test is "disabled" (dead code):

    def test_getcwd_long_pathnames(self):
        if hasattr(posix, 'getcwd'):
            dirname = 'getcwd-test-directory-0123456789abcdef-01234567890abcdef'
            curdir = os.getcwd()
            base_path = os.path.abspath(support.TESTFN) + '.getcwd'
        try:
            os.mkdir(base_path)
            os.chdir(base_path)
        except:
    

    # Just returning nothing instead of the SkipTest exception,
    # because the test results in Error in that case.
    # Is that ok?
    # raise unittest.SkipTest("cannot create directory for testing")
    return

                def _create_and_do_getcwd(dirname, current_path_length = 0):
                    try:
                        os.mkdir(dirname)
                    except:
                        raise unittest.SkipTest("mkdir cannot create directory sufficiently deep for getcwd test")
    
                    os.chdir(dirname)
                    try:
                        os.getcwd()
                        if current_path_length < 1027:
                            _create_and_do_getcwd(dirname, current_path_length + len(dirname) + 1)
                    finally:
                        os.chdir('..')
                        os.rmdir(dirname)
    
                _create_and_do_getcwd(dirname)
        finally:
            os.chdir(curdir)
            support.rmtree(base_path)
    

    See the issue bpo-17516 for removal of dead code.

    @worr
    Copy link
    Mannequin

    worr mannequin commented Apr 14, 2015

    Revisiting this, I've updated python3 to calculate this and use gradual dynamic allocation like the python2 implementation.

    @worr
    Copy link
    Mannequin

    worr mannequin commented Apr 21, 2015

    I've incorporated some of the feedback from the reviews into this new patch. I used the PyMem_Raw* functions to do allocation to avoid having to acquire the GIL and also avoid complciations from the builtin memory allocator, since I'm not using python objects.

    I have also fixed a memory leak in my original patch, as well as a case where OSes with a small MAX_PATH fail with ENAMETOOLONG

    @worr
    Copy link
    Mannequin

    worr mannequin commented Apr 24, 2015

    I've updated the patch with the comments from the review

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 24, 2015

    New changeset abf1f3ae4fa8 by Victor Stinner in branch '3.4':
    Issue bpo-9246: On POSIX, os.getcwd() now supports paths longer than 1025 bytes
    https://hg.python.org/cpython/rev/abf1f3ae4fa8

    New changeset b871ace5c58f by Victor Stinner in branch 'default':
    (Merge 3.4) Issue bpo-9246: On POSIX, os.getcwd() now supports paths longer than
    https://hg.python.org/cpython/rev/b871ace5c58f

    @vstinner
    Copy link
    Member

    I've updated the patch with the comments from the review

    Thanks William for your contribution, I commited your fix.

    I just made a minor change on "if (cwd && use_bytes) {": you forgot to remove test now useless test on cwd, and I dropped { and } to make to short more readable (Note: the PEP-7 requires to put "}" and "else {" on two lines).

    Ok, I just took 5 years to get Python 2 features in Python 3 :-)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants