Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os.getcwdb() doesn't implement PEP 528 (UTF-8) on Windows #81593

Closed
vstinner opened this issue Jun 26, 2019 · 21 comments
Closed

os.getcwdb() doesn't implement PEP 528 (UTF-8) on Windows #81593

vstinner opened this issue Jun 26, 2019 · 21 comments
Assignees
Labels
3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir

Comments

@vstinner
Copy link
Member

BPO 37412
Nosy @vstinner, @eryksun, @zooba, @ZackerySpytz, @miss-islington
PRs
  • bpo-37412: os.getcwdb() now uses UTF-8 on Windows #14396
  • [3.8] bpo-37412: os.getcwdb() now uses UTF-8 on Windows (GH-14396) #14399
  • bpo-37412: Fix os.getcwd() for long path on Windows #14424
  • bpo-37412: pythoninfo: add Windows long paths #14434
  • [3.8] bpo-37412: Fix os.getcwd() for long path on Windows (GH-14424) #14451
  • bpo-37412: Fix test_os.test_getcwd_long_path() on macOS #14452
  • [3.8] bpo-37412: Fix test_os.test_getcwd_long_path() on macOS (GH-14452) #14454
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/vstinner'
    closed_at = <Date 2019-07-01.16:15:37.239>
    created_at = <Date 2019-06-26.14:28:07.396>
    labels = ['3.8', 'library', '3.9']
    title = "os.getcwdb() doesn't implement PEP 528 (UTF-8) on Windows"
    updated_at = <Date 2019-07-01.16:30:21.942>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2019-07-01.16:30:21.942>
    actor = 'vstinner'
    assignee = 'vstinner'
    closed = True
    closed_date = <Date 2019-07-01.16:15:37.239>
    closer = 'steve.dower'
    components = ['Library (Lib)']
    creation = <Date 2019-06-26.14:28:07.396>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 37412
    keywords = ['patch']
    message_count = 21.0
    messages = ['346620', '346621', '346622', '346626', '346628', '346629', '346636', '346648', '346672', '346708', '346743', '346825', '346826', '346832', '346833', '346834', '346835', '346843', '346847', '347037', '347041']
    nosy_count = 5.0
    nosy_names = ['vstinner', 'eryksun', 'steve.dower', 'ZackerySpytz', 'miss-islington']
    pr_nums = ['14396', '14399', '14424', '14434', '14451', '14452', '14454']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue37412'
    versions = ['Python 3.8', 'Python 3.9']

    @vstinner
    Copy link
    Member Author

    On Windows, os.getcwdb() is implemented using getcwd() on Windows. This function uses the ANSI code page, whereas PEP-529 "Change Windows filesystem encoding to UTF-8" is supposed to use UTF-8 for all bytes paths and filenames. Moreover, this function emits a DeprecationWarning, whereas PEP-529 was supposed to avoid the need to deprecated bytes paths and filenames on Windows.

    I guess that it was forgotten in the implementation of the PEP-529. Or was it a deliberate choice?

    Attached PR modify os.getcwdb() to use UTF-8.

    @vstinner vstinner added 3.9 only security fixes stdlib Python modules in the Lib dir labels Jun 26, 2019
    @vstinner
    Copy link
    Member Author

    My PR 14396 modify os.getcwdb() to use UTF-8 and it removes the deprecation warning.

    @ZackerySpytz
    Copy link
    Mannequin

    ZackerySpytz mannequin commented Jun 26, 2019

    See also bpo-32920.

    @zooba
    Copy link
    Member

    zooba commented Jun 26, 2019

    Definitely just an oversight. Thanks for fixing this!

    I think it can go to 3.8 as well. Thoughts?

    @vstinner
    Copy link
    Member Author

    I think it can go to 3.8 as well. Thoughts?

    Oh, 3.8 is not released yet. Well, it's more or less a new feature, or a bugfix, depending on the point of view.

    I would not be comfortable to change the encoding in 3.8.1, but it should be fine to do the change in 3.8.

    @vstinner
    Copy link
    Member Author

    I retargeted my PR 14396 to Python 3.8.

    @vstinner vstinner added 3.8 only security fixes and removed 3.9 only security fixes labels Jun 26, 2019
    @vstinner
    Copy link
    Member Author

    New changeset 689830e by Victor Stinner in branch 'master':
    bpo-37412: os.getcwdb() now uses UTF-8 on Windows (GH-14396)
    689830e

    @miss-islington
    Copy link
    Contributor

    New changeset 63429c8 by Miss Islington (bot) in branch '3.8':
    bpo-37412: os.getcwdb() now uses UTF-8 on Windows (GH-14396)
    63429c8

    @vstinner
    Copy link
    Member Author

    Ok, it's merged into 3.8. Thanks for the review Steve. One less deprecation warning.

    @vstinner vstinner added the 3.9 only security fixes label Jun 26, 2019
    @eryksun
    Copy link
    Contributor

    eryksun commented Jun 27, 2019

    This update breaks long-path support in Windows. It includes the following unnecessary check, which is using the wrong comparison operator:

        if (len >= PY_SSIZE_T_MAX / sizeof(wchar_t))
    

    PyMem_RawMalloc already checks this and returns NULL if size > (size_t)PY_SSIZE_T_MAX. This bug is causing a MemoryError with long paths:

        >>> p = 'C:/Temp/longpath' + ('/' + 'a' * 255) * 9
        >>> os.chdir(p)
        >>> len(os.getcwd())
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        MemoryError

    @vstinner
    Copy link
    Member Author

    This update breaks long-path support in Windows.

    Oops.

    This is the risk of not testing a feature :-) My PR added very basic tests for getcwd() and getcwdb(). I wrote PR 14424 to not only fix the integer overflow check, but also to add a new unit tests for "long path".

    @vstinner vstinner reopened this Jun 27, 2019
    @vstinner
    Copy link
    Member Author

    New changeset ec3e20a by Victor Stinner in branch 'master':
    bpo-37412: Fix os.getcwd() for long path on Windows (GH-14424)
    ec3e20a

    @vstinner
    Copy link
    Member Author

    New changeset 64580da by Victor Stinner in branch 'master':
    bpo-37412: pythoninfo: add Windows long paths (GH-14434)
    64580da

    @vstinner
    Copy link
    Member Author

    New changeset 68c1c39 by Victor Stinner (Miss Islington (bot)) in branch '3.8':
    bpo-37412: Fix os.getcwd() for long path on Windows (GH-14424) (GH-14451)
    68c1c39

    @vstinner
    Copy link
    Member Author

    Ok, the regression should now be fixed and test_os now also tests os.getcwd() with a "long path" :-) Thank a lot Eryk Sun for your great advices on reviews!

    @vstinner
    Copy link
    Member Author

    Oh. Now the test fails on macOS:

    https://buildbot.python.org/all/#/builders/145/builds/2021
    ======================================================================
    FAIL: test_getcwd_long_path (test.test_os.MiscTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/Users/buildbot/buildarea/3.x.billenstein-sierra/build/Lib/test/test_os.py", line 115, in test_getcwd_long_path
        self.assertEqual(cwd, expected)
    AssertionError: '/private/var/folders/sy/9hwmqyx14s11577cvgzwf2v40000gt/T/tmpchlt91ay' != '/var/folders/sy/9hwmqyx14s11577cvgzwf2v40000gt/T/tmpchlt91ay'
    - /private/var/folders/sy/9hwmqyx14s11577cvgzwf2v40000gt/T/tmpchlt91ay
    ? 

    + /var/folders/sy/9hwmqyx14s11577cvgzwf2v40000gt/T/tmpchlt91ay

    @vstinner vstinner reopened this Jun 28, 2019
    @vstinner
    Copy link
    Member Author

    Fail on FreeBSD as well:

    https://buildbot.python.org/all/#/builders/168/builds/1222

    ======================================================================
    FAIL: test_getcwd_long_path (test.test_os.MiscTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/test_os.py", line 115, in test_getcwd_long_path
        self.assertEqual(cwd, expected)
    AssertionError: '/var/tmp/tmpq45z_od3' != '/tmp/tmpq45z_od3'
    - /var/tmp/tmpq45z_od3
    ? 

    + /tmp/tmpq45z_od3

    @vstinner
    Copy link
    Member Author

    New changeset 29f609e by Victor Stinner in branch 'master':
    bpo-37412: Fix test_os.test_getcwd_long_path() on macOS (GH-14452)
    29f609e

    @miss-islington
    Copy link
    Contributor

    New changeset e3761ca by Miss Islington (bot) in branch '3.8':
    bpo-37412: Fix test_os.test_getcwd_long_path() on macOS (GH-14452)
    e3761ca

    @zooba
    Copy link
    Member

    zooba commented Jul 1, 2019

    This is fixed now, right? The buildbots seem happy at least. I'll close.

    @zooba zooba closed this as completed Jul 1, 2019
    @vstinner
    Copy link
    Member Author

    vstinner commented Jul 1, 2019

    This is fixed now, right? The buildbots seem happy at least. I'll close.

    Yeah, I forgot to close. I got some issues with backports, but the two commits are now merged into 3.8 as well.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants