Escape the literal part of the path for glob() #85215

Closed
serhiy-storchaka opened this issue Jun 19, 2020 · 6 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@serhiy-storchaka
Member

BPO 41043
Nosy @encukou, @serhiy-storchaka
PRs
  • bpo-41043: Escape literal part of the path for glob(). #20994
  • [3.9] bpo-41043: Escape literal part of the path for glob(). (GH-20994). #21275
  • [3.8] bpo-41043: Escape literal part of the path for glob(). (GH-20994). #21277
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2020-07-02.07:06:05.589>
    created_at = <Date 2020-06-19.21:16:43.723>
    labels = ['type-bug', '3.8', '3.9', '3.10', '3.7', 'library']
    title = 'Escape the literal part of the path for glob()'
    updated_at = <Date 2020-07-02.07:06:05.588>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2020-07-02.07:06:05.588>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-07-02.07:06:05.589>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2020-06-19.21:16:43.723>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 41043
    keywords = ['patch']
    message_count = 6.0
    messages = ['371903', '371923', '372793', '372805', '372808', '372809']
    nosy_count = 2.0
    nosy_names = ['petr.viktorin', 'serhiy.storchaka']
    pr_nums = ['20994', '21275', '21277']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue41043'
    versions = ['Python 3.7', 'Python 3.8', 'Python 3.9', 'Python 3.10']

    @serhiy-storchaka
    Member Author

    It is common to use glob() as

        glob.glob(os.path.join(basedir, pattern))

    But it does not work correctly if the base directory contains special globbing characters ('*', '?', '['). This is uncommon, so in most cases the code works. But when the sources are moved to a directory whose name contains special characters, then built and tested, some tests fail:

    test test_tokenize failed -- Traceback (most recent call last):
      File "/home/serhiy/py/[cpython]/Lib/test/test_tokenize.py", line 1615, in test_random_files
        testfiles.remove(os.path.join(tempdir, "test_unicode_identifiers.py"))
    ValueError: list.remove(x): x not in list
    
    test test_multiprocessing_fork failed -- Traceback (most recent call last):
      File "/home/serhiy/py/[cpython]/Lib/test/_test_multiprocessing.py", line 4272, in test_import
        modules = self.get_module_names()
      File "/home/serhiy/py/[cpython]/Lib/test/_test_multiprocessing.py", line 4267, in get_module_names
        modules.remove('multiprocessing.__init__')
    ValueError: list.remove(x): x not in list
    
    test test_bz2 failed -- Traceback (most recent call last):
      File "/home/serhiy/py/[cpython]/Lib/test/test_bz2.py", line 740, in testDecompressorChunksMaxsize
        self.assertFalse(bzd.needs_input)
    AssertionError: True is not false
    
    test test_multiprocessing_forkserver failed -- Traceback (most recent call last):
      File "/home/serhiy/py/[cpython]/Lib/test/_test_multiprocessing.py", line 4272, in test_import
        modules = self.get_module_names()
      File "/home/serhiy/py/[cpython]/Lib/test/_test_multiprocessing.py", line 4267, in get_module_names
        modules.remove('multiprocessing.__init__')
    ValueError: list.remove(x): x not in list
    
    test test_multiprocessing_spawn failed -- Traceback (most recent call last):
      File "/home/serhiy/py/[cpython]/Lib/test/_test_multiprocessing.py", line 4272, in test_import
        modules = self.get_module_names()
      File "/home/serhiy/py/[cpython]/Lib/test/_test_multiprocessing.py", line 4267, in get_module_names
        modules.remove('multiprocessing.__init__')
    ValueError: list.remove(x): x not in list

    The proposed PR adds glob.escape() to the above code:

        glob.glob(os.path.join(glob.escape(basedir), pattern))
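To make the failure mode concrete, here is a minimal sketch that reproduces it in a temporary directory. The directory name `[cpython]` comes from the tracebacks above; the helper `glob_in` and the file `a.py` are hypothetical names used only for illustration, not part of the PR.

```python
import glob
import os
import tempfile


def glob_in(basedir, pattern):
    """Match `pattern` inside `basedir`, treating `basedir` as literal text.

    Hypothetical helper: it only wraps the one-line fix proposed above.
    """
    return glob.glob(os.path.join(glob.escape(basedir), pattern))


with tempfile.TemporaryDirectory() as tmp:
    # A base directory whose name contains a globbing character.
    basedir = os.path.join(tmp, "[cpython]")
    os.mkdir(basedir)
    open(os.path.join(basedir, "a.py"), "w").close()

    # Unescaped: "[cpython]" is read as a character class that matches
    # a single character, so the real directory is never visited.
    unescaped = glob.glob(os.path.join(basedir, "*.py"))

    # Escaped: the base directory is matched literally, and only the
    # trailing pattern is interpreted as a glob.
    escaped = glob_in(basedir, "*.py")

print(unescaped)
print(escaped)
```

Only the literal part of the path is escaped; escaping the whole joined string would also neutralize the `*` in the pattern.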
    

    @serhiy-storchaka serhiy-storchaka added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jun 19, 2020
    @serhiy-storchaka
    Member Author

    New changeset 9355868 by Serhiy Storchaka in branch 'master':
    bpo-41043: Escape literal part of the path for glob(). (GH-20994)
    9355868

    @encukou
    Member

    encukou commented Jul 1, 2020

    Would it be worth it to add a "base" keyword argument to glob.glob?

    @serhiy-storchaka
    Member Author

    It may not be exactly what you meant, but see bpo-38144. This issue was actually opened after I looked at how that feature could be used in the stdlib and found that most uses of glob() are vulnerable.

    @serhiy-storchaka
    Member Author

    New changeset ecfecc2 by Serhiy Storchaka in branch '3.9':
    [3.9] bpo-41043: Escape literal part of the path for glob(). (GH-20994). (GH-21275)
    ecfecc2

    @serhiy-storchaka
    Member Author

    New changeset e738962 by Serhiy Storchaka in branch '3.8':
    [3.8] bpo-41043: Escape literal part of the path for glob(). (GH-20994). (GH-21277)
    e738962

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022