open() rejects bytes as filename #47938

Closed
DLitz mannequin opened this issue Aug 26, 2008 · 2 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments


DLitz mannequin commented Aug 26, 2008

BPO 3688
Nosy @amauryfa
Superseder
  • bpo-3187: os.listdir can return byte strings
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2008-08-26.17:14:06.776>
    created_at = <Date 2008-08-26.17:06:01.316>
    labels = ['type-bug', 'library']
    title = 'open() rejects bytes as filename'
    updated_at = <Date 2008-08-26.17:14:06.753>
    user = 'https://bugs.python.org/dlitz'

    bugs.python.org fields:

    activity = <Date 2008-08-26.17:14:06.753>
    actor = 'amaury.forgeotdarc'
    assignee = 'none'
    closed = True
    closed_date = <Date 2008-08-26.17:14:06.776>
    closer = 'amaury.forgeotdarc'
    components = ['Library (Lib)']
    creation = <Date 2008-08-26.17:06:01.316>
    creator = 'dlitz'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 3688
    keywords = []
    message_count = 2.0
    messages = ['71986', '71988']
    nosy_count = 2.0
    nosy_names = ['amaury.forgeotdarc', 'dlitz']
    pr_nums = []
    priority = 'normal'
    resolution = 'duplicate'
    stage = None
    status = 'closed'
    superseder = '3187'
    type = 'behavior'
    url = 'https://bugs.python.org/issue3688'
    versions = ['Python 3.0']


    DLitz mannequin commented Aug 26, 2008

    On Linux/ext3, filenames are stored natively as sequences of octets. On
    Win32/NTFS, they are stored natively as sequences of Unicode code points.

    In Python 2.x, the way to unambiguously open a particular file was to
    pass the filename as a str object on Linux/ext3 and as a unicode object
    on Win32/NTFS. os.listdir(".") would return every filename as a str
    object, and os.listdir(u".") would return every filename as a unicode
    object, decoded according to the current locale settings, except for
    filenames that couldn't be decoded that way, which were returned as
    str objects.

    Consider this bash script (executed on Linux under a UTF-8 locale):

    export LC_CTYPE=en_CA.UTF-8  # requires the en_CA.UTF-8 locale to be built
    mkdir /tmp/foo
    cd /tmp/foo
    touch $'UTF-8 compatible filename\xc2\xa2'
    touch $'UTF-8 incompatible filename\xc0'
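
For readers who want to reproduce the setup without bash, here is a rough Python equivalent of the script above. It is only a sketch and is not part of the original report: it assumes a current Python 3 release on a Linux filesystem that stores names as raw octets, and the helper name `make_test_files` and the use of a temporary directory instead of /tmp/foo are illustrative choices.

```python
import os
import tempfile

# Byte-for-byte names from the bash script above.
COMPATIBLE = b"UTF-8 compatible filename\xc2\xa2"   # valid UTF-8 (U+00A2)
INCOMPATIBLE = b"UTF-8 incompatible filename\xc0"   # not valid UTF-8


def make_test_files(directory: bytes) -> list[bytes]:
    """Create both empty files in `directory`; return the sorted bytes listing.

    Hypothetical helper for illustration only.
    """
    for name in (COMPATIBLE, INCOMPATIBLE):
        # os.open accepts a bytes path and hands the octets to the OS unchanged.
        fd = os.open(os.path.join(directory, name), os.O_CREAT | os.O_WRONLY, 0o644)
        os.close(fd)
    # A bytes argument makes os.listdir return bytes names, with no decoding.
    return sorted(os.listdir(directory))


if __name__ == "__main__":
    # A throwaway directory stands in for /tmp/foo (an assumption of this sketch).
    with tempfile.TemporaryDirectory() as tmp:
        print(make_test_files(os.fsencode(tmp)))
```

Working with bytes paths throughout keeps the locale out of the picture, so the second name is created exactly as the `touch` command would create it.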

    Under Python 2.5.2, you get this:
      >>> import os
      >>> os.listdir(u".")
      ['UTF-8 incompatible filename\xc0', u'UTF-8 compatible filename\xa2']
      >>> os.listdir(".")
      ['UTF-8 incompatible filename\xc0', 'UTF-8 compatible filename\xc2\xa2']
      >>> [open(f, "r") for f in os.listdir(u".")]
      [<open file 'UTF-8 incompatible filename\xc0', mode 'r' at 0xb7cee578>,
       <open file 'UTF-8 compatible filename¢', mode 'r' at 0xb7cee6e0>]
    
    Under Python 3.0b3, you get this:
      >>> import os
      >>> os.listdir(".")
      [b'UTF-8 incompatible filename\xc0', 'UTF-8 compatible filename¢']
      >>> os.listdir(b".")
      [b'UTF-8 incompatible filename\xc0', b'UTF-8 compatible filename\xc2\xa2']
      >>> [open(f, "r") for f in os.listdir(".")]
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "<stdin>", line 1, in <listcomp>
        File "/home/dwon/python3.0b3/lib/python3.0/io.py", line 284, in __new__
          return open(*args, **kwargs)
        File "/home/dwon/python3.0b3/lib/python3.0/io.py", line 184, in open
          raise TypeError("invalid file: %r" % file)
      TypeError: invalid file: b'UTF-8 incompatible filename\xc0'

    This behaviour of open() makes it impossible to write code that opens
    arbitrarily-named files on Linux/ext3.
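
For comparison, the following sketch shows the bytes-in, bytes-out pattern this report asks for. It is not taken from the issue and does not describe 3.0b3: it assumes a current Python 3 release on Linux, where `open()` accepts a bytes path, and the helper name `read_all` is hypothetical.

```python
import os
import tempfile


def read_all(directory: bytes) -> dict[bytes, bytes]:
    """Open every regular file in `directory` by its raw bytes name.

    Hypothetical helper for illustration only.
    """
    contents = {}
    for name in os.listdir(directory):        # bytes in, bytes out
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            with open(path, "rb") as f:       # bytes path passed straight to open()
                contents[name] = f.read()
    return contents


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = os.fsencode(tmp)
        bad = b"UTF-8 incompatible filename\xc0"
        with open(os.path.join(root, bad), "wb") as f:
            f.write(b"hello")
        print(read_all(root))
        # On POSIX, str paths can also round-trip: os.fsdecode maps the
        # undecodable byte to a lone surrogate and os.fsencode reverses it.
        assert os.fsencode(os.fsdecode(bad)) == bad
```

Because no filename is ever decoded, the name that is invalid under UTF-8 is handled the same way as any other.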

    @amauryfa
    Member

    This is actively being discussed (and developed) in bpo-3187.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022