os.listdir can return byte strings #47437

Closed
HWJ mannequin opened this issue Jun 24, 2008 · 79 comments

@HWJ
Mannequin

HWJ mannequin commented Jun 24, 2008

BPO 3187
Nosy @gvanrossum, @loewis, @birkenfeld, @amauryfa, @pitrou, @vstinner, @benjaminp, @djc
Files
  • posix_path_bytes.patch: Patch posixpath.join() to support bytes
  • io_byte_filename.patch: open() allows bytes filename
  • fnmatch_bytes.patch: Patch fnmatch.filter() to accept bytes filenames
  • glob1_bytes.patch: Fix glob.glob() to accept invalid directory name
  • listdir_encoding_warning.patch
  • warn_at_the_end.patch
  • raise_decoding_errors.patch
  • force_unicode.patch
  • getcwd_bytes.patch: getcwd() returns bytes if unicode conversion fails
  • merge_os_getcwd_getcwdu.patch: Remove os.getcwdu(); os.getcwd(bytes=True) returns bytes
  • os_getcwdb.patch: Fix getcwd() (use PyUnicode_Decode) and create getcwdb()->bytes
  • python3_bytes_filename.patch: Patch for an initial support of bytes filename in Python3
  • setfsenc.diff
  • python3_bytes_filename-3.patch: Patch for an initial support of bytes filename in Python3 (version 3)
  • win32-bytes-filenames.patch
  • macpath.patch
  • library_os_doc.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/birkenfeld'
    closed_at = <Date 2008-10-07.07:12:42.880>
    created_at = <Date 2008-06-24.10:28:25.633>
    labels = ['type-bug', 'docs']
    title = 'os.listdir can return byte strings'
    updated_at = <Date 2008-10-07.07:12:42.879>
    user = 'https://bugs.python.org/HWJ'

    bugs.python.org fields:

    activity = <Date 2008-10-07.07:12:42.879>
    actor = 'loewis'
    assignee = 'georg.brandl'
    closed = True
    closed_date = <Date 2008-10-07.07:12:42.880>
    closer = 'loewis'
    components = ['Documentation']
    creation = <Date 2008-06-24.10:28:25.633>
    creator = 'HWJ'
    dependencies = []
    files = ['11212', '11213', '11215', '11216', '11549', '11550', '11581', '11630', '11632', '11652', '11655', '11658', '11663', '11680', '11685', '11693', '11721']
    hgrepos = []
    issue_num = 3187
    keywords = ['patch']
    message_count = 79.0
    messages = ['68674', '68679', '68684', '68685', '68686', '68688', '68689', '70943', '70953', '71525', '71612', '71615', '71624', '71629', '71647', '71648', '71655', '71680', '71699', '71700', '71705', '71748', '71749', '71751', '71752', '71756', '71757', '71769', '71991', '72495', '73362', '73534', '73535', '73540', '73678', '73680', '73688', '73909', '73910', '73911', '73925', '73926', '73992', '73999', '74000', '74006', '74007', '74008', '74027', '74032', '74059', '74080', '74083', '74101', '74173', '74186', '74192', '74222', '74236', '74237', '74240', '74241', '74242', '74246', '74255', '74256', '74257', '74266', '74267', '74268', '74270', '74271', '74275', '74276', '74277', '74409', '74412', '74414', '74426']
    nosy_count = 13.0
    nosy_names = ['gvanrossum', 'loewis', 'georg.brandl', 'amaury.forgeotdarc', 'pitrou', 'vstinner', 'draghuram', 'benjamin.peterson', 'djc', 'HWJ', 'dlitz', 'zegreek', 'bboissin']
    pr_nums = []
    priority = 'critical'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue3187'
    versions = ['Python 3.0']

    @HWJ
    Mannequin Author

    HWJ mannequin commented Jun 24, 2008

    The script below produces 1664 lines of output before it bails out with
    Traceback (most recent call last):
      File "WalkBug.py", line 5, in <module>
        for Dir, SubDirs, Files in os.walk('/home/jarausch') :
      File "/usr/local/lib/python3.0/os.py", line 278, in walk
        for x in walk(path, topdown, onerror, followlinks):
      File "/usr/local/lib/python3.0/os.py", line 268, in walk
        if isdir(join(top, name)):
      File "/usr/local/lib/python3.0/posixpath.py", line 64, in join
        if b.startswith('/'):
    TypeError: expected an object with the buffer interface

    =========================
    file WalkBug.py:

    #!/usr/local/bin/python3.0

    import os
    
    for Dir, SubDirs, Files in os.walk('/home/jarausch') :
      print("processing {0:d} files in {1}".format(len(Files),Dir))

    @HWJ HWJ mannequin added type-crash A hard crash of the interpreter, possibly with a core dump stdlib Python modules in the Lib dir labels Jun 24, 2008
    @amauryfa
    Member

    Could you tell us what this 1665th line should be?
    Maybe the 1665th directory has something special (a filename with spaces
    or non-ascii chars...)

    Can you try with an older version of python?

    @benjaminp
    Contributor

    It's failing because he's giving a string to bytes.startswith when it
    requires a byte string or such.

    @amauryfa
    Member

    "he's giving a string"... the user simply called os.walk, which accepts
    strings AFAIK.

    We should discover what produced this bytestring. Does listdir() return
    a mixed list of strings and bytes?

    @amauryfa amauryfa removed the invalid label Jun 24, 2008
    @amauryfa amauryfa reopened this Jun 24, 2008
    @benjaminp
    Contributor

    It seems the conversion to unicode strings (PyUnicode vs PyBytes) was
    not complete in os.listdir. See the attached patch.

    @amauryfa
    Member

    The original problem seems to come from some Unix platform, but this
    patch only handles two cases:

    • on win32, when the argument is a bytestring.
    • on OS/2.

    And in both cases, the default (utf-8) conversion seems wrong. Something
    like cp1252 (the ANSI code page for Western Windows) would be more sensible.

    In the posix part of the function, there is the comment (2003-03-04):
    /* fall back to the original byte string, as
    discussed in patch bpo-683592 */
    btw, I find the penultimate message of this other thread very pleasant,
    in the py3k context... I suppose the conclusions would not be the same
    today.

    @HWJ
    Mannequin Author

    HWJ mannequin commented Jun 24, 2008

    > Could you tell us what this 1665th line should be?
    > Maybe the 1665th directory has something special (a filename with
    > spaces or non-ascii chars...)

    Yes, the next directory contains a filename with an iso-latin1 but
    non-ascii character.

    > Can you try with an older version of python?
    No problems - runs every night here

    The patch (applied to SVN GMT 13:30) does NOT help.

    @benjaminp benjaminp changed the title os.walk - strange bug os.listdir can return byte strings Jun 25, 2008
    @pitrou
    Member

    pitrou commented Aug 9, 2008

    Hmm, I suppose that while the filename is latin1-encoded,
    Py_FileSystemDefaultEncoding is "utf-8" and therefore os.listdir fails
    decoding the filename and falls back on returning a byte string.
    It was acceptable in Python 2.x but is a very annoying problem in py3k
    now that unicode and bytes objects can't be mixed together anymore. I'm
    bumping this to critical, although there is probably no clean solution.
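A minimal sketch of the failure mode described above, assuming a filename stored as Latin-1 bytes on a system whose filesystem encoding is UTF-8. The name `raw_name` and the helper `try_decode` are hypothetical and only mimic the "fall back to bytes" behaviour; they are not the code in os.listdir.

```python
# Hypothetical filename as the kernel returns it: "café" encoded in Latin-1.
raw_name = b"caf\xe9"

def try_decode(raw, encoding):
    """Return the decoded name, or the raw bytes if decoding fails
    (mimicking the fallback described in this comment)."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return raw

as_utf8 = try_decode(raw_name, "utf-8")      # decoding fails, bytes come back
as_latin1 = try_decode(raw_name, "latin-1")  # decoding succeeds

# Mixing the undecoded bytes with a str path fails in Python 3.
try:
    "/home/user" + "/" + as_utf8
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

The same `TypeError` family is what surfaces in the original traceback once a bytes entry reaches posixpath.join().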

    @pitrou pitrou added type-bug An unexpected behavior, bug, or error and removed type-crash A hard crash of the interpreter, possibly with a core dump labels Aug 9, 2008
    @benjaminp
    Contributor

    Let's make this a release blocker for RCs.

    @pitrou
    Member

    pitrou commented Aug 20, 2008

    See bpo-3616 for a consequence of this.

    @vstinner
    Member

    If the filename can not be encoded correctly in the system charset,
    it's not really a problem. The goal is to be able to use open(),
    shutil.copyfile(), os.unlink(), etc. with the given filename.

    orig = filename from the kernel (bytes)
    filename = filename from listdir() (str)
    dest = filename to the kernel (bytes)

    The goal is to get orig == dest. In my program Hachoir, to work around
    this problem I store the original filename (bytes) and convert it to
    unicode with character replacement (e.g. replacing each invalid byte
    sequence with "?"). So the bytes string is used for open(),
    unlink(), ... and the unicode string is displayed on stdout for the
    user.

    IMHO, the best solution is to create such a class:

    class Filename:
        def __init__(self, orig):
            self.as_bytes = orig
            self.as_str = myformat(orig)
        def __str__(self):
            return self.as_str
        def __bytes__(self):
            return self.as_bytes

    New problems: I guess that functions operating on filenames
    (os.path.*) will have to support this new type (Filename class).
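A runnable sketch of the class proposed above, under one assumption: the undefined `myformat()` is replaced here by decoding with `errors="replace"`, so undecodable bytes show up as U+FFFD in the display string. The `encoding` parameter is also an addition for illustration.

```python
class Filename:
    """Keeps the original bytes for system calls and a lossy str for display."""

    def __init__(self, orig, encoding="utf-8"):
        self.as_bytes = orig
        # Assumed stand-in for myformat(): bad bytes become U+FFFD.
        self.as_str = orig.decode(encoding, errors="replace")

    def __str__(self):
        return self.as_str

    def __bytes__(self):
        return self.as_bytes

# Latin-1 encoded name that is not valid UTF-8.
name = Filename(b"caf\xe9.txt")
```

Here `bytes(name)` gives back the exact kernel bytes, while `str(name)` is only suitable for showing to the user.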

    @pitrou
    Member

    pitrou commented Aug 21, 2008

    According to STINNER Victor <report@bugs.python.org>:

    > IMHO, the best solution is to create such a class:
    >
    > class Filename:
    >     def __init__(self, orig):
    >         self.as_bytes = orig
    >         self.as_str = myformat(orig)
    >     def __str__(self):
    >         return self.as_str
    >     def __bytes__(self):
    >         return self.as_bytes

    I agree that logically it's the right solution. It's also the most invasive. If
    that class is made a subclass of str, however, existing code shouldn't break
    more than it currently does.

    @vstinner
    Member

    I wrote a Filename class. I tried different approaches:

    • no parent class "class Filename: ..." -> I don't know how to make
      bytes(filename) work!? But it's the best option to avoid strange bugs
      (mixing bytes/str, remember Python 2.x...)
    • str parent class "class Filename(str): ..." -> doesn't work because
      os functions use the fake unicode filename before testing the bytes
      (real) filename
    • bytes parent class "class Filename(bytes): ..." -> that's the
      current implementation

    The idea is to encode str -> bytes (and not bytes -> str because we
    want to avoid problems with such conversions). So I reimplemented most
    bytes methods: __add__, __radd__, __contains__, startswith, endswith
    and index. The index method has no start/end arguments since the behaviour
    would differ from that of a real unicode string :-/

    I added an example of a fixed os.listdir(): it creates a Filename() object if
    we get bytes. Should we always create Filename objects? I don't think
    so.

    @pitrou
    Member

    pitrou commented Aug 21, 2008

    > • bytes parent class "class Filename(bytes): ..." -> that's the
    >   current implementation

    I don't think that makes sense (especially under Windows which has Unicode file
    APIs). os.listdir() and friends should really return str or str-like objects,
    not bytes-like objects with an additional __str__ method.

    > • str parent class "class Filename(str): ..." -> doesn't work because
    >   os functions use the fake unicode filename before testing the bytes
    >   (real) filename

    Well, of course, if we create a filename type, then all os functions must be
    adapted to accept it rather than assume str.

    All this is highly speculative of course, and if we really follow this course
    (i.e. create a filename type) it should probably be postponed to 3.1: too many
    changes with far-reaching consequences.

    @vstinner
    Member

    On Thursday 21 August 2008 14:55:43, Antoine Pitrou wrote:

    > > * bytes parent class "class Filename(bytes): ..." -> that's the
    > > current implementation
    >
    > I don't think that makes sense (especially under Windows which has Unicode
    > file APIs). os.listdir() and friends should really return str or str-like
    > objects, not bytes-like objects with an additional __str__ method.

    If we use "class Filename(str): ...", we have to ensure that all operations
    take care of the charset, because the unicode version is invalid and must not
    be used to access the file system. Dummy example: Filename()+"/" should not
    return str but raise an error or create a new filename.

    > Well, of course, if we create a filename type, then all os functions must
    > be adapted to accept it rather than assume str.

    If Filename has no parent class but is convertible to bytes(), os functions
    require no change and so we can fix it before final 3.0 ;-)

    @pitrou
    Member

    pitrou commented Aug 21, 2008

    > If Filename has no parent class but is convertible to bytes(), os
    > functions require no change and so we can fix it before final 3.0 ;-)

    This sounds highly optimistic.

    Also, I think it's wrong to introduce a string-like class with implicit
    conversion both to bytes and to str, while we have taken all measures to
    make sure that bytes/str exchangeability doesn't exist any more in py3k.

    @gvanrossum
    Member

    The proper work-around is for the app to pass bytes into os.listdir();
    then it will return bytes. It would be nice if open() etc. accepted
    bytes (as well as strings of course), at least on Unix, but not
    absolutely necessary -- the app could also just know the right encoding.

    I see two reasonable alternatives for what os.listdir() should return
    when the input is a string and one of the filenames can't be decoded:
    either omit it from the output list; or use errors='replace' in the
    encoding. Failing the entire os.listdir() call is not acceptable, and
    neither is returning a mixture of str and bytes instances.
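The work-around described above (pass bytes in, get bytes back) can be sketched as follows. The temporary directory and the file name `example.txt` are made up for the illustration, and `os.fsencode()` is used only as a convenient way to obtain a bytes path; it is not mentioned in this thread.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the listing is not empty.
    with open(os.path.join(tmp, "example.txt"), "w") as f:
        f.write("data")

    str_names = os.listdir(tmp)                 # str in, str out
    bytes_names = os.listdir(os.fsencode(tmp))  # bytes in, bytes out
```

An application that needs to reach every entry, whatever its encoding, can stay in bytes for all file operations and decode only for display.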

    @vstinner
    Member

    On Thursday 21 August 2008 18:17:47, Guido van Rossum wrote:

    > The proper work-around is for the app to pass bytes into os.listdir();
    > then it will return bytes.

    In my case, I would just like to remove a directory with shutil.rmtree(). I
    don't know whether it contains bytes or character filenames :-)

    > It would be nice if open() etc. accepted
    > bytes (as well as strings of course), at least on Unix, but not
    > absolutely necessary -- the app could also just know the right encoding.

    An invalid filename has no charset. It's just a "raw" byte string. So open(),
    unlink(), etc. have to accept byte strings. Maybe not in the Python version
    but at the low level (C version)?

    > I see two reasonable alternatives for what os.listdir() should return
    > when the input is a string and one of the filenames can't be decoded:
    > either omit it from the output list;

    It's not a good option: rmtree() will fail because the directory is not
    empty :-/

    > or use errors='replace' in the encoding.

    It will also fail because the filenames will be invalid (valid unicode strings
    but nonexistent file names :-/).

    > Failing the entire os.listdir() call is not acceptable, and
    > neither is returning a mixture of str and bytes instances.

    Ok, I have another suggestion:

    • *by default*, listdir() only returns str and raises an error (TypeError?)
      on an invalid filename
    • add an optional argument (a callback), e.g. "fallback_encoder", to catch
      such errors (similar to "onerror" from shutil.rmtree())

    Example of new listdir implementation (pseudo-code):

       charset = sys.getfilesystemencoding()
       dirobj = opendir(path)
       try:
          for bytesname in readdir(dirobj):
              try:
                  name = str(bytesname, charset)
              except UnicodeDecodeError:
                  name = fallback_encoder(bytesname)
              yield name
       finally:
          closedir(dirobj)

    The default fallback_encoder:

       def fallback_encoder(name):
          raise

    Keep raw bytes string:

       def fallback_encoder(name):
          return name

    Create my custom filename object:

       class Filename:
          ...
    
       def fallback_encoder(name):
          return Filename(name)

    If a callback is overkill, we can just add an option,
    e.g. "keep_invalid_filename=True", to ask listdir() to keep the bytes string if
    the conversion to unicode fails.

    In any case, open(), unlink(), etc. have to accept byte strings to be able to
    read, copy, or remove invalid filenames. In a perfect world, all filenames would
    be valid UTF-8 strings, but in the real world (think of The Matrix :-)), we have
    to support such strange cases...
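The callback proposal above can be sketched as runnable Python. This is only an illustration of the idea, not what was adopted: the names `decode_names`, `listdir_with_fallback` and `keep_bytes` are hypothetical, the listing is obtained with `os.listdir()` on a bytes path rather than opendir()/readdir(), and the default fallback here keeps the bytes instead of raising.

```python
import os
import sys

def keep_bytes(raw_name):
    """Fallback that keeps the undecodable name as raw bytes."""
    return raw_name

def decode_names(raw_names, fallback, encoding=None):
    """Decode each bytes name strictly; hand failures to `fallback`."""
    encoding = encoding or sys.getfilesystemencoding()
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(encoding))  # strict: raises on bad bytes
        except UnicodeDecodeError:
            result.append(fallback(raw))
    return result

def listdir_with_fallback(path, fallback=keep_bytes):
    """List `path`, decoding names and using `fallback` for bad ones."""
    return decode_names(os.listdir(os.fsencode(path)), fallback)

# One valid UTF-8 name and one Latin-1 name that UTF-8 cannot decode.
names = decode_names([b"ok.txt", b"caf\xe9"], keep_bytes, encoding="utf-8")
```

A fallback that returns a wrapper object, such as the Filename class discussed earlier in the thread, would plug into the same `fallback` parameter.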

    @pitrou
    Member

    pitrou commented Oct 3, 2008

    On Friday 03 October 2008 at 11:43 +0000, STINNER Victor wrote:

    > STINNER Victor <victor.stinner@haypocalc.com> added the comment:
    >
    > > The most generic way of allowing all bytes-alike objects is to write:
    > > path = bytes(path)
    >
    > If you use that, any unicode may fail and the function will always return
    > unicode. The goal is to get:
    > func(bytes)->bytes
    > func(bytearray)->bytes (or maybe bytearray, it doesn't matter)
    > func(unicode)->unicode

    Then make it:

        path = path if isinstance(path, str) else bytes(path)

    @vstinner
    Member

    vstinner commented Oct 3, 2008

    path=path is useless in most cases (unicode path); this code is
    faster in both cases (bytes or unicode)!

        if not isinstance(path, str):
            path = bytes(path)

    • a if b else c: unicode=0.756730079651; bytes=1.93071103096
    • if test: path=...: unicode=0.681571006775; bytes=1.88843798637
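For reference, the second variant wrapped in a small function, as a sketch. The name `coerce_path` is hypothetical; the body is the two-line `if` form quoted above.

```python
def coerce_path(path):
    """Return str unchanged; turn any other bytes-like object into bytes."""
    if not isinstance(path, str):
        path = bytes(path)
    return path

from_str = coerce_path("a/b")
from_bytearray = coerce_path(bytearray(b"a/b"))
```

This gives the mapping discussed in the previous comment: str stays str, while bytes and bytearray both come out as bytes.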

    @loewis
    Mannequin

    loewis mannequin commented Oct 3, 2008

    I've committed sys.setfilesystemencoding as r66769.

    Declaring it as a documentation issue now. Not sure whether it should
    remain a release blocker; IMO, the documentation can still be produced
    after the release.

    @loewis loewis mannequin added docs Documentation in the Doc dir and removed stdlib Python modules in the Lib dir labels Oct 3, 2008
    @loewis loewis mannequin assigned birkenfeld and unassigned loewis Oct 3, 2008
    @gvanrossum
    Member

    Reducing priority to critical, it's just docs and tweaks from here.

    > You should also support bytearray() in ntpath:
    >
    > isinstance(path, (bytes, bytearray))

    No, you shouldn't. I changed my mind on this several times and in the
    end figured it's good enough to just support bytes and str instances.

    Amaury: I've reviewed your patch and ran test_ntpath.py on a Linux box.
    I get this traceback:

    ======================================================================
    ERROR: test_relpath (__main__.TestNtpath)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "Lib/test/test_ntpath.py", line 188, in test_relpath
        tester('ntpath.relpath("a")', 'a')
      File "Lib/test/test_ntpath.py", line 22, in tester
        gotResult = eval(fn)
      File "<string>", line 1, in <module>
      File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 530, in relpath
        start_list = abspath(start).split(sep)
      File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 499, in abspath
        path = join(os.getcwd(), path)
      File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 137, in join
        if b[:1] in seps:
    TypeError: 'in <string>' requires string as left operand, not bytes

    The fix is to change the fallback abspath to this code:

        def abspath(path):
            """Return the absolute version of a path."""
            if not isabs(path):
                if isinstance(path, bytes):
                    cwd = os.getcwdb()
                else:
                    cwd = os.getcwd()
                path = join(cwd, path)
            return normpath(path)

    Once you fix that please check it in!

    @gvanrossum
    Member

    Assigning to Amaury for Windows fix first.

    @gvanrossum gvanrossum assigned amauryfa and unassigned birkenfeld Oct 3, 2008
    @amauryfa
    Member

    amauryfa commented Oct 3, 2008

    Thanks for testing the non-Windows part of ntpath.
    Committed patch in r66777.

    Leaving the issue open: macpath.py should certainly be modified.

    @amauryfa amauryfa removed their assignment Oct 3, 2008
    @gvanrossum
    Member

    Sorry Amaury, but there's another issue.

    test_ntpath now fails when run with -bb:

    ======================================================================
    ERROR: test_expandvars (__main__.TestNtpath)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "Lib/test/test_ntpath.py", line 151, in test_expandvars
        tester('ntpath.expandvars("$foo bar")', "bar bar")
      File "Lib/test/test_ntpath.py", line 10, in tester
        gotResult = eval(fn)
      File "<string>", line 1, in <module>
      File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 344, in expandvars
        if c in ('\'', b'\''):   # no expansion within single quotes
    BytesWarning: Comparison between bytes and string

    ======================================================================
    ERROR: test_normpath (__main__.TestNtpath)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "Lib/test/test_ntpath.py", line 120, in test_normpath
        tester("ntpath.normpath('A//////././//.//B')", r'A\B')
      File "Lib/test/test_ntpath.py", line 10, in tester
        gotResult = eval(fn)
      File "<string>", line 1, in <module>
      File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 465, in normpath
        if comps[i] in ('.', '', b'.', b''):
    BytesWarning: Comparison between bytes and string

    ======================================================================
    ERROR: test_relpath (__main__.TestNtpath)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "Lib/test/test_ntpath.py", line 188, in test_relpath
        tester('ntpath.relpath("a")', 'a')
      File "Lib/test/test_ntpath.py", line 10, in tester
        gotResult = eval(fn)
      File "<string>", line 1, in <module>
      File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 534, in relpath
        start_list = abspath(start).split(sep)
      File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 504, in abspath
        return normpath(path)
      File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 465, in normpath
        if comps[i] in ('.', '', b'.', b''):
    BytesWarning: Comparison between bytes and string
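The warnings above come from testing a value against a tuple that mixes str and bytes literals. One way to avoid that, sketched here with hypothetical helper names (this is not the code committed to ntpath.py), is to pick the literals that match the type of the path before comparing.

```python
def current_dir_marker(path):
    """Return '.' with the same type as `path` (str or bytes)."""
    return b"." if isinstance(path, bytes) else "."

def drop_dot_components(path, sep):
    """Remove empty and '.' components, comparing like with like."""
    dot = current_dir_marker(path)
    empty = path[:0]  # '' for str, b'' for bytes
    parts = [p for p in path.split(sep) if p not in (dot, empty)]
    return sep.join(parts)

str_result = drop_dot_components("A//./B", "/")
bytes_result = drop_dot_components(b"A//./B", b"/")
```

Because every comparison is str against str or bytes against bytes, running such code under `-bb` should not trigger the BytesWarning shown in the tracebacks.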

    @gvanrossum
    Member

    FWIW, I don't see a need to change macpath.py -- it's only used for
    MacOS 9 and the occasional legacy app. OSX uses posixpath.py.

    @amauryfa
    Member

    amauryfa commented Oct 3, 2008

    Committed r66779: test_ntpath now passes with the -bb option.

    It seems that the Windows buildbots do not set -bb.

    @gvanrossum
    Member

    Thanks Amaury!

    On to Georg for doc tweaks. Summary:

    • all the os.path functions now work on bytes as well, on all platforms
    • only on Unix (but not OSX) do we recommend using bytes
    • os.getcwdu() no longer exists
    • os.getcwdb() returns bytes
    • os.listdir(<str>) skips undecodable entries (previously it returned a
      mixture of str and bytes instances)
    • open() accepts bytes as filename

    Stuff that didn't change but that you might want to mention:

    • all the syscalls in os support bytes args; readlink() and listdir()
      return bytes if the arg is bytes
    • getcwd() may raise UnicodeDecodeError

    Martin already documented sys.setfilesystemencoding().
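A short sketch exercising three items from the summary above: os.getcwdb(), os.path functions on bytes, and open() with a bytes filename. The temporary directory and the file name `note.txt` are invented for the example, and `os.fsencode()` is used only to obtain a bytes path.

```python
import os
import tempfile

cwd_bytes = os.getcwdb()  # bytes counterpart of os.getcwd()

with tempfile.TemporaryDirectory() as tmp:
    tmp_b = os.fsencode(tmp)
    target = os.path.join(tmp_b, b"note.txt")  # os.path works on bytes
    with open(target, "w") as f:               # open() accepts a bytes filename
        f.write("hello")
    with open(target) as f:
        content = f.read()
    joined_type = type(target)
```

The path stays bytes from start to finish, which is the usage recommended above for Unix when filenames may not be decodable.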

    @gvanrossum gvanrossum assigned birkenfeld and unassigned amauryfa Oct 3, 2008
    @amauryfa
    Member

    amauryfa commented Oct 3, 2008

    I have a patch for macpath.py nonetheless.
    Tested on Windows (of course ;-) but all functions are pure text
    manipulation, except realpath(). It was much easier than ntpath.py.

    I also added tests for three functions which were not exercised at all.

    @benjaminp
    Contributor

    Amaury, your patch looks good.

    @amauryfa
    Member

    amauryfa commented Oct 3, 2008

    Committed macpath.py in r66781.

    @vstinner
    Member

    vstinner commented Oct 6, 2008

    Would it be possible to close this issue since os.listdir() is fixed and
    many other related functions (posix, posixpath, ntpath, macpath, etc.)
    are also fixed? I propose to open new issues for new bugs since this
    issue is becoming a little bit long :)

    E.g. see new issues bpo-4035 and bpo-4036!

    @loewis
    Mannequin

    loewis mannequin commented Oct 6, 2008

    > Would it be possible to close this issue since os.listdir() is fixed and
    > many other related functions (posix, posixpath, ntpath, macpath, etc.)
    > are also fixed?

    IIUC, these fixes are still not complete: they lack documentation
    changes. Of course, it would have been better if the original patches
    already contained the necessary documentation and test suite changes.
    See msg74271 for what Guido considers the lacking documentation;
    you may find that other aspects also need documentation.

    As for test cases: it seems that those got waived, in the hurry.

    @vstinner
    Member

    vstinner commented Oct 6, 2008

    On Tuesday 07 October 2008 01:13:22, Martin v. Löwis wrote:

    > IIUC, these fixes are still not complete: they lack documentation
    > changes. (...) Of course, it would have been better if the original patches
    > already contained the necessary documentation and test suite changes.

    Most (or all) patches include new tests about bytes. Here is a patch for
    os.rst documentation about listdir(), getcwdb() and readlink().

    > See msg74271 for what Guido considers the lacking documentation;
    > you may find that other aspects also need documentation.

    I wrote a long document about bytes for filenames, but not only that. I'm still
    waiting for some contributors or reviewers:
    http://wiki.python.org/moin/Python3UnicodeDecodeError

    > As for test cases: it seems that those got waived, in the hurry.

    Can you be more precise? Which tests have to be improved/rewritten?

    @loewis
    Mannequin

    loewis mannequin commented Oct 7, 2008

    > Most (or all) patches include new tests about bytes. Here is a patch for
    > os.rst documentation about listdir(), getcwdb() and readlink().

    Thanks! Committed as r66829.

    I've added additional documentation in r66830, which should complete
    Guido's list of things to be documented. So the issue can be closed
    now.

    > > See msg74271 for what Guido considers the lacking documentation;
    > > you may find that other aspects also need documentation.
    >
    > I wrote a long document about bytes for filenames but not only. I'm still
    > waiting for some contributors or reviewers:
    > http://wiki.python.org/moin/Python3UnicodeDecodeError

    We should discuss that on python-dev, of course - the question is
    whether additional documentation patches are needed in response to
    this specific change.

    > > As for test cases: it seems that those got waived, in the hurry.
    >
    > Can you be more precise? Which tests have to be improved/rewritten?

    I was probably looking at the wrong patches (such as getcwd_bytes.patch,
    merge_os_getcwd_getcwdu.patch, etc); I now see that the final patch did
    have tests. I recommend that patches that get superseded by other
    patches are removed from the issue. They won't be deleted; it's still
    possible to navigate to them through the History at the bottom of the
    issue.

    @loewis loewis mannequin closed this as completed Oct 7, 2008
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022