Author vstinner
Recipients HWJ, amaury.forgeotdarc, benjamin.peterson, pitrou, vstinner
Date 2008-08-21.08:20:40
Message-id <1219306842.92.0.394002937162.issue3187@psf.upfronthosting.co.za>
In-reply-to
Content
If the filename cannot be encoded correctly in the system charset, 
that's not really a problem. The goal is to be able to use open(), 
shutil.copyfile(), os.unlink(), etc. with the given filename.

orig = filename from the kernel (bytes)
filename = filename from listdir() (str)
dest = filename to the kernel (bytes)

The goal is to get orig == dest. In my program Hachoir, to work around 
this problem I store the original filename (bytes) and convert it to 
unicode with character replacements (e.g. replacing each invalid byte 
sequence with "?"). So the bytes string is used for open(), 
unlink(), ... and the unicode string is displayed on stdout for the 
user.
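
A rough illustration of that workaround follows. It is a sketch, not Hachoir's actual code: the helper name display_name and the UTF-8 charset are assumptions made for the example.

```python
def display_name(raw: bytes, charset: str = "utf-8") -> str:
    """Decode a filename for display, replacing undecodable bytes with '?'."""
    # errors="replace" substitutes U+FFFD for each invalid sequence;
    # map that to "?" to match the behaviour described above.
    return raw.decode(charset, errors="replace").replace("\ufffd", "?")


orig = b"caf\xe9.txt"        # Latin-1 encoded name, invalid as UTF-8
shown = display_name(orig)   # lossy string, for the user only
print(shown)
# open(orig), os.unlink(orig), ... would keep using the untouched bytes.
```

The display string is lossy on purpose: it is never passed back to the kernel, so the replacement characters cannot break orig == dest.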

IMHO, the best solution is to create a class like this:

class Filename:
    def __init__(self, orig):
        self.as_bytes = orig
        self.as_str = myformat(orig)
    def __str__(self):
        return self.as_str
    def __bytes__(self):
        return self.as_bytes

New problem: I guess that the functions operating on filenames 
(os.path.*) will have to support this new type (the Filename class).
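
A runnable variant of the class might look like the sketch below. Two things in it are assumptions rather than part of the proposal: the body of myformat (left undefined in the message, filled in here with a UTF-8 decode that maps bad bytes to "?"), and the __fspath__ method, which relies on the path-like protocol that only exists in Python 3.6 and later.

```python
import os


def myformat(raw: bytes, charset: str = "utf-8") -> str:
    # Hypothetical stand-in for the helper left undefined above:
    # undecodable bytes become "?" in the display string.
    return raw.decode(charset, errors="replace").replace("\ufffd", "?")


class Filename:
    def __init__(self, orig: bytes):
        self.as_bytes = orig            # exact bytes from the kernel
        self.as_str = myformat(orig)    # lossy, for display only

    def __str__(self) -> str:
        return self.as_str

    def __bytes__(self) -> bytes:
        return self.as_bytes

    def __fspath__(self) -> bytes:
        # Assumption: path-like protocol (Python 3.6+), not in the
        # original proposal. Lets os.fspath() recover the bytes.
        return self.as_bytes


name = Filename(b"caf\xe9.txt")
dest = os.fspath(name)   # bytes that would be handed back to the kernel
print(str(name), dest == name.as_bytes)
```

With such a protocol, functions that already call os.fspath() on their argument would get the original bytes back without needing to know about the Filename type itself.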