Author haypo
Recipients HWJ, amaury.forgeotdarc, benjamin.peterson, haypo, pitrou
Date 2008-08-21.08:20:40
Message-id <1219306842.92.0.394002937162.issue3187@psf.upfronthosting.co.za>
Content
If the filename cannot be encoded correctly in the system charset, 
that is not really a problem. The goal is to be able to use open(), 
shutil.copyfile(), os.unlink(), etc. with the given filename.

orig = filename from the kernel (bytes)
filename = filename from listdir() (str)
dest = filename to the kernel (bytes)
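
To make these three names concrete, here is a minimal sketch of how a lossy conversion can make orig and dest differ. The sample bytes and the UTF-8 charset are assumptions chosen for illustration; they are not taken from the issue.

```python
# Assumed example: a Latin-1 encoded name seen on a UTF-8 system.
orig = b"caf\xe9.txt"                               # filename from the kernel (bytes)

# Lossy conversion to str: the invalid byte becomes U+FFFD.
filename = orig.decode("utf-8", errors="replace")   # what a str-based listing could hold

# Converting back for the kernel does not restore the original bytes.
dest = filename.encode("utf-8")                     # filename to the kernel (bytes)

print(orig == dest)  # False
```

Once the invalid byte has been replaced, open() or os.unlink() called with dest would look for a different name than the one on disk.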

The goal is to get orig == dest. In my program Hachoir, to work around 
this problem, I store the original filename (bytes) and convert it to 
unicode with character replacement (e.g. replacing each invalid byte 
sequence with "?"). So the bytes string is used for open(), 
unlink(), ... and the unicode string is displayed on stdout for the 
user.
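
A rough sketch of that workaround, not Hachoir's actual code: keep the raw bytes for system calls and build a separate string for display only. The function names and the default "utf-8" charset are assumptions. Note that the "replace" error handler substitutes U+FFFD rather than "?" when decoding.

```python
import os

def display_name(raw, encoding="utf-8"):
    """Decode a raw filename for display, replacing invalid byte sequences."""
    return raw.decode(encoding, errors="replace")

def list_with_display(directory):
    """Return (raw_bytes, display_str) pairs for the entries of a directory.

    Passing a bytes path makes os.listdir() return bytes names, so the
    original names are kept unchanged for open(), os.unlink(), etc.
    """
    return [(raw, display_name(raw)) for raw in os.listdir(directory)]

if __name__ == "__main__":
    for raw, shown in list_with_display(b"."):
        # Show the readable form to the user; use raw for any file operation.
        print(shown)
```

Only the display string is lossy here; every file operation still receives the bytes exactly as the kernel returned them.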

IMHO, the best solution is to create a class like this:

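class Filename:
    """Pair the raw kernel filename (bytes) with a displayable str."""
    def __init__(self, orig):
        # orig: filename exactly as returned by the kernel (bytes)
        self.as_bytes = orig
        # myformat() is a placeholder for a lossy bytes-to-str conversion
        self.as_str = myformat(orig)
    def __str__(self):
        return self.as_str
    def __bytes__(self):
        return self.as_bytes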

New problems: I guess that functions operating on filenames 
(os.path.*) will have to support this new type (Filename class).
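
For illustration, a self-contained version of the class with a possible myformat(). The body of myformat(), the UTF-8 default and the sample path are assumptions. Since os.path functions know nothing about this type, the sketch converts to bytes explicitly before calling them.

```python
import os

def myformat(raw, encoding="utf-8"):
    # Hypothetical helper: one possible lossy conversion for display.
    return raw.decode(encoding, errors="replace")

class Filename:
    def __init__(self, orig):
        self.as_bytes = orig            # exact bytes, used for system calls
        self.as_str = myformat(orig)    # readable form, used for display
    def __str__(self):
        return self.as_str
    def __bytes__(self):
        return self.as_bytes

name = Filename(b"/tmp/caf\xe9.txt")    # assumed sample path

# os.path.* does not support Filename, so pass the bytes explicitly.
base = os.path.basename(bytes(name))

print(str(name))    # readable, possibly lossy
print(base)         # exact bytes of the last path component
```

The explicit bytes(name) calls are what the proposal would remove if os.path.* accepted Filename objects directly.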