
Author a.badger
Recipients a.badger, loewis, vstinner
Date 2008-10-02.14:32:21
Message-id <1222957943.28.0.508732032829.issue4006@psf.upfronthosting.co.za>
In-reply-to
Content
It's not a feature, it's a bug! :-)  (I hope you meant to have a smiley
too ;-) )

As stated in the os.listdir() related bug, on Unix filesystems filenames
are a sequence of bytes.  The system encoding allows the user-level
tools to display the filenames as characters instead of byte sequences
and allows you to manipulate the filenames using characters instead of
byte sequences.  But if you change your locale, the user-level tools will
interpret the same byte sequences as different characters, and nothing
stops you from creating files whose names are in a different encoding.
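
To make the point above concrete, here is a minimal sketch of one byte
sequence read under two encodings.  The filename bytes are a hypothetical
example, not taken from a real report:

```python
# Hypothetical filename bytes: the UTF-8 encoding of "café.txt".
raw_name = b"caf\xc3\xa9.txt"

# The bytes on disk never change; only the interpretation does.
as_utf8 = raw_name.decode("utf-8")      # what a UTF-8 locale would display
as_latin1 = raw_name.decode("latin-1")  # what a Latin-1 locale would display

print(repr(as_utf8))
print(repr(as_latin1))
```

Both results are valid strings, but they name different characters, so
neither can be called "the" filename without knowing the encoding.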

So in order to work correctly on Unix you must be able to accept byte
sequences in place of filenames.
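
As a rough illustration, assuming a current Python 3 where os.listdir()
returns bytes entries when it is given a bytes path, a caller can ask for
the raw names and skip decoding entirely.  The helper name
list_names_as_bytes is made up for this sketch:

```python
import os


def list_names_as_bytes(directory):
    """Return the entries of `directory` as raw bytes, with no decoding."""
    # A bytes path makes os.listdir() return bytes entries, so names that
    # are not valid in the current locale encoding are still listed.
    return os.listdir(os.fsencode(directory))
```

Every entry comes back as a bytes object, which can be passed straight
to open() or os.stat() without ever being converted to characters.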

The sad fact of the matter is that while we can be all Unicode with data
and strings inside of Python, we will always have to be prepared to
handle supposed strings as byte sequences when talking to some things
outside of ourselves.  Sometimes the border has a specification that
tells us what encoding to expect and we can do the conversion
automatically.  But when it doesn't, we have to be prepared to 1) tell
the user that the data exists even though it isn't the string type they
expected and 2) make the byte sequence available to the user.
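
The two requirements above could be sketched roughly as follows.  This
is only an illustration: the names DecodedName and describe are invented
here and are not part of any Python API, and the default encoding is an
assumption:

```python
from typing import NamedTuple, Optional


class DecodedName(NamedTuple):
    raw: bytes           # always present, so the data is never hidden
    text: Optional[str]  # None when the bytes do not decode


def describe(raw: bytes, encoding: str = "utf-8") -> DecodedName:
    """Decode `raw` if possible, but always keep the bytes available."""
    try:
        return DecodedName(raw, raw.decode(encoding))
    except UnicodeDecodeError:
        # 1) the caller can see that the entry exists (text is None), and
        # 2) the original byte sequence is still there in `raw`.
        return DecodedName(raw, None)
```

A caller that only wants strings can check for text being None and
report it, instead of never learning that the entry was there.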

Silently pretending that the data doesn't exist at all is a bug (maybe a
minor bug, depending on how often we expect the situation to arise, but
still a bug).
History
Date                 User      Action  Args
2008-10-02 14:32:23  a.badger  set     recipients: + a.badger, loewis, vstinner
2008-10-02 14:32:23  a.badger  set     messageid: <1222957943.28.0.508732032829.issue4006@psf.upfronthosting.co.za>
2008-10-02 14:32:22  a.badger  link    issue4006 messages
2008-10-02 14:32:21  a.badger  create