Author dlitz
Recipients HWJ, amaury.forgeotdarc, benjamin.peterson, dlitz, gvanrossum, loewis, pitrou, vstinner, zegreek
Date 2008-09-29.00:54:35
Message-id <1222649677.79.0.00352087354963.issue3187@psf.upfronthosting.co.za>
In-reply-to
Content
Martin,

Consider this scenario.  On ext3/Linux, assume that UTF-8 is specified
in the system locale.  What would happen if you have two files, named
b"\xf3\xb3\x83\x80\x00" and b"\xc0\x00"?  Under your proposal, the first
file would decode successfully as "\U000f30c0\x00", and the second file
would fail to decode, so it would be mapped to "\U000f30c0\x00", which
is the same string!
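
To make the collision concrete, here is a minimal sketch. It assumes the
proposal maps each undecodable byte 0xNN to the private-use code point
U+F30NN; that base value is inferred from the example above, and the
handler name "pua_escape_sketch" is made up for illustration, not part of
any real API.

```python
import codecs

# Assumption: undecodable byte 0xNN -> U+F30NN (inferred from the example
# in this message, not taken from an actual implementation).
PUA_BASE = 0xF3000


def _pua_escape(exc):
    """Error handler: replace each undecodable byte with a private-use char."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    replacement = "".join(chr(PUA_BASE + b) for b in bad)
    # Resume decoding right after the bytes that were replaced.
    return replacement, exc.end


# Hypothetical handler name, registered only for this sketch.
codecs.register_error("pua_escape_sketch", _pua_escape)


def decode_filename(raw):
    """Decode a filename as UTF-8, escaping invalid bytes as sketched above."""
    return raw.decode("utf-8", errors="pua_escape_sketch")


valid_name = b"\xf3\xb3\x83\x80\x00"   # valid UTF-8 for U+F30C0, then NUL
invalid_name = b"\xc0\x00"             # 0xC0 is never a valid UTF-8 byte

print(valid_name != invalid_name)                                    # distinct on disk
print(decode_filename(valid_name) == decode_filename(invalid_name))  # same in Python
```

Both byte strings come out as the same str, so code that only sees the
decoded names cannot tell the two files apart, and cannot know which
bytes to write when encoding the name back.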

Under your proposal, you could end up with multiple files having the
same filename (from Python's perspective). Python shouldn't break if
somebody deliberately created some weird filenames.  Your proposal would
make it impossible to write a robust remote backup tool in Python 3.

Pathnames on ext3/Linux *are not Unicode*.  Blindly pretending they're
Unicode is a leaky abstraction at best, and a security hole at worst.
History
Date                 User   Action  Args
2008-09-29 00:54:38  dlitz  set     recipients: + dlitz, gvanrossum, loewis, amaury.forgeotdarc, pitrou, vstinner, benjamin.peterson, HWJ, zegreek
2008-09-29 00:54:37  dlitz  set     messageid: <1222649677.79.0.00352087354963.issue3187@psf.upfronthosting.co.za>
2008-09-29 00:54:36  dlitz  link    issue3187 messages
2008-09-29 00:54:35  dlitz  create