Author vstinner
Recipients BreamoreBoy, anthonybaxter, brett.cannon, eric.araujo, ezio.melotti, kristjan.jonsson, loewis, nnorwitz, theller, vstinner
Date 2010-09-04.01:00:37
Message-id <1283562041.55.0.916372457027.issue1552880@psf.upfronthosting.co.za>
In-reply-to
Content
Oh, I didn't see that the issue was specific to Python2. I updated the issue's title. If I understood correctly, the issue is also specific to Windows.

Do you know if your patch changes the public API (i.e. breaks compatibility)?

--

FYI about Python3:

> That's an inventive way of breaking the unicode standard :)

It is described in PEP 383, and it solves a real and common problem: storing a filename that cannot be decoded with the filesystem encoding. The operation is reversible. In Python 3.2, there are os.fsdecode() and os.fsencode() functions. On UNIX/BSD, os.fsencode(os.fsdecode(x)) == x if x is a bytes object.

PEP 383 introduces the surrogateescape error handler, which creates surrogates on decode and converts those surrogates back to bytes on encode.
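
A minimal sketch of that behaviour, using the codec directly. UTF-8 is an assumption here (the real filesystem encoding depends on the locale), and the sample byte string is made up for illustration:

```python
# Illustrative only: UTF-8 is assumed; the actual filesystem
# encoding is locale-dependent.
raw = b"caf\xe9.txt"  # 0xE9 is not valid UTF-8 at this position

# Decoding: each undecodable byte 0xNN becomes the lone surrogate U+DCNN.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding: the surrogate is converted back to the original byte.
roundtrip = name.encode("utf-8", errors="surrogateescape")
```

Here the byte 0xE9 is carried through the str as U+DCE9, so the round trip returns the original bytes.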

> Anyway, why would you worry about that? My patch doesn't use
> "surrogateescape" so there is no problem.

In Python3, filenames are stored as unicode. On UNIX/BSD, if a filename cannot be decoded, its undecodable bytes are stored as surrogates. To get full unicode support in Python3, you have to support surrogates.
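
To illustrate what supporting surrogates involves: code that receives such a filename cannot assume a strict encode will succeed. A small sketch, with a hypothetical filename and UTF-8 assumed as the encoding:

```python
# Hypothetical filename, as surrogateescape would produce it
# from the bytes b"caf\xe9.txt" (UTF-8 assumed).
name = "caf\udce9.txt"

# A strict encode rejects the lone surrogate.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Using the same error handler restores the original bytes.
restored = name.encode("utf-8", errors="surrogateescape")
```

The strict encode fails, while the surrogateescape encode gives back the original byte string.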
History
Date User Action Args
2010-09-04 01:00:41  vstinner  set  recipients: + vstinner, loewis, nnorwitz, brett.cannon, anthonybaxter, theller, kristjan.jonsson, ezio.melotti, eric.araujo, BreamoreBoy
2010-09-04 01:00:41  vstinner  set  messageid: <1283562041.55.0.916372457027.issue1552880@psf.upfronthosting.co.za>
2010-09-04 01:00:38  vstinner  link  issue1552880 messages
2010-09-04 01:00:37  vstinner  create