Message 113352 - Python tracker



Author	vstinner
Recipients	lemburg, loewis, vstinner
Date	2010-08-08.23:56:45
Message-id	<1281311809.81.0.46043022196.issue9542@psf.upfronthosting.co.za>

Content
For my work on #9425 (Rewrite import machinery to work with unicode paths), I need a PyArg_Parse converter that accepts both bytes and str and converts them to str. PyUnicode_FSConverter() does the opposite: it encodes str to bytes.

To handle (input) filenames in a function, we have 3 choices:

 1/ use bytes: this is the current choice for most Python functions. It gives full Unicode support on POSIX OSes (where the filesystem uses a bytes API), but it is not enough on Windows (Windows uses the mbcs encoding, which covers only a very small subset of Unicode)
 2/ use str with PEP 383 (surrogateescape): this started to be used in Python 3.1, and more seriously in Python 3.2. It offers full Unicode support on all OSes (POSIX and Windows)
 3/ use the native type of each OS (bytes on POSIX, str on Windows): I dislike this solution because it implies code duplication
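
To illustrate why choice (2) loses nothing on POSIX, here is a minimal Python sketch of the PEP 383 round trip. The file name and the fixed "utf-8" codec are assumptions chosen for the example; real code would use the filesystem encoding.

```python
# A POSIX file name that is not valid UTF-8 (here: Latin-1 encoded).
# The name itself is made up for this example.
raw = b"caf\xe9.txt"

# With the surrogateescape error handler (PEP 383), each undecodable
# byte 0xNN is mapped to the lone surrogate U+DCNN instead of raising.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the surrogate back into the
# original byte, so the str form can be handed back to a bytes API.
restored = name.encode("utf-8", errors="surrogateescape")

print(ascii(name))
print(restored == raw)
```

The str value is not printable as-is because it contains a lone surrogate, but it still identifies exactly the same file.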

PyUnicode_FSConverter() is the converter for solution (1). PyUnicode_FSDecoder() will be the converter for solution (2).
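
The two directions can be sketched at the Python level as follows. This is only an illustration of the intended behaviour, not the C implementation: fs_encode() and fs_decode() are hypothetical helper names, and the fixed "utf-8" default stands in for the filesystem encoding.

```python
def fs_encode(path, encoding="utf-8"):
    """Accept bytes or str, always return bytes (the solution (1) direction).

    Hypothetical helper; the encoding default is an assumption.
    """
    if isinstance(path, bytes):
        return path
    if isinstance(path, str):
        return path.encode(encoding, errors="surrogateescape")
    raise TypeError("expected bytes or str, got %s" % type(path).__name__)


def fs_decode(path, encoding="utf-8"):
    """Accept bytes or str, always return str (the solution (2) direction).

    Hypothetical helper; the encoding default is an assumption.
    """
    if isinstance(path, str):
        return path
    if isinstance(path, bytes):
        return path.decode(encoding, errors="surrogateescape")
    raise TypeError("expected bytes or str, got %s" % type(path).__name__)


# Both input types end up as the same output type in each direction.
print(fs_encode("abc"), fs_encode(b"abc"))
print(fs_decode("abc"), fs_decode(b"abc"))
```

With such a pair, a function body can be written once against a single type, which is what avoids the code duplication of solution (3).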

History
Date	User	Action	Args
2010-08-08 23:56:50	vstinner	set	recipients: + vstinner, lemburg, loewis
2010-08-08 23:56:49	vstinner	set	messageid: <1281311809.81.0.46043022196.issue9542@psf.upfronthosting.co.za>
2010-08-08 23:56:47	vstinner	link	issue9542 messages
2010-08-08 23:56:46	vstinner	create