Author vstinner
Recipients Arfrever, ezio.melotti, gregory.p.smith, lemburg, loewis, vstinner
Date 2010-04-26.11:44:53
Message-id <201004261344.46779.victor.stinner@haypocalc.com>
In-reply-to <4BD575FF.6010203@egenix.com>
Content
> In real life applications, you do run into these problems quite
> often

Yes, I agree 100% with you :-)

> > Python3 prefers unicode, eg. print expects a unicode string, not a byte
> > string. I mean it's more practical to use unicode everywhere in Python,
> > and so fsencode()/fsdecode() can be really useful on POSIX systems.
> 
> Sure, but forcing UnicodeDecodeErrors upon Python3 programmers is
> not a good idea. Please keep that in mind.

I proposed to reject bytes on Windows because Martin (who knows Windows better 
than me) decided to *not* support byte strings on Windows. The native Windows 
API uses unicode, and conversion between bytes and unicode on Windows using 
"mbcs" is not reliable (it depends on the locale, and it may lose information).

http://mail.python.org/pipermail/python-dev/2010-April/099556.html
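
A minimal sketch of the information loss described above. The "mbcs" codec only exists on Windows, so this example assumes cp1252 as a stand-in for the ANSI code page, and uses errors="replace" to roughly imitate a silent substitution; the file name is made up for illustration.

```python
# Assumption: cp1252 stands in for the locale-dependent "mbcs" codec,
# which is only available on Windows.
ANSI_CODEC = "cp1252"

# Hypothetical file name mixing Latin and Japanese characters.
name = "r\u00e9sum\u00e9_\u65e5\u672c\u8a9e.txt"

# Characters missing from the code page are replaced by "?".
encoded = name.encode(ANSI_CODEC, errors="replace")
round_trip = encoded.decode(ANSI_CODEC)

print(encoded)             # the Japanese characters are gone
print(round_trip == name)  # the original name cannot be recovered
```

With a different code page the set of lost characters would differ, which is the locale dependency mentioned above.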

Rejecting byte strings on Windows is just a suggestion. To support byte strings 
on Windows, each Python function written in C would have to be fixed to use the 
ANSI version instead of the Wide version (eg. CreateProcessA instead of 
CreateProcessW) if it gets byte arguments. The code would become twice as big, 
and it would introduce new issues: which function should be chosen if there are 
two arguments, one a byte string and the other a unicode string? 
_subprocess.CreateProcess has 9 arguments...
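
The ambiguity can be sketched in Python. The helper below is hypothetical (it is not part of CPython): it only illustrates that a wrapper choosing between an ANSI ("A") and a Wide ("W") function has no obvious answer when string types are mixed, so this sketch simply refuses.

```python
def pick_api_variant(*args):
    """Return "A" or "W" for a hypothetical Windows API wrapper.

    Non-string arguments (None, integers, ...) are ignored.
    """
    has_bytes = any(isinstance(a, bytes) for a in args)
    has_str = any(isinstance(a, str) for a in args)
    if has_bytes and has_str:
        # No good choice: either the bytes must be decoded
        # or the unicode strings must be encoded.
        raise TypeError("cannot mix byte strings and unicode strings")
    return "A" if has_bytes else "W"


print(pick_api_variant("app.exe", "app.exe --help", None))    # W
print(pick_api_variant(b"app.exe", b"app.exe --help", None))  # A
```

With many string arguments, every call site would need this kind of check in addition to two code paths.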

Since unicode is a superset of MBCS and MBCS has subtle bugs, it's preferable 
to use (force) unicode.

--

But on POSIX, it's the opposite: I'm doing my best to support byte strings 
everywhere (filenames, environment variables, etc.). See the dependency list 
of my "meta" issue #8242.

The first goal of fsencode() is to accept byte strings on POSIX systems. 
Maybe I didn't explain it correctly.
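
A minimal sketch of that goal, not the actual patch: fsencode_sketch() and fsdecode_sketch() are hypothetical names, and the encoding is fixed to UTF-8 with the surrogateescape error handler (PEP 383) instead of being taken from the locale.

```python
FS_ENCODING = "utf-8"  # assumption: a real version would use the locale


def fsencode_sketch(filename):
    """Return the filename as bytes; byte strings pass through untouched."""
    if isinstance(filename, bytes):
        return filename
    if isinstance(filename, str):
        return filename.encode(FS_ENCODING, "surrogateescape")
    raise TypeError("expected str or bytes, got %s" % type(filename).__name__)


def fsdecode_sketch(filename):
    """Return the filename as str; undecodable bytes become lone surrogates."""
    if isinstance(filename, str):
        return filename
    if isinstance(filename, bytes):
        return filename.decode(FS_ENCODING, "surrogateescape")
    raise TypeError("expected str or bytes, got %s" % type(filename).__name__)


raw = b"caf\xe9.txt"  # Latin-1 bytes, invalid as UTF-8
name = fsdecode_sketch(raw)
print(ascii(name))                   # 'caf\udce9.txt'
print(fsencode_sketch(name) == raw)  # True: the round trip is lossless
```

Because undecodable bytes survive the round trip, no UnicodeDecodeError is forced on the caller for such file names.
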
History
Date                 User      Action  Args
2010-04-26 11:44:56  vstinner  set     recipients: + vstinner, lemburg, loewis, gregory.p.smith, ezio.melotti, Arfrever
2010-04-26 11:44:54  vstinner  link    issue8514 messages
2010-04-26 11:44:53  vstinner  create