Message 150053 - Python tracker

Author	pitrou
Recipients	benjamin.peterson, gz, pitrou, poolie, r.david.murray, vila, vstinner
Date	2011-12-21.23:26:11
Message-id	<1324509930.3667.5.camel@localhost.localdomain>
In-reply-to	<CAA9uavDwz1NR1NRiviJBUdS6d+N7YyrnkKUYg_6-9oVeX-K06g@mail.gmail.com>

Content
> It is a de facto, not de jure standard: UTF-8 is how things are
> typically stored.  Other software (eg gnome file handling utilities)
> makes this assumption.  See eg
> <http://www.cl.cam.ac.uk/~mgk25/unicode.html#linux>.

So should we specifically detect Linux? And under which conditions? When
the encoding is detected to be "ASCII"?
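To make the question concrete, here is a minimal sketch of the kind of heuristic being asked about: fall back to UTF-8 only when the platform is Linux and the locale encoding normalizes to ASCII. The function name `pick_filename_encoding` and its parameters are hypothetical, invented for illustration; this is not what Python does or what anyone in this issue has proposed verbatim.

```python
import codecs


def pick_filename_encoding(platform, locale_encoding):
    """Hypothetical heuristic: prefer UTF-8 on Linux when the locale says ASCII.

    `platform` is a value like sys.platform; `locale_encoding` is the name
    reported by the locale (for example via locale.getpreferredencoding()).
    """
    try:
        # Normalize aliases such as "ANSI_X3.4-1968" or "US-ASCII" to "ascii".
        normalized = codecs.lookup(locale_encoding).name
    except LookupError:
        # Unknown encoding name: do not second-guess it here.
        normalized = None

    # startswith() covers both "linux" and the older "linux2" value.
    if platform.startswith("linux") and normalized == "ascii":
        return "utf-8"
    return locale_encoding
```

In real use the arguments would be something like `sys.platform` and `locale.getpreferredencoding()`. Note that the sketch cannot tell a user who deliberately chose an ASCII locale from one who simply has no locale configured.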

> But in Unix
> there are no ultimate authorities: even if someone announced filenames
> are utf-8 there will obviously continue to be many machines where in
> practice they are not.

POSIX is kind of an authority. Freedesktop.org could be another. LSB yet
another.
(all with different scopes obviously)

> I'm not sure what you expect a technical solution at the OS level
> would look like.

It doesn't need to be technical. It could just be a convention (all
filesystem paths, and other user-visible text such as environment
variables etc., are utf-8 encoded).
Although enforcing it technically would of course be safer.

> That is probably worth doing.  But having no locale can still happen,
> and I think Python could handle that better, so the changes are
> complementary.

How do you detect "no locale"?
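One possible reading of "no locale", sketched below, is that none of the POSIX locale environment variables selects anything other than the default "C"/"POSIX" locale for character handling. The helper names `effective_ctype_locale` and `has_no_locale` are hypothetical, and this is only an assumption about what "no locale" might mean, not an answer given in this thread.

```python
import os


def effective_ctype_locale(environ):
    """Return the locale name that governs LC_CTYPE, or None if unset.

    Follows the POSIX precedence LC_ALL > LC_CTYPE > LANG; an empty
    value is treated the same as an unset variable.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(name)
        if value:
            return value
    return None


def has_no_locale(environ):
    """Hypothetical test: True when only the default C/POSIX locale applies."""
    value = effective_ctype_locale(environ)
    return value is None or value in ("C", "POSIX")


if __name__ == "__main__":
    print(has_no_locale(os.environ))
```

This check cannot distinguish an explicit `LANG=C` from a missing configuration, which is part of why the question is hard to answer in general.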

History
Date	User	Action	Args
2011-12-21 23:26:12	pitrou	set	recipients: + pitrou, vstinner, vila, benjamin.peterson, r.david.murray, gz, poolie
2011-12-21 23:26:11	pitrou	link	issue13643 messages
2011-12-21 23:26:11	pitrou	create