Author pitrou
Recipients benjamin.peterson, gz, pitrou, poolie, r.david.murray, vila, vstinner
Date 2011-12-21.23:26:11
Message-id <1324509930.3667.5.camel@localhost.localdomain>
In-reply-to <CAA9uavDwz1NR1NRiviJBUdS6d+N7YyrnkKUYg_6-9oVeX-K06g@mail.gmail.com>
Content
> It is a de facto, not de jure standard: UTF-8 is how things are
> typically stored.  Other software (eg gnome file handling utilities)
> makes this assumption.  See eg
> <http://www.cl.cam.ac.uk/~mgk25/unicode.html#linux>.

So should we specifically detect Linux? And under which conditions? When
the encoding is detected to be "ASCII"?
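To make the question concrete, here is a rough sketch of what such a check might look like. The function names are hypothetical, and whether this heuristic is desirable at all is exactly what is being debated; it only shows that "Linux, and the detected encoding is ASCII" is cheap to test for. Note that what `sys.getfilesystemencoding()` reports under the C locale differs between Python versions.

```python
import codecs
import sys


def looks_like_ascii_fallback(encoding=None):
    """Return True if the given (or detected) filesystem encoding is ASCII.

    codecs.lookup() normalizes aliases such as "ANSI_X3.4-1968" or
    "US-ASCII" to the canonical codec name "ascii".
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    return codecs.lookup(encoding).name == "ascii"


def might_assume_utf8(platform=None, encoding=None):
    """Hypothetical heuristic: on Linux, treat an ASCII result as a
    candidate for assuming UTF-8 instead. This is an assumption made for
    illustration, not existing interpreter behaviour."""
    if platform is None:
        platform = sys.platform
    return platform.startswith("linux") and looks_like_ascii_fallback(encoding)
```

The optional arguments exist only so the check can be exercised without changing the real environment.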

> But in Unix
> there are no ultimate authorities: even if someone announced filenames
> are utf-8 there will obviously continue to be many machines where in
> practice they are not.

POSIX is kind of an authority. Freedesktop.org could be another. LSB yet
another.
(all with different scopes obviously)

> I'm not sure what you expect a technical solution at the OS level
> would look like.

It doesn't need to be technical. It could just be a convention (all
filesystem paths, and other user-visible text such as environment
variables, etc., are UTF-8 encoded).
Although enforcing it technically would of course be safer.

> That is probably worth doing.  But having no locale can still happen,
> and I think Python could handle that better, so the changes are
> complementary.

How do you detect "no locale"?
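One conceivable answer, sketched under assumptions: treat "no locale" as meaning that none of LC_ALL, LC_CTYPE or LANG is set to a non-empty value, or that the one taking effect is "C" or "POSIX". The helper names below are hypothetical, and this looks only at environment variables, so it cannot tell a deliberately chosen C locale from a missing configuration.

```python
import os


def effective_ctype_locale(environ=None):
    """Return the locale name that would apply to LC_CTYPE, or None.

    Follows the POSIX precedence LC_ALL > LC_CTYPE > LANG; a variable
    set to the empty string is treated as unset.
    """
    if environ is None:
        environ = os.environ
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(name)
        if value:
            return value
    return None


def has_no_locale(environ=None):
    """Hypothetical definition of "no locale": nothing is configured,
    or the effective setting is the default "C"/"POSIX" locale."""
    value = effective_ctype_locale(environ)
    return value is None or value in ("C", "POSIX")
```

Passing a plain dict instead of `os.environ` makes the different cases easy to try out.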
History
Date                 User    Action  Args
2011-12-21 23:26:12  pitrou  set     recipients: + pitrou, vstinner, vila, benjamin.peterson, r.david.murray, gz, poolie
2011-12-21 23:26:11  pitrou  link    issue13643 messages
2011-12-21 23:26:11  pitrou  create