Author pitrou
Recipients Arfrever, mrabarnett, pitrou, r.david.murray, tchrist, terry.reedy
Date 2011-08-13.20:26:20
> Here's why I say that Python uses UTF-16 not UCS-2 on its narrow builds.
> Perhaps someone could tell me why the Python documentation says it uses
> UCS-2 on a narrow build.

There's a disagreement on that point between several developers. See an example sub-thread at:

> Since you are already using a variable-width encoding, why the
> supercilious attitude toward UTF-8?

I think you are reading too much into these decisions. It's simply that no one took the time to write an alternative implementation and demonstrate its superiority. I also believe the original implementation was UCS-2, and surrogate support was added progressively over the years. Hence the terminological mess and the ad hoc semantics.
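
To make the UCS-2 versus UTF-16 distinction concrete, here is a minimal sketch (not taken from the CPython source; the helper name `utf16_code_units` is made up for illustration). It lists the 16-bit code units UTF-16 uses for a string. A character inside the BMP takes one unit, which is all that strict UCS-2 can represent, while a character outside the BMP takes a surrogate pair.

```python
import struct

def utf16_code_units(text):
    """Return the 16-bit code units that UTF-16 uses to encode text.

    Hypothetical helper for illustration; it is not part of any Python API.
    """
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

# U+00E9 lies inside the BMP: one code unit, identical in UCS-2 and UTF-16.
bmp = utf16_code_units("\u00e9")

# U+1D11E lies outside the BMP: UTF-16 needs a high and a low surrogate.
astral = utf16_code_units("\U0001D11E")

print([hex(u) for u in bmp])     # ['0xe9']
print([hex(u) for u in astral])  # ['0xd834', '0xdd1e']
```

An implementation that stores such pairs but counts and indexes by code unit sits somewhere between the two labels, which is roughly the mess described above.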

I agree that going with UTF-8 and a clever indexing scheme would be a better solution.
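
As a rough illustration of what such an indexing scheme might look like (purely a sketch under my own assumptions, not a proposal from this thread; the class name `IndexedUTF8` and the `stride` parameter are hypothetical), one could keep the UTF-8 bytes plus a sparse table of byte offsets, one entry every `stride` code points. Indexing then jumps to the nearest recorded offset and scans forward over at most `stride - 1` code points.

```python
class IndexedUTF8:
    """Hypothetical UTF-8 string with a sparse code point -> byte offset index."""

    def __init__(self, text, stride=4):
        self.data = text.encode("utf-8")
        self.stride = stride
        # offsets[k] is the byte offset of code point number k * stride.
        self.offsets = []
        count = 0
        for pos, byte in enumerate(self.data):
            # Any byte that is not a continuation byte (10xxxxxx)
            # starts a new code point.
            if byte & 0xC0 != 0x80:
                if count % stride == 0:
                    self.offsets.append(pos)
                count += 1
        self.length = count

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        if not 0 <= i < self.length:
            raise IndexError(i)
        # Jump to the closest recorded offset at or before code point i.
        pos = self.offsets[i // self.stride]
        # Walk forward over the remaining code points.
        for _ in range(i % self.stride):
            pos += 1
            while self.data[pos] & 0xC0 == 0x80:
                pos += 1
        # Find the end of the code point that starts at pos.
        end = pos + 1
        while end < len(self.data) and self.data[end] & 0xC0 == 0x80:
            end += 1
        return self.data[pos:end].decode("utf-8")


sample = "a\u00e9\U0001D11Ez\u20acb"
s = IndexedUTF8(sample, stride=2)
print(len(s), [s[i] for i in range(len(s))] == list(sample))
```

The stride trades memory for lookup time: a smaller stride means a larger offset table but a shorter forward scan per index operation.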
Date User Action Args
2011-08-13 20:26:21 pitrou set recipients: + pitrou, terry.reedy, mrabarnett, Arfrever, r.david.murray, tchrist
2011-08-13 20:26:21 pitrou set messageid: <>
2011-08-13 20:26:21 pitrou link issue12729 messages
2011-08-13 20:26:20 pitrou create