Author pitrou
Recipients Arfrever, ezio.melotti, jkloth, mrabarnett, pitrou, r.david.murray, tchrist, terry.reedy
Date 2011-08-14.17:36:41
Message-id <1313343278.3574.15.camel@localhost.localdomain>
In-reply-to <25916.1313340867@chthon>
Content
> > The UTF-8 codec described by RFC 2279 didn't say so, so, since our
> > codec was following RFC 2279, it was producing valid UTF-8.  With RFC
> > 3629 a number of things changed in a non-backward compatible way.
> > Therefore we couldn't just change the behavior of the UTF-8 codec nor
> > rename it to something else in Python 2.  We had to wait till Python 3
> > in order to fix it.
> 
> I'm a bit confused on this.  You no longer fix bugs in Python 2?

In general, we try not to introduce changes that have a high probability
of breaking existing code, especially when what is being "fixed" is a
minor issue which almost nobody complains about.

This is even truer for stable branches, and Python 2 is very much a
stable branch now (no more feature releases after 2.7).

> That's why I say that you are out of conformance by having encoders and decoders of UTF
> streams tolerate noncharacters.  You are not allowed to call something a UTF and do
> non-UTF things with it, because this is in violation of conformance requirement C2.

Perhaps, but it is not Python's fault if the IETF and the Unicode
consortium have disagreed on what UTF-8 should be. I'm not sure what
people called "UTF-8" when support for it was first introduced in
Python, but you can't blame us for maintaining a consistent behaviour
across releases.
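
To make the behaviour under discussion concrete, here is a minimal sketch,
assuming a Python 3 interpreter. It only probes the built-in "utf-8" codec;
the helper name try_encode is hypothetical and exists just for this example.

```python
# Sketch only: probes the built-in "utf-8" codec on Python 3.
lone_surrogate = "\ud800"   # surrogate code point, excluded from UTF-8 by RFC 3629
noncharacter = "\uffff"     # a Unicode noncharacter


def try_encode(text, errors="strict"):
    """Hypothetical helper: return the UTF-8 bytes, or None if the codec refuses."""
    try:
        return text.encode("utf-8", errors)
    except UnicodeEncodeError:
        return None


# The strict codec refuses a lone surrogate.
strict_surrogate = try_encode(lone_surrogate)
# The "surrogatepass" error handler opts back into the lenient behaviour.
lenient_surrogate = try_encode(lone_surrogate, "surrogatepass")
# Noncharacters are tolerated by the strict codec.
encoded_nonchar = try_encode(noncharacter)

print(strict_surrogate, lenient_surrogate, encoded_nonchar)
```

The strict/lenient split is the point: the default refuses lone surrogates,
while code that relies on the older, more permissive behaviour has to ask for
it explicitly through an error handler.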
History
Date User Action Args
2011-08-14 17:36:42 pitrou set recipients: + pitrou, terry.reedy, jkloth, ezio.melotti, mrabarnett, Arfrever, r.david.murray, tchrist
2011-08-14 17:36:42 pitrou link issue12729 messages
2011-08-14 17:36:41 pitrou create