
Author Artoria2e5
Recipients Artoria2e5, ezio.melotti, lemburg, loewis, malin, vstinner
Date 2016-10-03.03:50:56
Message-id <1475466657.29.0.442621050122.issue24036@psf.upfronthosting.co.za>
Content
> Advice for final user:

This seems worth adding to the codecs doc as a footnote. Perhaps something like "(deprecated) ... gb2312 is an obsolete encoding from the 1980s. Use gbk or gb18030 instead." will do.
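To illustrate why such a footnote would help, here is a minimal sketch, assuming CPython's bundled gb2312, gbk and gb18030 codecs. The helper name `can_encode` and the sample strings are mine, chosen only for illustration. It checks which codecs can represent a character that was added after GB2312-80.

```python
def can_encode(text, codec):
    """Return True if `text` can be encoded with `codec` without errors."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


# "中文" is in GB2312-80; "镕" is a character that GB2312-80 lacks
# but the later GBK and GB18030 repertoires include.
samples = ["中文", "镕"]

for codec in ("gb2312", "gbk", "gb18030"):
    for text in samples:
        print(codec, text, can_encode(text, codec))
```

Text that does fit in gb2312 should generally encode to the same bytes under gbk and gb18030 (apart from the few disputed code points this issue is about), so switching codecs should not disturb most existing data.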

> libiconv-1.14 is also using the wrong version.

Just a side note on whether libiconv is right or wrong: I have reported the GB18030 incompatibility as a libiconv bug.[1] From the replies, I learnt that 1) the mapping libiconv currently uses was the official one published on ftp.unicode.org at the time; 2) vendor implementations of gb2312 have differed historically. I have updated the corresponding section[2] on Wikipedia to include these old references.
  [1]: https://lists.gnu.org/archive/html/bug-gnu-libiconv/2016-09/msg00004.html
  [2]: https://en.wikipedia.org/wiki/GB_2312#Two_implementations_of_GB2312

Still, being old and common does not necessarily mean being correct, as Ma Lin has demonstrated by showing the character semantics. To back this up more firmly, I have added the names that GB2312-80 gives the glyphs in question to [2].
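For anyone who wants to see the disagreement on their own build, a rough sketch along these lines may help. It assumes only Python's built-in codecs; the function name `mapping_differences` is hypothetical. It walks the two-byte GB2312 range and lists every byte pair that the two codecs decode to different code points.

```python
def mapping_differences(codec_a="gb2312", codec_b="gbk"):
    """List (bytes, text_a, text_b) for byte pairs the two codecs decode differently."""
    diffs = []
    for lead in range(0xA1, 0xF8):        # GB2312 lead bytes 0xA1..0xF7
        for trail in range(0xA1, 0xFF):   # GB2312 trail bytes 0xA1..0xFE
            raw = bytes([lead, trail])
            try:
                a = raw.decode(codec_a)
                b = raw.decode(codec_b)
            except UnicodeDecodeError:
                # Skip pairs that either codec leaves unassigned.
                continue
            if a != b:
                diffs.append((raw, a, b))
    return diffs


def code_points(text):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


for raw, a, b in mapping_differences():
    print(raw.hex().upper(), "gb2312:", code_points(a), "gbk:", code_points(b))
```

The output depends on the mapping tables shipped with the interpreter, so I am not quoting expected results here.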
History
Date                 User        Action  Args
2016-10-03 03:50:57  Artoria2e5  set     recipients: + Artoria2e5, lemburg, loewis, vstinner, ezio.melotti, malin
2016-10-03 03:50:57  Artoria2e5  set     messageid: <1475466657.29.0.442621050122.issue24036@psf.upfronthosting.co.za>
2016-10-03 03:50:57  Artoria2e5  link    issue24036 messages
2016-10-03 03:50:56  Artoria2e5  create