Message 274227 - Python tracker

Author	vstinner
Recipients	vstinner
Date	2016-09-02.10:57:59
Message-id	<1472813879.44.0.998977479164.issue27940@psf.upfronthosting.co.za>

Content
The ElementTree module (xml.etree) omits the XML declaration for the "utf-8" and "us-ascii" encoding names, but not for the "ascii" encoding name.
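A minimal sketch to observe this behavior. The helper name has_declaration is hypothetical and not part of xml.etree; it only checks whether the serialized bytes start with an XML declaration. According to the report above, "utf-8" and "us-ascii" should print False and "ascii" should print True, but the result may depend on the Python version.

```python
import xml.etree.ElementTree as ET


def has_declaration(encoding):
    """Return True if serializing with *encoding* emits an XML declaration.

    Hypothetical helper for illustration only.
    """
    data = ET.tostring(ET.Element("root"), encoding=encoding)
    return data.startswith(b"<?xml")


# Compare the spellings discussed in the report.
for name in ("utf-8", "us-ascii", "ascii"):
    print(name, has_declaration(name))
```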

The attached patch omits the XML declaration for the "ascii" codec as well, since ASCII is a subset of UTF-8 and UTF-8 is the default encoding of XML.

The patch also normalizes the encoding name to handle aliases like "utf8" (UTF-8) or "us_ascii" (ASCII).
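One way such normalization could look, as a sketch only: this is not the attached patch, and the function names normalized_name and declaration_is_redundant are assumptions made for illustration. It relies on codecs.lookup(), which resolves aliases to a canonical codec name.

```python
import codecs


def normalized_name(encoding):
    """Return the canonical codec name, e.g. "utf8" -> "utf-8".

    Hypothetical helper. Names that are not real codecs (such as
    ElementTree's special "unicode") are returned lowercased as-is.
    """
    try:
        return codecs.lookup(encoding).name
    except LookupError:
        return encoding.lower()


def declaration_is_redundant(encoding):
    """True when the encoding is UTF-8 or ASCII under any alias."""
    return normalized_name(encoding) in ("utf-8", "ascii")


for name in ("utf8", "UTF-8", "us_ascii", "us-ascii", "ascii", "latin-1"):
    print(name, normalized_name(name), declaration_is_redundant(name))
```

Comparing canonical names instead of raw strings means every alias of the same codec is treated the same way, which is the inconsistency the report describes.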

The patch adds unit tests.

--

By the way, I'm surprised that the special encoding "unicode" relies on the *current* locale encoding when the XML declaration is requested. Why not always use UTF-8 for *unicode* instead of the locale encoding?

My unit test covers several different locale encodings.
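To check which encoding name ends up in the declaration for the special "unicode" encoding, something like the following sketch can be used. The helper name declared_encoding_for_unicode is hypothetical; the printed name may differ between Python versions, platforms, and locale settings, so no expected output is given.

```python
import io
import locale
import re
import xml.etree.ElementTree as ET


def declared_encoding_for_unicode():
    """Serialize to text with a forced declaration and extract its encoding.

    Hypothetical helper for illustration only.
    """
    buf = io.StringIO()
    tree = ET.ElementTree(ET.Element("root"))
    tree.write(buf, encoding="unicode", xml_declaration=True)
    first_line = buf.getvalue().splitlines()[0]
    match = re.search(r"encoding='([^']+)'", first_line)
    return match.group(1) if match else None


print("declared:", declared_encoding_for_unicode())
print("locale:  ", locale.getpreferredencoding(False))
```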
