
Author Romuald
Recipients Romuald
Date 2021-04-02.11:03:35
Message-id <1617361415.63.0.134050826709.issue43703@roundup.psfhosted.org>
In-reply-to
Content
The Python XML parser (xml.etree) does not seem to accept control characters that are invalid in XML 1.0 but valid in XML 1.1 [1] [2], even when the document declares version 1.1.
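
For reference, the difference between the two versions can be sketched as below. The ranges follow the Char productions of the XML 1.0 and XML 1.1 specifications; the helper names is_xml10_char and is_xml11_char are hypothetical and are not part of xml.etree.

```python
def is_xml10_char(cp: int) -> bool:
    """XML 1.0 Char: below U+0020 only TAB, LF and CR are allowed."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml11_char(cp: int) -> bool:
    """XML 1.1 Char: U+0001 upward, minus surrogates, U+FFFE and U+FFFF."""
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# U+0019 is the character used in the sample from this report.
print(is_xml10_char(0x19), is_xml11_char(0x19))
```

Note that XML 1.1 still restricts most control characters (such as U+0019) to character references like &#x19;; they may not appear literally in the document.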


Consider the following sample:


import xml.etree.ElementTree as ET

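# &#x19; (U+0019) is a control character: not allowed in XML 1.0,
# but allowed as a character reference in XML 1.1.
bad = '<?xml version="1.1"?><foo>bar &#x19; baz</foo>'
print(ET.fromstring(bad))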


The parser raises the following error:
ParseError: reference to invalid character number: line 1, column 30
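
A minimal sketch of how the failure can be observed programmatically, assuming the behaviour reported above. It parses the same document declared as XML 1.0 and as XML 1.1 and returns the position carried by the ParseError. The helper name try_parse is hypothetical.

```python
import xml.etree.ElementTree as ET


def try_parse(version: str):
    """Return None on success, or the (line, column) where parsing failed."""
    doc = f'<?xml version="{version}"?><foo>bar &#x19; baz</foo>'
    try:
        ET.fromstring(doc)
    except ET.ParseError as err:
        # ParseError exposes the failure location as a (line, column) tuple.
        return err.position
    return None


for version in ("1.0", "1.1"):
    print(version, try_parse(version))
```

If the declared version were taken into account, only the "1.0" document would be expected to fail.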



[1] https://www.w3.org/TR/xml11/Overview.html#charsets
[2] https://www.w3.org/TR/xml11/Overview.html#sec-xml11
History
Date                 User     Action  Args
2021-04-02 11:03:35  Romuald  set     recipients: + Romuald
2021-04-02 11:03:35  Romuald  set     messageid: <1617361415.63.0.134050826709.issue43703@roundup.psfhosted.org>
2021-04-02 11:03:35  Romuald  link    issue43703 messages
2021-04-02 11:03:35  Romuald  create