Title: Expat sax parser silently ignores the InputSource protocol
msg62901 - (view) Author: Yitz Gale (ygale) Date: 2008-02-24 14:03
The expat sax parser in xml.sax.expatreader
does not fully support the InputSource protocol
in xml.sax.xmlreader. It only accepts
byte streams. It ignores the encoding
indicated in the InputSource object and
only uses the encoding read from
the XML, or defaults to UTF-8.

Rather than silently doing the wrong thing,
it should raise an error when fed a character stream,
or when given an encoding via the InputSource.

And most importantly, these limitations should be mentioned
in the documentation.
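
For illustration, here is a minimal sketch of the two stream kinds the InputSource protocol allows. The names TextCollector, collect_text, byte_source and char_source are hypothetical helpers made up for this example. It assumes an interpreter that already includes the character stream support added later in this issue; on the versions this report was filed against, the character stream case is the one that is not honoured. The encoding case (setEncoding) is left out.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class TextCollector(ContentHandler):
    """Hypothetical handler that accumulates character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # expat may deliver text in several pieces, so collect them all.
        self.chunks.append(content)


def collect_text(source):
    """Parse an InputSource with the default SAX parser and return its text."""
    handler = TextCollector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.parse(source)
    return "".join(handler.chunks)


def byte_source(data):
    """Wrap bytes in an InputSource that exposes a byte stream."""
    source = InputSource()
    source.setByteStream(io.BytesIO(data))
    return source


def char_source(text):
    """Wrap str in an InputSource that exposes a character stream."""
    source = InputSource()
    source.setCharacterStream(io.StringIO(text))
    return source


DOC = "<doc>caf\u00e9</doc>"
from_bytes = collect_text(byte_source(DOC.encode("utf-8")))
from_chars = collect_text(char_source(DOC))
print(from_bytes == from_chars)
```

Both sources carry the same document, so a parser that supports the full protocol should report the same text for each.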
msg62903 - (view) Author: Yitz Gale (ygale) Date: 2008-02-24 14:09
See also: #1483 and #2174.
msg116975 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-09-20 21:23
As nobody appears to be interested I'll close this in a couple of weeks unless someone objects.
msg116984 - (view) Author: Yitz Gale (ygale) Date: 2010-09-20 21:46
Perhaps more people would be interested if
you raise the priority. This bug can cause
serious data corruption, or even crashes.
It should also be tagged as "easy".

An alternative would be to remove the expat
sax parser from the libraries, since we don't
support it. But that seems a little extreme.
msg117170 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-09-23 06:45
I'll have a look.
msg181383 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-02-04 19:46
Here is a patch that makes xml.sax.xmlreader and related utilities support character streams. A lot of new tests are added (including Yitz Gale's tests from issue1483). Some old tests are fixed (they used a text stream as a byte stream, which doesn't work in the general case).
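
As a rough sketch of the intended behaviour, assuming an interpreter that includes this change: xml.sax.saxutils.prepare_input_source() should file a text stream under the character stream slot of the resulting InputSource and a binary stream under the byte stream slot. The variable names are only for illustration.

```python
import io
from xml.sax.saxutils import prepare_input_source

# A file-like object yielding str is expected to become a character stream.
text_source = prepare_input_source(io.StringIO("<doc/>"))

# A file-like object yielding bytes is expected to become a byte stream.
binary_source = prepare_input_source(io.BytesIO(b"<doc/>"))

print(text_source.getCharacterStream() is not None)
print(binary_source.getByteStream() is not None)
```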
msg182055 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-02-13 17:50
This patch is rather complicated, and I doubt it is necessary to apply it to older versions. Can anyone review it?
msg231555 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-23 12:11
msg239311 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-26 07:26
Updated to the tip, added a whatsnew entry and fixed the documentation.

Which parts of this patch, besides the tests, are worth applying to maintained releases?
msg239936 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-04-02 18:01
New changeset 84d49ad9109b by Serhiy Storchaka in branch '2.7':
Issue #2175: Added tests for xml.sax.saxutils.prepare_input_source().

New changeset fa47897e7889 by Serhiy Storchaka in branch '3.4':
Issue #2175: Added tests for xml.sax.saxutils.prepare_input_source().

New changeset e0292b3ba245 by Serhiy Storchaka in branch 'default':
Issue #2175: Added tests for xml.sax.saxutils.prepare_input_source().

New changeset 407883c52bf3 by Serhiy Storchaka in branch 'default':
Issue #2175: SAX parsers now support a character stream of InputSource object.
