classification
Title: Parameter type error for xml.sax.parseString(string, ...)
Type: enhancement Stage: resolved
Components: Library (Lib), XML Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: 2175 17089 Superseder:
Assigned To: serhiy.storchaka Nosy List: Thomas.Ryan, christian.heimes, eli.bendersky, ezio.melotti, georg.brandl, loewis, python-dev, serhiy.storchaka, terry.reedy, tshepang, ygale
Priority: normal Keywords: patch

Created on 2010-11-30 19:37 by Thomas.Ryan, last changed 2015-04-04 07:14 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
sax_parse_3.patch serhiy.storchaka, 2015-04-02 20:36
Messages (11)
msg122933 - Author: Thomas Ryan (Thomas.Ryan) Date: 2010-11-30 19:37
In 3.1.3, 3.1.2, maybe earlier...

xml.sax.parseString(string, handler, error_handler=handler.ErrorHandler())

Source code requires bytes, not a string as implied by function name and by the documentation.

Exception thrown for strings.

Since the name includes "string" the source should probably be fixed.
Or at least update the documentation.

Someday replace/augment parseString() with parseBytes()?
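
A minimal sketch of the call being reported, for illustration only. The handler class `NameCollector` and the helper `element_names` are hypothetical names, not part of xml.sax. On the 3.1 versions named above the str call is the one that raises; the changeset recorded at the end of this issue states that parseString() supports a string argument from 3.5 on.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class NameCollector(ContentHandler):
    """Hypothetical handler that records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def element_names(source):
    """Parse `source` with xml.sax.parseString() and return the element names."""
    handler = NameCollector()
    xml.sax.parseString(source, handler)
    return handler.names


# bytes input: the form the original source code required
names_from_bytes = element_names(b"<root><child/></root>")

# str input: the form the function name suggests; this is the call the
# report says raised an exception before the fix
names_from_text = element_names("<root><child/></root>")
```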
msg180045 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-15 19:55
Indeed, xml.dom.minidom.parseString() and xml.etree.ElementTree.fromstring() accept both bytes and strings, and xml.dom.minidom.parse(), xml.etree.ElementTree.parse() and even xml.sax.parse() accept both byte and text streams. Only xml.sax.parseString() rejects strings, contrary to its name. This looks like a 2-to-3 porting bug.
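
For comparison, a short sketch of the two string-level functions mentioned above being given the same document as str and as bytes. The helper `root_tags` and the constants are hypothetical names used only for this example.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

DOC_TEXT = "<root><child/></root>"
DOC_BYTES = DOC_TEXT.encode("utf-8")


def root_tags(source):
    """Return the root tag as seen by minidom and by ElementTree."""
    dom = xml.dom.minidom.parseString(source)
    tree_root = ET.fromstring(source)
    return dom.documentElement.tagName, tree_root.tag


# Both libraries take either type and report the same root element.
tags_from_text = root_tags(DOC_TEXT)
tags_from_bytes = root_tags(DOC_BYTES)
```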
msg180141 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-17 16:50
Here is a patch which fixes this issue and a couple of related issues: issue1483, issue2174, issue2175, issue10590.
msg183408 - Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-03-03 22:26
I'm not very knowledgeable in other XML modules, but I hate to see this patch linger. Also it's a pre-requisite for #16986, it seems.

Serhiy, since the patch is large could you give a short summary of the things it fixes? Note that the best approach IMHO is to submit and push minimal patches that fix specific issues and not lump several fixes together, unless it doesn't make sense to separate them.
msg183409 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-03-03 22:43
Please hold off on any modifications of XML code until we have decided how we are going to fix the XML exploits.

Also I think this is a new feature and not a fix. parseString() is documented as 'parses from a buffer string'. It doesn't say that it can parse text.
msg183455 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-03-04 14:04
The low-level part was already extracted to issue17089 and committed. Issue16986 has a similar patch for cElementTree. The main part of the patch was moved to issue2175, which is now a prerequisite for issue16986 and for this issue. It contains additional tests and additional fixes. It is hard, and makes little sense, to split them into separate patches. Let's move the discussion to issue2175.

After that, the patch for this issue will be small and simple: only a few lines and one test. In any case, this issue is less important and can actually be considered a new feature.
msg183457 - Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-03-04 14:18
Serhiy, OK - I'll look at #2175 first. But yes, Christian is right, let's wait for the security issues to be resolved first.
msg231781 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-27 20:59
There has been no significant movement toward fixing the XML security issues. Maybe resolve issue2175 first?
msg239945 - Author: Roundup Robot (python-dev) Date: 2015-04-02 20:22
New changeset 3ac1b21fbb42 by Serhiy Storchaka in branch '2.7':
Issue #10590: Added tests for xml.sax.parse() and xml.sax.parseString().
https://hg.python.org/cpython/rev/3ac1b21fbb42

New changeset ca8666310eb3 by Serhiy Storchaka in branch '3.4':
Issue #10590: Added tests for xml.sax.parse() and xml.sax.parseString().
https://hg.python.org/cpython/rev/ca8666310eb3

New changeset 846c165cf643 by Serhiy Storchaka in branch 'default':
Issue #10590: Added tests for xml.sax.parse() and xml.sax.parseString().
https://hg.python.org/cpython/rev/846c165cf643
msg239946 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-04-02 20:25
After resolving issue2175 and committing tests that work with the current code, only a minimal change is left. Here is a patch that adds support for a string argument in xml.sax.parseString().
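
One way such support could be structured, shown as a sketch and not as the contents of the attached patch: wrap a str in a character stream and bytes in a byte stream, then hand the resulting InputSource to the parser. The helper `parse_text_or_bytes`, the handler `TextCollector` and the function `collected_text` are hypothetical names; the sketch assumes a parser that honours character streams, which is the behaviour issue2175 deals with.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


def parse_text_or_bytes(source, handler):
    """Hypothetical helper: feed either str or bytes to a SAX parser."""
    input_source = InputSource()
    if isinstance(source, str):
        # Text is already decoded, so expose it as a character stream.
        input_source.setCharacterStream(io.StringIO(source))
    else:
        # Bytes are left for the parser to decode.
        input_source.setByteStream(io.BytesIO(source))
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.parse(input_source)


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def collected_text(source):
    """Return all character data found in `source` (str or bytes)."""
    handler = TextCollector()
    parse_text_or_bytes(source, handler)
    return "".join(handler.chunks)
```

With this shape, the type check happens once at the entry point and the rest of the parsing path is shared between the two input types.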
msg240048 - Author: Roundup Robot (python-dev) Date: 2015-04-04 07:14
New changeset fca669149d8a by Serhiy Storchaka in branch 'default':
Issue #10590: xml.sax.parseString() now supports string argument.
https://hg.python.org/cpython/rev/fca669149d8a
History
Date User Action Args
2015-04-04 07:14:44 serhiy.storchaka set status: open -> closed
resolution: fixed
stage: patch review -> resolved
2015-04-04 07:14:01 python-dev set messages: + msg240048
2015-04-02 20:36:02 serhiy.storchaka set files: + sax_parse_3.patch
2015-04-02 20:25:07 serhiy.storchaka set stage: needs patch -> patch review
messages: + msg239946
components: + Library (Lib), - Unicode
versions: + Python 3.5, - Python 3.4
2015-04-02 20:22:10 python-dev set nosy: + python-dev
messages: + msg239945
2014-12-31 16:19:00 akuchling set nosy: - akuchling
2014-11-27 20:59:23 serhiy.storchaka set messages: + msg231781
2014-02-03 18:39:43 BreamoreBoy set nosy: - BreamoreBoy
2013-03-04 14:18:43 eli.bendersky set messages: + msg183457
2013-03-04 14:06:40 serhiy.storchaka set files: - sax_parse.patch
2013-03-04 14:04:23 serhiy.storchaka set type: behavior -> enhancement
stage: patch review -> needs patch
messages: + msg183455
versions: - Python 3.2, Python 3.3
2013-03-03 22:43:59 christian.heimes set nosy: + christian.heimes
messages: + msg183409
2013-03-03 22:26:40 eli.bendersky set nosy: + eli.bendersky
messages: + msg183408
2013-02-25 15:41:05 serhiy.storchaka unlink issue16986 dependencies
2013-02-13 17:52:22 serhiy.storchaka set dependencies: + Expat sax parser silently ignores the InputSource protocol
2013-01-31 10:05:47 serhiy.storchaka set dependencies: + Expat parser parses strings only when XML encoding is UTF-8
2013-01-19 15:56:30 eli.bendersky set nosy: - eli.bendersky
2013-01-17 16:54:46 serhiy.storchaka link issue16986 dependencies
2013-01-17 16:50:22 serhiy.storchaka set files: + sax_parse.patch
nosy: + loewis, akuchling, georg.brandl, terry.reedy, ygale, eli.bendersky, BreamoreBoy, tshepang
messages: + msg180141
keywords: + patch
stage: needs patch -> patch review
2013-01-16 20:48:08 serhiy.storchaka set assignee: serhiy.storchaka
2013-01-15 19:55:00 serhiy.storchaka set versions: + Python 3.2, Python 3.4
nosy: + ezio.melotti, serhiy.storchaka
messages: + msg180045
components: + Unicode
stage: needs patch
2010-11-30 19:37:35 Thomas.Ryan create