
Author scoder
Recipients eli.bendersky, maker, mmokrejs, r.david.murray, scoder, serhiy.storchaka
Date 2013-08-28.10:26:03
Message-id <1377685563.46.0.415007843465.issue18850@psf.upfronthosting.co.za>
In-reply-to
Content
We are talking about two different things here.

I said that (serialised) XML is defined as a sequence of bytes. Read the spec on that.

What you are talking about is the Infoset, i.e. the parsed/generated in-memory XML tree. That's obviously not bytes; it's defined in terms of Unicode. Parsing and serialising do the mapping between the two.
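To make the distinction concrete, here is a minimal sketch using the standard library's `xml.etree.ElementTree`. The element name and text are made up for illustration; the point is only that the serialised form is `bytes` while the in-memory tree holds Unicode `str` values.

```python
import xml.etree.ElementTree as ET

# Serialised XML: a sequence of bytes (here UTF-8 encoded).
serialised = "<greeting>Grüße</greeting>".encode("utf-8")

# Parsing maps the bytes to an in-memory tree whose text is Unicode.
root = ET.fromstring(serialised)
text = root.text

# Serialising maps the tree back to bytes in the requested encoding.
out = ET.tostring(root, encoding="utf-8")
```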

The "attack" that you presented is based on serialised XML, and thus on a sequence of bytes. What I am saying is that this "attack" can be carried out with any kind of binary data, so it is not XML-specific and therefore not a problem with ElementTree.

The fact that ElementTree lets you generate non-well-formed 'XML' containing control characters when you tell it to do so is unfortunate. But it's not a security risk, because you already had the non-well-formed content in your hands *before* you passed it into ElementTree. Nor is it clearly a bug, because the user specifically requested the serialisation of an in-memory tree that contained these control characters.
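The behaviour under discussion can be reproduced with a short sketch like the following. The element name `a` and the choice of `\x01` are arbitrary; the outcome reflects how `xml.etree.ElementTree` behaves at the time of this discussion and may differ in other versions.

```python
import xml.etree.ElementTree as ET

# Build an in-memory tree whose text contains a control character
# that XML 1.0 does not allow in a document.
root = ET.Element("a")
root.text = "\x01"

# Serialising is accepted and passes the character through unchanged.
data = ET.tostring(root)

# The result is not well-formed, so parsing it back is expected to fail.
try:
    ET.fromstring(data)
    round_trips = True
except ET.ParseError:
    round_trips = False
```

Note that the offending character was supplied by the caller; ElementTree only wrote out what it was given.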

But, once again, it would be nice if ElementTree rejected this input in one way or another, and that's a feature request.
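Until such a check exists in the library, a caller could validate a tree before serialising it. The sketch below is only an illustration of that idea, not an ElementTree API: the helper name `check_serialisable` and the regular expression are hypothetical, and the character ranges follow the `Char` production of XML 1.0. It checks text, tail and attribute values only, not tag or attribute names.

```python
import re
import xml.etree.ElementTree as ET

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\t\n\r\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def check_serialisable(root):
    """Hypothetical helper: raise ValueError if the tree holds
    characters that cannot appear in a well-formed XML 1.0 document."""
    for elem in root.iter():
        values = [elem.text, elem.tail, *elem.attrib.values()]
        for value in values:
            if isinstance(value, str) and _ILLEGAL_XML_CHARS.search(value):
                raise ValueError(
                    f"element {elem.tag!r} contains a character "
                    "not allowed in XML 1.0"
                )


good = ET.Element("a", {"k": "v"})
good.text = "plain text\n"
check_serialisable(good)  # passes silently

bad = ET.Element("a")
bad.text = "\x01"
try:
    check_serialisable(bad)
    rejected = False
except ValueError:
    rejected = True
```

Calling such a helper right before `ET.tostring()` keeps the rejection explicit and leaves the library's own behaviour untouched.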
History
Date User Action Args
2013-08-28 10:26:03 scoder set recipients: + scoder, mmokrejs, r.david.murray, eli.bendersky, maker, serhiy.storchaka
2013-08-28 10:26:03 scoder set messageid: <1377685563.46.0.415007843465.issue18850@psf.upfronthosting.co.za>
2013-08-28 10:26:03 scoder link issue18850 messages
2013-08-28 10:26:03 scoder create