Author scoder
Recipients cbz, christian.heimes, effbot, eli.bendersky, flox, loewis, scoder, serhiy.storchaka
Date 2019-04-26.09:00:04
Message-id <1556269204.39.0.784282272383.issue13611@roundup.psfhosted.org>
In-reply-to
Content
Turns out, it was not that easy. :-/

ElementTree lacks prefixes in its tree model, so they would have to be either registered globally (via register_namespace()) or come from the parser. I tried the latter since that is the most generic way when the input is already serialised. See issue 36673 and issue 36676 for extensions to the parser target interface that this implementation relies on. Note that this is a new implementation, only loosely based on the original ElementC14N implementation.
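To illustrate the two sources of prefixes mentioned above, here is a minimal sketch. The namespace URI and the prefix "p" are made up for the example, and it uses the long-standing "start-ns" event of iterparse() rather than the new parser target methods from the linked issues.

```python
import io
import xml.etree.ElementTree as ET

XML = b'<p:root xmlns:p="urn:example:ns"><p:child/></p:root>'

# Source 1: the parser. "start-ns" events carry (prefix, uri) tuples,
# which is the only place the original prefixes are visible.
prefixes = [ns for _event, ns in ET.iterparse(io.BytesIO(XML), events=("start-ns",))]

# The tree itself only stores "{uri}localname" tags, without any prefix.
root = ET.fromstring(XML)

# Source 2: the global registry. Without this call, the serialiser would
# invent a prefix such as "ns0" for the namespace.
ET.register_namespace("p", "urn:example:ns")
serialised = ET.tostring(root, encoding="unicode")
```

Note that register_namespace() changes process-wide state, which is one reason to prefer taking the prefixes from the parser.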

I only implemented C14N 2.0 (which lxml also does not have, but I'll add it there). I got most of the official test cases working, including prefix rewriting and prefix resolution in tag and attribute content.

https://www.w3.org/TR/xml-c14n2-testcases/
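As a rough usage sketch, assuming the implementation is exposed as xml.etree.ElementTree.canonicalize() (the name under which it is available in Python 3.8 and later), the features listed above could be exercised like this. The sample documents and namespace URI are invented for illustration.

```python
import xml.etree.ElementTree as ET  # canonicalize() needs Python 3.8+

doc = '<root xmlns:x="urn:example:x" b="2" a="1"><x:item  /></root>'

# Plain C14N 2.0: attributes are sorted, empty elements are written
# with explicit end tags, and the original prefix is kept.
plain = ET.canonicalize(doc)

# Prefix rewriting: prefixes are replaced by generated ones.
rewritten = ET.canonicalize(doc, rewrite_prefixes=True)

# Prefix resolution in attribute content: the value of "type" is
# treated as a qualified name, so its prefix takes part in the rewrite.
qdoc = '<root xmlns:x="urn:example:x" type="x:foo"/>'
qname_aware = ET.canonicalize(
    qdoc, rewrite_prefixes=True, qname_aware_attrs={"type"}
)
```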

What's not supported?

The original namespace prefixes may not be preserved when a namespace is declared with multiple prefixes. In that case, one of them is picked. Preserving all of them is difficult to implement in ET because the parser resolves and discards prefixes. I think that's acceptable, as long as the prefix selection is deterministic.
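A small sketch of that limitation, again assuming the canonicalize() function from Python 3.8 and later and a made-up namespace URI: both prefixes map to the same namespace, so the output stays semantically equivalent and repeatable, but it need not use the prefixes exactly as written in the input.

```python
import xml.etree.ElementTree as ET  # canonicalize() needs Python 3.8+

# Two prefixes, "a" and "b", are bound to the same namespace URI.
doc = (
    '<root xmlns:a="urn:example:ns" xmlns:b="urn:example:ns">'
    '<a:first/><b:second/></root>'
)

out1 = ET.canonicalize(doc)
out2 = ET.canonicalize(doc)

# Re-parsing shows that the namespace of each child is unchanged,
# whichever prefix ended up in the serialised output.
tags = [el.tag for el in ET.fromstring(out1)]
print(out1)
```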

Also, qname rewriting in XPath expressions that appear in XML text is not currently supported. I guess that's a bit of an esoteric feature which can still be added later if it's needed.

While testing, I noticed that ET and cET behave differently when it comes to resolving default attributes from an internal DTD subset. The parser in cET does it, ET does not. That should probably get aligned. For now, the tests hack around that difference.
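The difference can be observed with a document like the following sketch. The element and attribute names are invented, and which result you get depends on whether the C accelerator or the pure Python parser is in use, so no particular output is assumed here.

```python
import xml.etree.ElementTree as ET

# The internal DTD subset declares a default value for the "kind" attribute.
doc = """<!DOCTYPE doc [
<!ATTLIST doc kind CDATA "default-value">
]>
<doc/>"""

root = ET.fromstring(doc)

# A parser that resolves DTD defaults reports {'kind': 'default-value'};
# one that does not reports an empty dict.
print(root.attrib)
```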

Comments and reviews welcome.