classification
Title: Let ElementTree prolog include comments and processing instructions
Type: enhancement
Stage:
Components: XML
Versions: Python 3.6
process
Status: open
Resolution:
Dependencies:
Superseder: xml.etree.ElementTree skips processing instructions when parsing (issue 9521)
Assigned To:
Nosy List: eli.bendersky, martin.panter, rhettinger, scoder
Priority: normal
Keywords: patch

Created on 2015-05-26 00:54 by rhettinger, last changed 2019-04-27 15:55 by scoder.

Files
File name Uploaded Description
xml_prolog.diff rhettinger, 2015-05-26 00:54 Very rough draft patch.
Messages (4)
msg244069 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2015-05-26 00:54
Currently, ElementTree doesn't support comments and processing instructions in the prolog, which is the typical place to put style sheets and document type definitions.

It would be used like this:

    from xml.etree.ElementTree import ElementTree, Element, Comment, ProcessingInstruction

    r = Element('current_observation', version='1.0')
    r.text = 'Nothing to see here.  Move along.'
    t = ElementTree(r)
    t.append(ProcessingInstruction('xml-stylesheet', 'href="latest_ob.xsl" type="text/xsl"'))
    t.append(Comment('Published at: http://w1.weather.gov/xml/current_obs/KSJC.xml'))

That creates output like this:

    <?xml version='1.0' encoding='utf-8'?>
    <?xml-stylesheet href="latest_ob.xsl" type="text/xsl"?>
    <!--Published at: http://w1.weather.gov/xml/current_obs/KSJC.xml-->
    <current_observation version="1.0">
    Nothing to see here.  Move along.
    </current_observation>
msg244072 - Author: Martin Panter (martin.panter) (Python committer) Date: 2015-05-26 01:25
The ElementTree class imitates or wraps many methods of the Element class. Since Element.append() and remove() already exist and act on children of the element, I think the new ElementTree methods should be named differently. Maybe something like prolog_append() and prolog_remove()? Or prologue_append() depending on your spelling preferences :P

Also, maybe the new write() calls should add newlines.
msg244085 - Author: Stefan Behnel (scoder) (Python committer) Date: 2015-05-26 06:43
FTR, lxml's Element class has addnext() and addprevious() methods which are commonly used for this purpose. But ET can't adopt those due to its different tree model.

I second Martin's comment that ET.append() is a misleading name. It suggests adding stuff to the end, whereas things are actually being inserted before the root element here.

I do agree, however, that this is a helpful feature and that the ElementTree class is a good place to expose it. I propose to provide a "prolog" (that's the spec's spelling) property holding a list that users can fill and modify any way they wish. The serialiser would then validate that all content is proper XML prologue content, and would serialise it in order.
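To make the "prolog list" idea concrete, here is a minimal sketch of how such a property might behave. It is not the attached patch and not an existing ElementTree API: the class name `PrologTree`, the `prolog` attribute, and the `to_string()` method are all hypothetical, and the XML declaration is hard-coded for illustration. The list accepts anything; only the serialiser checks that each item is a comment or processing instruction.

```python
import xml.etree.ElementTree as ET


class PrologTree(ET.ElementTree):
    """Hypothetical ElementTree variant with a user-editable ``prolog`` list."""

    def __init__(self, element=None, file=None):
        super().__init__(element, file)
        self.prolog = []  # lenient: any object may be stored here

    def to_string(self):
        """Serialise the declaration, the prolog items in order, then the root."""
        parts = ["<?xml version='1.0' encoding='utf-8'?>\n"]
        for node in self.prolog:
            # Validation happens only at serialisation time.
            if not ET.iselement(node) or node.tag not in (
                ET.Comment,
                ET.ProcessingInstruction,
            ):
                raise TypeError(
                    "prolog items must be comments or processing instructions"
                )
            parts.append(ET.tostring(node, encoding="unicode"))
        parts.append(ET.tostring(self.getroot(), encoding="unicode"))
        return "".join(parts)


root = ET.Element("current_observation", version="1.0")
root.text = "Nothing to see here.  Move along."

tree = PrologTree(root)
tree.prolog.append(
    ET.ProcessingInstruction("xml-stylesheet", 'href="latest_ob.xsl" type="text/xsl"')
)
tree.prolog.append(
    ET.Comment("Published at: http://w1.weather.gov/xml/current_obs/KSJC.xml")
)

xml_text = tree.to_string()
print(xml_text)
```

Because `prolog` is a plain list, users can insert, reorder, or delete entries with ordinary list operations, which sidesteps the naming clash with Element.append() and remove().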

My guess is that lxml would eventually use a MutableSequence here that maps changes directly to the underlying tree (and would thus validate them during modification), but ET can be more lenient, just like it allows arbitrary objects in the text and tail properties which only the serialiser rejects.

Note that newlines can easily be generated on the user side by setting the tail of a PI/Comment to "\n". (The serialiser must also validate that the tail text consists only of allowed whitespace.)
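The tail behaviour can be seen with today's ElementTree, since comments and processing instructions are ordinary elements whose tail is written after them. The sketch below shows that, plus one possible whitespace check; the helper name `tail_is_xml_whitespace` is hypothetical and only illustrates the validation a serialiser could perform.

```python
import xml.etree.ElementTree as ET


def tail_is_xml_whitespace(node):
    """Hypothetical check: tail is empty or only space, tab, CR, or LF."""
    return not (node.tail or "").strip(" \t\r\n")


comment = ET.Comment("Published at: http://w1.weather.gov/xml/current_obs/KSJC.xml")
comment.tail = "\n"  # emitted right after the closing "-->"

pi = ET.ProcessingInstruction("xml-stylesheet", 'href="latest_ob.xsl" type="text/xsl"')
pi.tail = "\n"  # emitted right after the closing "?>"

comment_text = ET.tostring(comment, encoding="unicode")
pi_text = ET.tostring(pi, encoding="unicode")

print(repr(comment_text))
print(repr(pi_text))
print(tail_is_xml_whitespace(comment), tail_is_xml_whitespace(pi))
```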

For reference:

http://www.w3.org/TR/REC-xml/#sec-prolog-dtd
msg340993 - Author: Stefan Behnel (scoder) (Python committer) Date: 2019-04-27 15:53
This is a duplicate of 9521, but it's difficult to say which ticket is better.
History
Date User Action Args
2019-04-27 15:55:22 scoder set superseder: xml.etree.ElementTree skips processing instructions when parsing
2019-04-27 15:53:59 scoder set messages: + msg340993
2015-05-26 06:43:55 scoder set messages: + msg244085; components: + XML
2015-05-26 01:25:51 martin.panter set nosy: + martin.panter; messages: + msg244072
2015-05-26 00:54:10 rhettinger create