Let ElementTree prolog include comments and processing instructions #68475

Open
rhettinger opened this issue May 26, 2015 · 4 comments
Labels
topic-XML, type-feature (A feature request or enhancement)

Comments

@rhettinger
Contributor

BPO 24287
Nosy @rhettinger, @scoder, @vadmium
Superseder
  • bpo-9521: xml.etree.ElementTree skips processing instructions when parsing
Files
  • xml_prolog.diff: Very rough draft patch.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2015-05-26.00:54:09.997>
    labels = ['expert-XML', 'type-feature']
    title = 'Let ElementTree prolog include comments and processing instructions'
    updated_at = <Date 2019-04-27.15:55:22.956>
    user = 'https://github.com/rhettinger'

    bugs.python.org fields:

    activity = <Date 2019-04-27.15:55:22.956>
    actor = 'scoder'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['XML']
    creation = <Date 2015-05-26.00:54:09.997>
    creator = 'rhettinger'
    dependencies = []
    files = ['39498']
    hgrepos = []
    issue_num = 24287
    keywords = ['patch']
    message_count = 4.0
    messages = ['244069', '244072', '244085', '340993']
    nosy_count = 4.0
    nosy_names = ['rhettinger', 'scoder', 'eli.bendersky', 'martin.panter']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = '9521'
    type = 'enhancement'
    url = 'https://bugs.python.org/issue24287'
    versions = ['Python 3.6']

    @rhettinger
    Contributor Author

    Currently, ElementTree doesn't support comments and processing instructions in the prolog. The prolog is the typical place to put style-sheet references and document type declarations.

    It would be used like this:

        from xml.etree.ElementTree import ElementTree, Element, Comment, ProcessingInstruction
    
        r = Element('current_observation', version='1.0')
        r.text = 'Nothing to see here.  Move along.'
        t = ElementTree(r)
        t.append(ProcessingInstruction('xml-stylesheet', 'href="latest_ob.xsl" type="text/xsl"'))
        t.append(Comment('Published at: http://w1.weather.gov/xml/current_obs/KSJC.xml'))

    That would create output like this:

    <?xml version='1.0' encoding='utf-8'?>
    <?xml-stylesheet href="latest_ob.xsl" type="text/xsl"?>
    <!--Published at: http://w1.weather.gov/xml/current_obs/KSJC.xml-->
    <current_observation version="1.0">
    Nothing to see here.  Move along.
    </current_observation>
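
For comparison, a similar document can be produced with the existing API by serialising the prolog nodes separately and concatenating the pieces. This is only a minimal sketch of a workaround, not the proposed feature: the helper name `serialize_with_prolog` is hypothetical, and it relies on `ET.tostring()` accepting `Comment` and `ProcessingInstruction` nodes directly.

```python
import xml.etree.ElementTree as ET


def serialize_with_prolog(root, prolog):
    """Return the document as a str: declaration, prolog nodes, then the root.

    Hypothetical helper (not part of ElementTree). The declaration says
    utf-8, so the caller should encode the result as UTF-8 before writing.
    """
    parts = ["<?xml version='1.0' encoding='utf-8'?>\n"]
    for node in prolog:
        # encoding="unicode" returns a str and emits no XML declaration.
        parts.append(ET.tostring(node, encoding="unicode") + "\n")
    parts.append(ET.tostring(root, encoding="unicode"))
    return "".join(parts)


root = ET.Element("current_observation", version="1.0")
root.text = "Nothing to see here.  Move along."
prolog = [
    ET.ProcessingInstruction("xml-stylesheet", 'href="latest_ob.xsl" type="text/xsl"'),
    ET.Comment("Published at: http://w1.weather.gov/xml/current_obs/KSJC.xml"),
]
document = serialize_with_prolog(root, prolog)
print(document)
```

Unlike the illustrative output above, the root element's text is written inline, because ElementTree adds no whitespace of its own.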
    

    @rhettinger added the type-feature label May 26, 2015
    @vadmium
    Member

    vadmium commented May 26, 2015

    The ElementTree class imitates or wraps many methods of the Element class. Since Element.append() and remove() already exist and act on children of the element, I think the new ElementTree methods should be named differently. Maybe something like prolog_append() and prolog_remove()? Or prologue_append() depending on your spelling preferences :P

    Also, maybe the new write() calls should add a newline after each prolog item.

    @scoder
    Contributor

    scoder commented May 26, 2015

    FTR, lxml's Element class has addnext() and addprevious() methods which are commonly used for this purpose. But ET can't adopt those due to its different tree model.

    I second Martin's comment that ET.append() is a misleading name. It suggests adding stuff to the end, whereas things are actually being inserted before the root element here.

    I do agree, however, that this is a helpful feature and that the ElementTree class is a good place to expose it. I propose to provide a "prolog" (that's the spec's spelling) property holding a list that users can fill and modify any way they wish. The serialiser would then validate that all content is proper XML prologue content, and would serialise it in order.

    My guess is that lxml would eventually use a MutableSequence here that maps changes directly to the underlying tree (and would thus validate them during modification), but ET can be more lenient, just like it allows arbitrary objects in the text and tail properties which only the serialiser rejects.

    Note that newlines can easily be generated on the user side by setting the tail of a PI/Comment to "\n". (The serialiser must also validate that the tail text contains only allowed whitespace.)

    For reference:

    http://www.w3.org/TR/REC-xml/#sec-prolog-dtd
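
The "prolog" list idea described above could look roughly like the following. This is a sketch under stated assumptions, not an existing ElementTree API: the class name `PrologTree`, its `prolog` attribute, and the `to_string()` method are all hypothetical. Items are stored without checks and only validated at serialisation time, and newlines come from each node's tail.

```python
import xml.etree.ElementTree as ET

XML_WHITESPACE = " \t\r\n"  # the whitespace characters XML allows between prolog items


class PrologTree(ET.ElementTree):
    """Hypothetical ElementTree subclass with a user-editable prolog list."""

    def __init__(self, element=None, file=None):
        super().__init__(element, file)
        self.prolog = []  # plain list; nothing is validated on modification

    @staticmethod
    def _check(node):
        # Only comments and processing instructions are accepted here.
        if not ET.iselement(node) or not (
            node.tag is ET.Comment or node.tag is ET.ProcessingInstruction
        ):
            raise ValueError("prolog items must be Comment or ProcessingInstruction nodes")
        tail = node.tail
        if tail is not None and (
            not isinstance(tail, str) or tail.strip(XML_WHITESPACE)
        ):
            raise ValueError("tail of a prolog item must be whitespace only")

    def to_string(self):
        """Validate the prolog, then serialise it in order before the root."""
        parts = ["<?xml version='1.0' encoding='utf-8'?>\n"]
        for node in self.prolog:
            self._check(node)
            parts.append(ET.tostring(node, encoding="unicode"))  # includes the tail
        parts.append(ET.tostring(self.getroot(), encoding="unicode"))
        return "".join(parts)


tree = PrologTree(ET.Element("root"))
pi = ET.ProcessingInstruction("xml-stylesheet", 'href="a.xsl"')
pi.tail = "\n"  # newline generated on the user side
tree.prolog.append(pi)
document = tree.to_string()
print(document)
```

Keeping `prolog` a plain list means any list operation (insert, slicing, reordering) works without extra methods, at the cost of errors surfacing only when the tree is serialised.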

    @scoder
    Contributor

    scoder commented Apr 27, 2019

    This is a duplicate of bpo-9521, but it's difficult to say which ticket is better.

    @ezio-melotti transferred this issue from another repository Apr 10, 2022