classification
Title: provide pretty printer for xml.etree.ElementTree
Type: enhancement Stage: resolved
Components: Library (Lib), XML Versions: Python 3.4
process
Status: closed Resolution: duplicate
Dependencies: Superseder: xml.etree.ElementTree: add feature to prettify XML output
View: 14465
Assigned To: Nosy List: alex.henderson, eli.bendersky, eric.snow, lpapp
Priority: low Keywords: patch

Created on 2013-03-07 07:18 by eric.snow, last changed 2013-08-05 16:45 by lpapp. This issue is now closed.

Files
File name: issue17372.patch
Uploaded: alex.henderson, 2013-07-07 14:23
Description: An implementation of XML pretty-printing for ElementTree
Messages (6)
msg183635 - Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-03-07 07:18
minidom already has something like this:

http://docs.python.org/3/library/xml.dom.minidom.html#xml.dom.minidom.Node.toprettyxml

This is something that appears to have almost made it in quite a while ago:

http://effbot.org/zone/element-lib.htm#prettyprint
http://svn.effbot.org/public/stuff/sandbox/elementlib/indent.py

A casual search found several similar implementations out there.  For instance:

http://infix.se/2007/02/06/gentlemen-indent-your-xml
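
For reference, a common workaround in the meantime is to round-trip through minidom. This is only a sketch under that assumption: the helper name prettify is hypothetical and is not part of ElementTree.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def prettify(elem, indent="  "):
    """Hypothetical helper: serialize an Element, then let minidom indent it."""
    # ET.tostring with encoding="unicode" returns a str rather than bytes.
    raw = ET.tostring(elem, encoding="unicode")
    # minidom re-parses the markup and emits it with newlines and indentation.
    return minidom.parseString(raw).toprettyxml(indent=indent)


root = ET.Element("root")
child = ET.SubElement(root, "child")
child.text = "text"

pretty = prettify(root)
print(pretty)
```

The cost is a second parse of the whole document, and minidom prepends an XML declaration, which is part of why a native option in ElementTree would be preferable.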
msg192558 - Author: Alex Henderson (alex.henderson) * Date: 2013-07-07 14:23
I have attached a proposed patch.

This makes some design decisions which I would like someone to review:
 a) To incorporate pretty-printing into the main write() method rather than adding a separate toprettyxml() method. Disadvantages: greater complexity of _serialize_xml(). Advantages: reduced duplication of code, and other pretty-printing (e.g. HTML) is easy to add in the same way.
 b) Existing whitespace on the ends of existing text is mutated. Disadvantages: existing whitespace content may get changed. Advantages: Greater readability (which is the whole point), idempotence of pretty-printing.
 c) Not to add a trailing newline. I am undecided as to whether this is a bad idea or a good one, but am documenting it to ensure it gets visibility.

Of these, I think b) is the only potentially controversial one, and notably its behaviour differs from minidom's toprettyxml. I think it's the right thing to do though; and for the cases where whitespace is important, perhaps we can respect the xml:space attribute when pretty-printing?
http://www.w3.org/TR/xml/#sec-white-space

If these design choices are deemed suitable, I'm happy to update the patch to support pretty-printing HTML as well.
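
To make points b) and c) concrete, here is a minimal sketch in the style of the effbot recipe linked above. It is not the attached patch: the function name indent_in_place is hypothetical, and it is more conservative than b) in that it only replaces text and tails that are entirely whitespace.

```python
import xml.etree.ElementTree as ET


def indent_in_place(elem, level=0, space="  "):
    """Hypothetical helper: rewrite whitespace-only .text/.tail to indent a tree."""
    pad = "\n" + level * space
    children = list(elem)
    if not children:
        return
    # Whitespace-only text before the first child is overwritten (point b).
    if not (elem.text or "").strip():
        elem.text = pad + space
    for child in children:
        indent_in_place(child, level + 1, space)
        # Each child's tail positions the next sibling at the same depth.
        if not (child.tail or "").strip():
            child.tail = pad + space
    # The last child's tail precedes the parent's closing tag, one level up.
    if not (children[-1].tail or "").strip():
        children[-1].tail = pad
    # The root's own tail is never touched, so no trailing newline (point c).


root = ET.fromstring("<a><b>x</b><c><d/></c></a>")
indent_in_place(root)
first = ET.tostring(root, encoding="unicode")

# Running it again on the already-indented tree should change nothing.
indent_in_place(root)
second = ET.tostring(root, encoding="unicode")
print(first)
```

Because existing whitespace is overwritten rather than added to, a second pass yields the same output, which is the idempotence mentioned in b).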
msg192562 - Author: Alex Henderson (alex.henderson) * Date: 2013-07-07 14:35
One other design decision - currently it doesn't deal with the indentation of comments or processing instructions: it leaves them unindented. Should they be indented the same as other tags?
msg194062 - Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-01 12:54
Thanks for the report (Eric) and the patch (Alex). There are currently some open bugs we need to handle first, so this is somewhat lower priority. I hope to get to it before the 3.4 release.
msg194312 - Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-03 22:25
I've noticed this is a duplicate of issue #14465. Closing it - let's continue the discussion there, when the time comes.
msg194491 - Author: Laszlo Papp (lpapp) Date: 2013-08-05 16:45
This has just made me switch away from xml.etree.ElementTree today, sadly.

What a pity; it would have been all kinds of cool to stick with this minimal, otherwise working parser and builder.
History
Date User Action Args
2013-08-05 16:45:28  lpapp  set  nosy: + lpapp; messages: + msg194491
2013-08-03 22:25:43  eli.bendersky  set  status: open -> closed; superseder: xml.etree.ElementTree: add feature to prettify XML output; messages: + msg194312; resolution: duplicate; stage: needs patch -> resolved
2013-08-01 12:54:50  eli.bendersky  set  messages: + msg194062
2013-07-07 15:35:24  serhiy.storchaka  set  nosy: + eli.bendersky
2013-07-07 14:35:32  alex.henderson  set  messages: + msg192562
2013-07-07 14:23:14  alex.henderson  set  files: + issue17372.patch; nosy: + alex.henderson; messages: + msg192558; keywords: + patch
2013-03-07 07:18:45  eric.snow  create