Message 242257 - Python tracker



Author	ned.deily
Recipients	docs@python, jlaurens, ned.deily
Date	2015-04-30.02:35:05
Message-id	<1430361308.06.0.832779387317.issue24079@psf.upfronthosting.co.za>

Content

(This issue is a followup to your Issue24072.)  Again, while the ElementTree documentation is certainly not nearly as complete as it should be, I don't think this is a documentation error per se.  The key issue is: with which element is each text string associated?  Perhaps this example will help:

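>>> import xml.etree.ElementTree as ET
>>> root4 = ET.fromstring('<a>ATEXT<b>BTEXT</b>BTAIL</a>')
>>> root4
<Element 'a' at 0x10224c228>
>>> root4.text
'ATEXT'
>>> root4.tail  # None (so the REPL prints nothing): no text follows the root element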
>>> root4[0]
<Element 'b' at 0x1022ab278>
>>> root4[0].text
'BTEXT'
>>> root4[0].tail
'BTAIL'

As in your original example, any text that follows the end tag of element b is stored in b's "tail" attribute, up to the next start or end tag (the point at which the parser pushes onto or pops from the tree stack).  While the description of the "text" attribute does not explicitly state this, the "tail" attribute description immediately following it does.  This is also explained in more detail in the ElementTree resources on effbot.org that are linked to from the Python Standard Library documentation.  Nevertheless, it would probably be helpful to expand the documentation on this point if someone is willing to put together a documentation patch for review.
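
To make the association rule concrete, here is a minimal sketch that lists the "text" and "tail" of every element in a slightly extended document.  The extra empty element c and the helper name describe are illustrative assumptions, not part of the original report.

```python
import xml.etree.ElementTree as ET

# Same shape as the example above, plus a hypothetical empty element <c/>
# followed by more character data.
root = ET.fromstring('<a>ATEXT<b>BTEXT</b>BTAIL<c/>CTAIL</a>')


def describe(elem):
    """Return (tag, text, tail) for elem and its descendants, in document order."""
    return [(e.tag, e.text, e.tail) for e in elem.iter()]


rows = describe(root)
for tag, text, tail in rows:
    print(f"{tag}: text={text!r} tail={tail!r}")
```

Each run of character data is attached to exactly one element: either as the "text" of the element it immediately follows the start tag of, or as the "tail" of the element whose end tag it immediately follows.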

With regard to your comment about "well formed xml", I don't think there is anything in the documentation that implies (or should imply) that the distinction between the "text" attribute and the "tail" attribute has anything to do with whether it is well-formed XML.  The tutorial for the third-party lxml package, which provides another implementation of ElementTree, goes into more detail about why, in general, both "text" and "tail" are necessary.
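
As a rough illustration of why both attributes are needed, the sketch below parses a small piece of mixed content, rebuilds its character data, and then discards every "tail" on a copy.  The sample document and variable names are assumptions chosen for the example; only standard library calls are used.

```python
import copy
import xml.etree.ElementTree as ET

source = '<p>Hello <b>brave</b> new <i>world</i>!</p>'
p = ET.fromstring(source)

# itertext() walks the tree in document order, yielding each element's
# text followed by its children's text and tails.
full_text = ''.join(p.itertext())

# With text and tail both preserved, serialization reproduces the input.
round_trip = ET.tostring(p, encoding='unicode')

# Drop every tail on a deep copy: character data is lost, yet the
# result is still well-formed XML and parses without error.
stripped_tree = copy.deepcopy(p)
for elem in stripped_tree.iter():
    elem.tail = None
stripped = ET.tostring(stripped_tree, encoding='unicode')
reparsed = ET.fromstring(stripped)

print(full_text)
print(round_trip)
print(stripped)
```

The version without tails remains well-formed, which suggests that the text/tail split is about where character data is stored in the tree rather than about well-formedness.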

https://docs.python.org/3/library/xml.etree.elementtree.html#additional-resources
http://effbot.org/zone/element.htm#text-content
http://lxml.de/tutorial.html#elements-contain-text
