classification
Title: xml.etree.ElementTree.Element does not catch text
Type: behavior Stage: resolved
Components: Versions: Python 3.4
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: eli.bendersky, jlaurens, ned.deily, rhettinger, scoder
Priority: normal Keywords:

Created on 2015-04-29 00:42 by jlaurens, last changed 2015-04-29 02:39 by ned.deily. This issue is now closed.

Messages (2)
msg242207 - Author: Jérôme Laurens (jlaurens) Date: 2015-04-29 00:42
The text is not caught in case 3 below.

INPUT

import xml.etree.ElementTree as ET
root1 = ET.fromstring('<a>TEXT</a>')
print(root1.text)
root2 = ET.fromstring('<a>TEXT<b/></a>')
print(root2.text)
root3 = ET.fromstring('<a><b/>TEXT</a>')
print(root3.text)

CURRENT OUTPUT

TEXT
TEXT
None <---------- ERROR HERE

EXPECTED OUTPUT

TEXT
TEXT
TEXT
msg242209 - Author: Ned Deily (ned.deily) * (Python committer) Date: 2015-04-29 02:39
While a bit confusing, I don't think this is a bug.  Note the definition of the "tail" attribute of an element:

"If the element is created from an XML file the attribute will contain any text found after the element’s end tag and before the next tag."

In root1 and root2, 'TEXT' comes before the end of element a (root1) or the start of element b (root2), so it is stored as the text attribute of a. In root3, however, 'TEXT' follows the end of element b and so is associated with b as its tail attribute.

>>> root3
<Element 'a' at 0x1022ab188>
>>> root3[0]
<Element 'b' at 0x1022ab1d8>
>>> root3[0].tail
'TEXT'

https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.tail
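
For anyone who wants all of the character data regardless of where it sits relative to child elements, the following is a minimal sketch of two ways to collect it. Element.itertext() is part of the standard library; direct_text is a hypothetical helper written for this illustration and is not part of ElementTree.

```python
import xml.etree.ElementTree as ET

root3 = ET.fromstring('<a><b/>TEXT</a>')

# No character data precedes the first child, so a.text is None.
print(root3.text)
# 'TEXT' follows the end of b, so it is stored as b.tail.
print(root3[0].tail)

# itertext() yields the element's text, then each descendant's text
# and tail in document order, so joining it recovers 'TEXT'.
all_text = ''.join(root3.itertext())
print(all_text)


def direct_text(elem):
    """Hypothetical helper: character data that is a direct child of elem.

    Combines elem.text with the tail of each direct child and ignores
    text nested inside the children themselves.
    """
    parts = [elem.text or '']
    parts.extend(child.tail or '' for child in elem)
    return ''.join(parts)


print(direct_text(root3))
```

itertext() also descends into children, so for '<a>x<b>y</b>z</a>' it would give 'xyz', while direct_text would give 'xz'. Which one is appropriate depends on whether nested text should count.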
History
Date User Action Args
2015-04-29 02:39:48 ned.deily set status: open -> closed; nosy: + ned.deily; messages: + msg242209; resolution: not a bug; stage: resolved
2015-04-29 01:56:00 rhettinger set messages: - msg242208
2015-04-29 01:52:35 rhettinger set nosy: + scoder, eli.bendersky
2015-04-29 01:51:47 rhettinger set nosy: + rhettinger; messages: + msg242208
2015-04-29 00:42:01 jlaurens create