What would you have it do in the general case? Should it concatenate all the text in:

>>> root4 = ET.fromstring('<a>abc<b>def</b>ghi</a>')
>>> root4.text
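
For reference, here is a rough sketch of where ElementTree seems to put each piece of text for this input, using only the standard `text` and `tail` attributes and `itertext()`. The variable names are just illustrative, and joining with `itertext()` is only one possible answer to the "concatenate everything" question, not necessarily what `.text` ought to return.

```python
import xml.etree.ElementTree as ET

root4 = ET.fromstring('<a>abc<b>def</b>ghi</a>')

# Text that appears before the first child element of <a>.
leading_text = root4.text

# Text inside the child element <b>.
child_text = root4[0].text

# Text that follows </b>; ElementTree stores it on the child as .tail.
trailing_text = root4[0].tail

# One way to concatenate all character data, in document order.
all_text = ''.join(root4.itertext())

print(repr(leading_text), repr(child_text), repr(trailing_text), repr(all_text))
```

So the question is really whether `.text` should behave like `leading_text` or like `all_text` above.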

If I'm interpreting the XML spec correctly (production [43]), the optional character data must be at the beginning of the element, before any other elements, comments, or processing instructions:

	content ::= CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*

In other words, I'm not sure your XML is considered well-formed.
