msg247404 - (view) |
Author: Eric S (Eric S) |
Date: 2015-07-26 03:35 |
Documentation states:
Element.findall() finds only elements with a tag which are direct children of the current element.
More accurate to say "direct descendents" as "direct children" implies only one generation below whereas function goes down to all g...children. For-looping findall with inner recursion to rebuild hierarchy as dict, etc. can cause very large tree.
|
msg247405 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2015-07-26 04:58 |
The phrase occurs in https://docs.python.org/3/library/xml.etree.elementtree.html#finding-interesting-elements
I think it should say, "In this example, Element.findall() finds only elements with a tag which are direct children of the current element."
Later it accurately says that, "More sophisticated specification of which elements to look for is possible by using XPath."
|
msg247457 - (view) |
Author: Stefan Behnel (scoder) * |
Date: 2015-07-27 09:43 |
Agreed that it's misleading. Note that the section refers to the *following* example, not the one that readers just went through.
I would actually start that paragraph with the reference to XPath (because that's the difference to el.iter()), and then just present the example as the simplest case of such an expression.
|
msg247700 - (view) |
Author: Eric S (Eric S) |
Date: 2015-07-30 22:10 |
Pointing to XPath and clarifying the example reference are good ideas.
For me though, the phrase "direct children" would still lead me to believe that findall() would only give me the first generation right below the element (e.g. only the countries in the example), not grandchildren, greatgrandchildren, etc. That's why I think "direct descendents" or "all descendents" would give a clearer picture.
|
msg247717 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2015-07-31 00:47 |
How about something like:
'''
:meth:`Element.findall` finds all elements matching a path expression; in this example, all <country> children of *root*. meth:`Element.find` finds the *first* matching element, in this example the first <rank> child. . . .
'''
Also, perhaps it would be good to change findall() to iterfind(), which is a newer API.
|
msg247738 - (view) |
Author: Stefan Behnel (scoder) * |
Date: 2015-07-31 06:21 |
Here's a complete drop-in replacement.
"""
More specific constraints on which elements to look for and where to look for them in the tree can be expressed in :ref:`XPath <elementtree-xpath>`. :meth:`Element.iterfind` iterates over all elements matching such a path expression. In the following example, it finds all direct <country> children of *root*. :meth:`Element.find` provides a shortcut to find only the *first* matching element, in this example the first <rank> child. Once an element is found, :attr:`Element.text` accesses the element's immediate text content and :meth:`Element.get` accesses the element's attributes.
::
>>> for country in root.iterfind('country'):
... rank = country.find('rank').text
... name = country.get('name')
... print(name, rank)
...
Liechtenstein 1
Singapore 4
Panama 68
"""
Note that the reviewed doc patch in issue 24079 also hasn't been applied yet. It would help here.
|
msg247764 - (view) |
Author: Eric S (Eric S) |
Date: 2015-07-31 19:19 |
To my preference, the drop-in is verbose and I got a little confused on first read. The current documentation and example seem mostly OK to me.
If we leave "children" as in "all <country> children of *root*", it doesn't illuminate the fact that if root had a great-great-grandchild also named <country>, then it would return that as well.
The only way I can think to simply clarify it is to use "direct descendents" or "all direct descendents".
Here's how the phrase "direct children" slipped me up when I first read the docs re:findall():
I thought I could rebuild the XML tree like this:
def rebuild_XML_as_whatever(parent)
for child in parent.findall()
code_that_attaches_child
rebuild_XML_as_whatever(child)
instead I had to do this:
def rebuild_XML_as_whatever(parent)
descendns = parent.findall("./")
for child_n in range(len(descendns)):
child = descendns[child_n]
code_that_attaches_child
rebuild_XML_as_whatever(child)
RE:iterfind() if it fits a new format that's fine but renaming would, of course, interfere with backwards compatibility.
|
msg247767 - (view) |
Author: Stefan Behnel (scoder) * |
Date: 2015-07-31 19:57 |
Whenever you feel a need for using range(len(...)), you are almost always doing something wrong. The problem in your code is that you are modifying the tree while iterating over it. However, please ask this question on the mailing list as the bug tracker is not the right place to discuss this.
> If we leave "children" as in "all <country> children of *root*", it doesn't illuminate the fact that if root had a great-great-grandchild also named <country>, then it would return that as well.
No, it wouldn't. It behaves exactly as described. Thus the reference to the XPath section for further explanations.
|
msg247785 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2015-07-31 23:11 |
Eric: Calling findall("country") does _not_ return grandchidren nor further descendants. Also, your pseudocode calling findall() with no arguments does not work, so I am left wondering where you got the wrong impression about grandchildren. The usual way to get a list of direct children is to call list(parent):
>>> root = XML('''<data><country name="Leichtenstein">
... <country name="Singapore"><country name="Panama"/></country>
... </country></data>''')
>>> root.findall()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Required argument 'path' (pos 1) not found
>>> list(root) # List of direct children
[<Element 'country' at 0xb6ebe324>]
>>> root.findall("country") # List of direct <country> children
[<Element 'country' at 0xb6ebe324>]
>>> root.findall(".//country") # List of all <country> descendants
[<Element 'country' at 0xb6ebe324>, <Element 'country' at 0xb6ebe284>, <Element 'country' at 0xb6ebe0cc>]
|
msg247833 - (view) |
Author: Eric S (Eric S) |
Date: 2015-08-01 22:08 |
Code was intended as example, not request for help to correct, but rushed so example was flawed, but still, I tested and you both are right. Must've had other error in code to cause the xml to dict to have every element map to under every node. Debugger also showed findall and list returns with every descendent under every item as well so that influenced my triage.
Issue is closed as far as I'm concerned but it sounds like you all found something else here.
Sorry for using up your time. Will be more careful and clear if there is a next time.
|
msg350243 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2019-08-23 00:19 |
> Issue is closed as far as I'm concerned
Me too.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:19 | admin | set | github: 68912 |
2019-08-23 00:19:44 | rhettinger | set | status: open -> closed resolution: not a bug messages:
+ msg350243
stage: resolved |
2015-08-01 22:08:44 | Eric S | set | messages:
+ msg247833 |
2015-07-31 23:11:22 | martin.panter | set | messages:
+ msg247785 |
2015-07-31 19:57:59 | scoder | set | messages:
+ msg247767 |
2015-07-31 19:19:30 | Eric S | set | messages:
+ msg247764 |
2015-07-31 06:21:24 | scoder | set | type: enhancement messages:
+ msg247738 versions:
+ Python 2.7, Python 3.4, Python 3.5, Python 3.6 |
2015-07-31 00:47:36 | martin.panter | set | nosy:
+ martin.panter messages:
+ msg247717
|
2015-07-30 22:10:02 | Eric S | set | messages:
+ msg247700 |
2015-07-27 09:43:28 | scoder | set | messages:
+ msg247457 |
2015-07-26 05:42:14 | rhettinger | set | files:
+ etree_clarify.diff keywords:
+ patch |
2015-07-26 04:59:25 | rhettinger | set | nosy:
+ scoder, eli.bendersky components:
+ Documentation
|
2015-07-26 04:58:52 | rhettinger | set | assignee: rhettinger
messages:
+ msg247405 nosy:
+ rhettinger |
2015-07-26 03:35:13 | Eric S | create | |