classification
Title: Poor example for Element.remove()
Type: behavior
Stage: resolved
Components: Documentation
Versions: Python 3.8
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: WoodyWoo, docs@python, eric.smith, scoder
Priority: normal
Keywords:

Created on 2020-10-01 00:08 by WoodyWoo, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (7)
msg377726 - Author: (WoodyWoo) Date: 2020-10-01 00:08
We can remove elements using Element.remove(). Let’s say we want to remove all countries with a rank higher than 50:

>>>
>>> for country in root.findall('country'):
...     rank = int(country.find('rank').text)
...     if rank > 50:
...         root.remove(country)
...
>>> tree.write('output.xml')

When the original XML has two or more countries with rank > 50 and they are adjacent sibling elements, the code above deletes only the 1st, 3rd, 5th, and other odd-numbered ones.
A proper example would be:
index = 0
while index < len(root.findall("./*")):
    rank = int(root[index].find("rank").text)
    if rank > 50:
        root.remove(root[index])
        continue  # keep index: the next child has shifted into this slot
    index = index + 1


I think "for each in list" should not work by index but by something like a pointer, for example a list of pointers.
A final solution would look like this: while "for each in list" is running, the list of pointers is fixed, so you need not worry about the list changing.
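The index-based behaviour described here can be reproduced with a plain list. The following is only a minimal sketch with made-up rank values (they are not taken from the XML in this issue): the first loop removes from the list it is iterating over and skips the element that slides into the freed slot, while the second loop iterates over a copy, which acts as the "fixed" list asked for above.

```python
# Removing from a list while iterating over it: the list iterator
# advances by position, so the element that moves into the freed
# slot is never visited.
skipped = [2, 5, 69, 70, 71]
for rank in skipped:
    if rank > 50:
        skipped.remove(rank)

# Iterating over a copy keeps the sequence being walked unchanged,
# so every element is visited exactly once.
clean = [2, 5, 69, 70, 71]
for rank in list(clean):
    if rank > 50:
        clean.remove(rank)

print(skipped)  # [2, 5, 70]
print(clean)    # [2, 5]
```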
msg377728 - Author: (WoodyWoo) Date: 2020-10-01 00:25
# new code to avoid an error
index = 0
count = len(root.findall("./*"))
while index < count:
    rank = int(root[index].find("rank").text)
    if rank > 50:
        root.remove(root[index])
        count = count - 1  # one child fewer; avoids an IndexError
        continue  # keep index: the next child has shifted into this slot
    index = index + 1
msg377729 - Author: (WoodyWoo) Date: 2020-10-01 01:11
My fault.
"for country in root.findall('country')" does not work the same way as "for country in root".
All 3 methods are shown below:

import xml.etree.ElementTree as ET

xmlstr = r'''<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama1">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
    <country name="Panama2">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>'''
print(xmlstr)

# original code: findall() returns a new list, so removing is safe
root = ET.fromstring(xmlstr)
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)
print("___original___")
for country in root.findall('country'):
    print(country.get("name"))
print("^^^original^^^^\n")

# the wrong code I had in mind: iterates over root while removing from it
root = ET.fromstring(xmlstr)
for country in root:
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)
print("___bad___")
for country in root.findall('country'):
    print(country.get("name"))
print("^^^bad^^^^\n")

# my code
root = ET.fromstring(xmlstr)
index = 0
count = len(root.findall("./*"))
while index < count:
    rank = int(root[index].find("rank").text)
    if rank > 50:
        root.remove(root[index])
        count = count - 1  # one child fewer; avoids an IndexError
        continue  # keep index: the next child has shifted into this slot
    index = index + 1
print("___new___")
for country in root.findall('country'):
    print(country.get("name"))
print("^^^new^^^^\n")
msg377730 - Author: (WoodyWoo) Date: 2020-10-01 01:29
The docs should state explicitly that root.remove(child) must be used together with "for child in root.findall()".

Also, my code with "while / if / continue" makes it easy to call root.insert(index, newchild).
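As an illustration of that last point, here is a rough sketch of an index-based loop that both removes and inserts. The `removed` placeholder element and the three-country document are hypothetical and only serve to show the index bookkeeping; nothing like them appears in the documentation example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='A'><rank>2</rank></country>"
    "<country name='B'><rank>69</rank></country>"
    "<country name='C'><rank>70</rank></country>"
    "</data>"
)

index = 0
while index < len(root):
    child = root[index]
    if int(child.find("rank").text) > 50:
        root.remove(child)
        # Hypothetical placeholder marking where a country used to be.
        # It takes the removed child's slot, so the index still advances.
        root.insert(index, ET.Element("removed", name=child.get("name")))
    index += 1

tags = [(c.tag, c.get("name")) for c in root]
print(tags)  # [('country', 'A'), ('removed', 'B'), ('removed', 'C')]
```

Because the index is managed by hand, the loop decides after each step whether to advance, which is what makes mixing remove() and insert() manageable here.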
msg377731 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-10-01 01:30
As you've seen, the example is correct. I made the same mistake earlier today.

For others: see also #41891 for a suggestion to improve the documentation.

As was pointed out in that issue, it's generally true in Python that you should not mutate a sequence while iterating over it.
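For readers who want to iterate over the element itself rather than call findall(), two common ways to respect that rule are sketched below. The three-country document is a shortened stand-in for the one in this issue, and `rank_of` is a helper introduced only for this sketch.

```python
import xml.etree.ElementTree as ET

XML = """<data>
  <country name="A"><rank>2</rank></country>
  <country name="B"><rank>69</rank></country>
  <country name="C"><rank>70</rank></country>
</data>"""

def rank_of(country):
    """Return the integer rank of a <country> element (sketch helper)."""
    return int(country.find("rank").text)

# Option 1: iterate over a copy of the children, remove from the original.
root = ET.fromstring(XML)
for country in list(root):
    if rank_of(country) > 50:
        root.remove(country)
kept_copy = [c.get("name") for c in root]

# Option 2: build the filtered list first, then replace all children at once
# with slice assignment.
root = ET.fromstring(XML)
root[:] = [c for c in root if rank_of(c) <= 50]
kept_slice = [c.get("name") for c in root]

print(kept_copy, kept_slice)  # ['A'] ['A']
```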
msg377734 - Author: (WoodyWoo) Date: 2020-10-01 02:34
Could I say that a mutable sequence contains not the objects themselves but something like pointers to them, as in C++?
That would explain why they can be changed inside functions (def).
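Roughly speaking, yes: Python names and list slots hold references to objects. A small sketch of what that means for functions follows; the function names `drop_high` and `rebind` are made up for this illustration.

```python
def drop_high(ranks, limit):
    # Mutates the caller's list in place: the parameter refers to the
    # same list object the caller passed in.
    ranks[:] = [r for r in ranks if r <= limit]

def rebind(ranks):
    # Rebinding the local name creates no change in the caller's list.
    ranks = []

data = [2, 5, 69]
alias = data                 # a second name for the same list object
drop_high(data, 50)
same_object = alias is data  # still one object, now with fewer items
after_mutation = list(data)
rebind(data)
after_rebind = list(data)

print(same_object, after_mutation, after_rebind)  # True [2, 5] [2, 5]
```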
msg377741 - Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-10-01 08:02
Closing as duplicate of issue 41892. Let's keep the discussion there.
History
Date  User  Action  Args
2022-04-11 14:59:36  admin  set  github: 86065
2020-10-01 08:02:07  scoder  set  messages: + msg377741
2020-10-01 08:01:41  scoder  set  messages: - msg377740
2020-10-01 08:00:16  scoder  set  status: open -> closed; nosy: + scoder; messages: + msg377740; resolution: duplicate; stage: resolved
2020-10-01 02:34:23  WoodyWoo  set  messages: + msg377734
2020-10-01 01:30:24  eric.smith  set  nosy: + eric.smith; messages: + msg377731
2020-10-01 01:29:21  WoodyWoo  set  messages: + msg377730
2020-10-01 01:11:46  WoodyWoo  set  messages: + msg377729
2020-10-01 00:25:23  WoodyWoo  set  messages: + msg377728
2020-10-01 00:08:35  WoodyWoo  create