classification
Title: The only supported minidom attribute iteration (NamedNodeMap) is O(n^2)
Type: performance Stage:
Components: XML Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: nthykier
Priority: normal Keywords:

Created on 2020-05-19 19:55 by nthykier, last changed 2020-05-19 19:55 by nthykier.

Messages (1)
msg369384 - Author: Niels Thykier (nthykier) Date: 2020-05-19 19:55
Hi,

The only officially supported way to iterate over all attributes of an element via the minidom XML API requires an O(n²) iteration.  This happens because the `item` method generates a list of all attribute names, looks up the attribute name at the given index, and then throws away the list (only to recompute it on the next lookup).
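
For illustration, here is a minimal sketch of the DOM-style loop in question. The document and attribute names are made up for the example; only `parseString`, `attributes`, `length` and `item` come from minidom.

```python
from xml.dom import minidom

# Hypothetical sample document, used only for this illustration.
doc = minidom.parseString('<root a="1" b="2" c="3"/>')
attrs = doc.documentElement.attributes  # a NamedNodeMap

# Supported DOM-style iteration: every item(i) call rebuilds the full
# list of attribute names, so the loop as a whole is quadratic in the
# number of attributes.
names = [attrs.item(i).name for i in range(attrs.length)]
```

With three attributes the cost is invisible, but it grows with the square of the attribute count.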

There is also a `.values()` method that looks very promising, but it comes with strings attached in the form of the following disclaimer:

"""There are also experimental methods that give this class more mapping behavior. [...]"""  
(source: https://docs.python.org/3/library/xml.dom.html)

The word "experimental" makes it hard for me to ask projects to migrate to `.values()` because I have to convince them to accept the risk of adopting the "unsupported APIs".
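
For comparison, a short sketch of the mapping-style iteration, assuming `.values()` keeps behaving as it does in current CPython (it returns the `Attr` nodes themselves). The sample document is again hypothetical.

```python
from xml.dom import minidom

# Hypothetical sample document, used only for this illustration.
doc = minidom.parseString('<root a="1" b="2" c="3"/>')
attrs = doc.documentElement.attributes

# values() hands back the Attr nodes in a single pass, so no per-index
# lookup (and no repeated list construction) is needed.
pairs = {node.name: node.value for node in attrs.values()}
```

This is the linear-time pattern I would like to be able to recommend, if it were documented as supported.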


For me, any of the following would solve the issue:

 * Bless the mapping-based API as supported to the same extent as `item` + `length` in the DOM API.

 * Optimize `item` to avoid the performance issue.

 * Provide an alternative (but supported) way of iterating over all attributes.  Preferably one that enables you to get the node directly without having to use `Element.getAttributeNode()` or similar.



The problematic code in the minidom API:

```
class NamedNodeMap(object):

[...]

    def item(self, index):
        try:
            return self[list(self._attrs.keys())[index]]
        except IndexError:
            return None
```
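
One possible direction for optimizing `item`, sketched on a hypothetical stand-in class rather than on the real `NamedNodeMap`: cache the key list and drop the cache whenever the mapping changes. The class name and its `_keys` attribute are assumptions made for this example. In minidom the underlying dictionary may also be modified by the owning element, so a real patch would need more careful invalidation.

```python
class CachedNamedNodeMap:
    """Hypothetical stand-in for NamedNodeMap; not the minidom class."""

    def __init__(self):
        self._attrs = {}   # attribute name -> node
        self._keys = None  # cached list of names, rebuilt lazily

    def __setitem__(self, name, node):
        self._attrs[name] = node
        self._keys = None  # any mutation invalidates the cache

    def __delitem__(self, name):
        del self._attrs[name]
        self._keys = None

    def __getitem__(self, name):
        return self._attrs[name]

    @property
    def length(self):
        return len(self._attrs)

    def item(self, index):
        # Build the key list at most once per mutation instead of once
        # per call, so a full item()/length loop becomes linear.
        if self._keys is None:
            self._keys = list(self._attrs)
        try:
            return self[self._keys[index]]
        except IndexError:
            return None


# Usage: plain strings stand in for attribute nodes here.
nodes = CachedNamedNodeMap()
for name in ("a", "b", "c"):
    nodes[name] = name.upper()
collected = [nodes.item(i) for i in range(nodes.length)]
```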
History
Date User Action Args
2020-05-19 19:55:58 nthykier create