Issue 40689: The only supported minidom attribute iteration (NamedNodeMap) is O(n^2)


This issue has been migrated to GitHub: https://github.com/python/cpython/issues/84866

classification

Title:	The only supported minidom attribute iteration (NamedNodeMap) is O(n^2)
Type:	performance	Stage:
Components:	XML	Versions:	Python 3.8

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	nthykier
Priority:	normal	Keywords:

Created on 2020-05-19 19:55 by nthykier, last changed 2022-04-11 14:59 by admin.

Messages (1)

msg369384

Author: Niels Thykier (nthykier)

Date: 2020-05-19 19:55

Hi,

The only officially supported way to iterate over all attributes of an element in the minidom XML API takes O(n²) time.  This happens because the `item` method generates a list of all attribute names, looks up the attribute name at the given index, and then throws away the list (only to recompute it on the next lookup).  Each call is therefore O(n), and calling it once per attribute makes the full iteration O(n²).
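To make the pattern concrete, here is a minimal sketch of the `item` + `length` loop described above. The helper name `iter_attrs_via_item` and the sample document are only for illustration; the cost comment assumes the `item` implementation quoted in this report.

```python
from xml.dom import minidom


def iter_attrs_via_item(element):
    """Yield each Attr node using only the DOM-style `length` and `item`."""
    attrs = element.attributes
    for i in range(attrs.length):
        # With the implementation quoted in this report, every item(i) call
        # rebuilds the full list of attribute names, so the loop does O(n)
        # work n times.
        yield attrs.item(i)


doc = minidom.parseString('<root a="1" b="2" c="3"/>')
names = [attr.name for attr in iter_attrs_via_item(doc.documentElement)]
```

With only a handful of attributes the cost is negligible; it only becomes noticeable on elements with many attributes.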

There is also a `.values()` method that looks very promising, but it comes with strings attached in the form of the following disclaimer:

"""There are also experimental methods that give this class more mapping behavior. [...]"""  
(source: https://docs.python.org/3/library/xml.dom.html)

The word "experimental" makes it hard for me to ask projects to migrate to `.values()`, because I have to convince them to accept the risk of adopting "unsupported" APIs.
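For comparison, a short sketch of what the `.values()` route looks like in practice. It relies on the mapping-style method that the documentation labels experimental, so treat it as an illustration rather than a recommendation; the sample document is made up.

```python
from xml.dom import minidom

doc = minidom.parseString('<root a="1" b="2" c="3"/>')

# values() hands back the Attr nodes directly, in a single pass over the map,
# so there is no per-index lookup and no getAttributeNode() call.
pairs = [
    (attr.name, attr.value)
    for attr in doc.documentElement.attributes.values()
]
```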


For me, any of the following would solve the issue:

 * Bless the mapping-based API as supported to the same extent as `item` + `length` in the DOM API.

 * Optimize `item` to avoid the performance issue.

 * Provide an alternative (but supported) way of iterating over all attributes.  Preferably one that enables you to get the node directly without having to use `Element.getAttributeNode()` or similar.



The problematic code in the minidom API:

```
class NamedNodeMap(object):

[...]

    def item(self, index):
        try:
            # list(...) copies every attribute name on each call: O(n) per lookup
            return self[list(self._attrs.keys())[index]]
        except IndexError:
            return None
```
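
As one possible shape for the "optimize `item`" option, here is a rough sketch that caches the list of names and rebuilds it only after a change. `CachedNameMap` is a hypothetical toy class, not minidom's `NamedNodeMap`; it assumes all mutations go through the map itself, which is not true in minidom, where the element shares and modifies the underlying dict.

```python
class CachedNameMap:
    """Toy stand-in for a named node map (hypothetical, not minidom code)."""

    def __init__(self, attrs):
        self._attrs = attrs   # dict mapping attribute name -> node
        self._names = None    # cached list of names, built lazily

    def __getitem__(self, name):
        return self._attrs[name]

    def __setitem__(self, name, node):
        self._attrs[name] = node
        self._names = None    # any mutation invalidates the cache

    def __delitem__(self, name):
        del self._attrs[name]
        self._names = None

    @property
    def length(self):
        return len(self._attrs)

    def item(self, index):
        # Build the name list once; later calls reuse it, so a full
        # item/length loop costs O(n) overall instead of O(n^2).
        if self._names is None:
            self._names = list(self._attrs)
        try:
            return self._attrs[self._names[index]]
        except IndexError:
            return None


m = CachedNameMap({"a": 1, "b": 2})
first = m.item(0)
missing = m.item(5)
```

A real fix would also have to invalidate the cache wherever the element changes its attributes, which this sketch leaves out.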

History
Date	User	Action	Args
2022-04-11 14:59:31	admin	set	github: 84866
2020-05-19 19:55:58	nthykier	create