classification
Title: minidom removeAttributeNode returns None
Type: enhancement Stage: resolved
Components: Library (Lib), XML Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: fdrake, iter, ned.deily, r.david.murray, serhiy.storchaka, terry.reedy
Priority: normal Keywords: patch

Created on 2018-04-13 18:36 by iter, last changed 2018-06-07 19:31 by iter. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 7465 merged iter, 2018-06-07 03:04
PR 6462 iter, 2018-06-07 03:23
Messages (19)
msg315253 - (view) Author: Ari Krupnik (iter) * Date: 2018-04-13 18:36
W3C DOM Level 1[1] requires removeAttributeNode() to return the removed node:

removeAttributeNode: Removes the specified attribute.
Return Value: The Attr node that was removed.

Minidom implementation returns None.

[1]https://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/level-one-core.html#method-removeAttributeNode
msg315544 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-04-21 00:31
It is standard in the Python stdlib that mutation methods usually return None and never echo an input argument.  If one can pass a node to element.removeAttributeNode(node), there is no need to echo it back.  So I suspect that the current behavior is intended.

David, is there a general (perhaps unwritten) rule about how Python translates such functions?

The other 'remove' methods also default to returning None.  Same for the 'set' methods.  All methods would need review before changing just one.
msg315546 - (view) Author: Ari Krupnik (iter) * Date: 2018-04-21 01:17
I guess the main question if whether minidom wants to adhere to the standard or be pythonic, and it's not up to me to decide, although personally I like standards.

The common use case for DOM functions returning the relevant nodes is for the caller to chain calls, e.g.,

e1.setAttributeNode(e0.removeAttributeNode(e0.getAttributeNode("a")))

instead of

a=e0.getAttributeNode("foo")
e0.removeAttributeNode(a)
e1.setAttributeNode(a)
msg315819 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2018-04-26 21:04
The rule in Python is that when a function mutates its argument it does *not* return a pointer to the object it just mutated, and likewise a method does *not* return a pointer to the parent object.  Usually this means a mutating function/method return None, but there are a number of exceptions (dict.pop(), for example).

If I understand correctly, in this case the method is returning the object that is its argument, and *that* object is not changed.  So I'd say this is an edge case.  Given the existence of a standard, I think it would probably be reasonable to implement it that way.  You could think of it as analogous to list.pop(N), although in that case the argument is not what is returned, but rather the value.  I think it is still analogous, though, since the way you identify the node to pop in this case is by its pointer.

In the absence of a standard I'd be inclined to say no, by the way.  I generally don't like chaining APIs :)  So, my word should not be considered final on this, it's just my opinion, and "don't echo an argument" is indeed a generally observed rule in Python APIs, as far as I can tell.
msg316480 - (view) Author: Fred L. Drake, Jr. (fdrake) (Python committer) Date: 2018-05-13 21:32
While I've no strong objection to updating to follow the specification, I also don't see any real value here.  The current minidom implementation has been considered sufficient for many years now (if you consider the DOM desirable at all), so the lack of conformance with the W3C spec doesn't seem significant in practice.
msg318526 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-06-03 08:00
Please add a test and document the new feature (add a versionchanged directive in the module documentation, add a news entry and an entry in What's New).
msg318880 - (view) Author: Ari Krupnik (iter) * Date: 2018-06-07 03:15
I added a test case and a News entry per serhiy.storchaka's request.

https://github.com/python/cpython/pull/7465

I agree with fdrake's concerns about DOM's usefulness. DOM is not very Pythonic. I note that as long as Python has a DOM implementation, it follows the spec fairly closely. Element.removeAttributeNode() is the exception. All other minidom mutators (unpythonically) return the values that the standard requires, including the sister mutator setAttributeNode: 

Node.insertBefore()
Node.appendChild()
Node.replaceChild()
Node.removeChild()

NamedNodeMap.removeNamedItem()
NamedNodeMap.removeNamedItemNS()
NamedNodeMap.setNamedItem()

Element.setAttributeNode()

Document.removeChild()
Document.renameNode()
msg318881 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-06-07 03:24
I unlinked bad PR 6462 (against 3.6). PR 7465 is properly formed and should be reviewed.
msg318883 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-06-07 03:27
2.7 does not get enhancements.  When a core dev removes a version, don't re-add without discussion.
msg318884 - (view) Author: Ari Krupnik (iter) * Date: 2018-06-07 03:32
My bad. This issue looks like a simple omission to me--albeit one that's been in the code a long time. My patch simply brings the code into compliance with what the documentation (including in 2.7) already says it does.
msg318886 - (view) Author: Fred L. Drake, Jr. (fdrake) (Python committer) Date: 2018-06-07 03:53
Python 2.7 is in security-fix-only mode, and this doesn't fit that.  While I wouldn't object to a note in the documentation, see my comments in my patch review (there's just no place for it to go).
msg318887 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-06-07 03:57
Fred:
> Python 2.7 is in security-fix-only mode

That's not quite the case yet.  We are still accepting bug-fixes for 2.7 although its end-of-life *is* approaching!

https://devguide.python.org/#status-of-python-branches
msg318888 - (view) Author: Fred L. Drake, Jr. (fdrake) (Python committer) Date: 2018-06-07 04:02
I should stop relying on wetware memory; it's not working out.  Sorry for the mis-information.
msg318889 - (view) Author: Fred L. Drake, Jr. (fdrake) (Python committer) Date: 2018-06-07 04:42
New changeset 5bfa058e65897567889354d7eb34af2b93a20f18 by Fred Drake (arikrupnik) in branch 'master':
bpo-33274: Compliance with DOM L1: return removed attribute (#7465)
https://github.com/python/cpython/commit/5bfa058e65897567889354d7eb34af2b93a20f18
msg318897 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-06-07 06:17
Ari, I know that you viewed this as a behavior (bugfix) issue from the start (message 1).  I changed to enhancement (feature request) in message 2 because, as you acknowledged in message 3, it is an edge case.  The problem is classification can be fuzzy, while the action of merge or not is binary.  I prefer enhancement as the initial classification for edge cases because (except for 2.7-only issues) we first need to decide whether a change belongs in master branch before deciding to backport.  Edge cases sometimes make for messy disposition.

If this is viewed as a bugfix, it can (but not 'must') be backported, first to 3.7, second to 3.6, and only third to 2.7.  2.7 is *mainly* but not exclusively still being maintained for security and build issues, with bugfixes at the discretion and judgment of the merge committer.  I personally would only backport to 3.7.0, maybe but likely not to 3.6.6, and definitely not to 2.7, as I would see it as not helping 2.7 users and likely not 3.6 users.  

If this is viewed as an enhancement, then it should at least have, as Serhiy noted, a version-changed notice in the doc.

Since Fred merged this without such a notice, but did not backport, he effectively treated this as a master-branch only bugfix.  We sometime do this, such as with some exception messages.  It would be fine with me to close it as such.  

Or we could backport this to 3.7.0rc1 (but not 3.7.1) and call this a x.y.0-only bugfix.  Either way, I see the reason as being the same as for not backporting exception message changes to maintenance releases: the bugfix is not worth introducing dependence on a particular maintenance release.
msg318943 - (view) Author: Ari Krupnik (iter) * Date: 2018-06-07 14:38
I feel a little bit like I wandered into a card game whose rules I didn't understand. I'm just a lay, mortal user. I've been writing Python for 15 years, first time I saw an opportunity to contribute back. I saw what looked to me like a bug that's been in the code for 18 years, and I saw that it was a simple fix. You want a News entry? I'm happy to write a news entry. You want a full stop at the end? I'm happy to make that commit. You want me to write something in the doc? I'm happy to do that. What you do with my contribution is your call. I don't make the rules, like I said, I don't even understand them so well.


When I first found this bug, I saw it as a very low-risk fix in terms of API change. It seems less likely that someone somewhere has code that depends on this function that always returning None. Python makes a very hard distinction between statements and expressions, and like you're saying, the Pythonic assumption is that a mutator is a statement whose return value one doesn't check. If this function returned a value, but the value were different from the spec, it would be a higher-risk change.


As a side note, I'm a 2.7 user, so would benefit from backporting this fix.


All this said, you're the maintainer, I'm the user, you don't have to justify your decisions to me. If you decide against backporting, I encourage you to update the documentation in earlier versions to state explicitly that this one mutator in the module diverges from the standard which the module otherwise implements faithfully.
msg318953 - (view) Author: Fred L. Drake, Jr. (fdrake) (Python committer) Date: 2018-06-07 16:14
> I saw what looked to me like a bug that's been in the code for 18 years,
> and I saw that it was a simple fix.

And you're right:  It is a bug, the fix is simple, and the risk is low.

Ten years ago, I'd have probably just landed the fix on all applicable branches and moved on.  But the longer something stays the same (like you said, for 18 years now), the more surprising a fix can be when something (however foolishly and unlikely it seems) relied on the old, broken behavior falls apart because of the change.

Please don't think we don't appreciate your contribution; we do.  I hope you'll continue to be observant and catch bugs (even little ones!), and provide fixes.

Feel free to reach out if you'd like to discuss this further.
msg318960 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-06-07 18:46
Let me clarify part of what I wrote.  If we make this change in 2.7.16, then any code that uses the return value will not run on anything previous.  This may not matter for you, but would be bad for public code.  This cost is present in all bug fixes, but usually there is the benefit that existing code will run better without being changed, or that it becomes possible to get certain results that were not possible before.  This code does not have those benefits.
msg318965 - (view) Author: Ari Krupnik (iter) * Date: 2018-06-07 19:31
In retrospect, I wish I had submitted this as a documentation change, simply acknowledging this function's behavior.

"In most computer projects there comes a day when it is discovered that the machine and the manual don't agree. When the confrontation follows, the manual usually loses, for it can be changed far more quickly and cheaply than the machine." -- Fred Brooks, MMM, 1974
History
Date User Action Args
2018-06-07 19:31:17itersetmessages: + msg318965
2018-06-07 18:46:16terry.reedysetmessages: + msg318960
2018-06-07 16:14:20fdrakesetstatus: open -> closed
resolution: fixed
messages: + msg318953

stage: patch review -> resolved
2018-06-07 14:38:32itersetmessages: + msg318943
2018-06-07 06:17:25terry.reedysetmessages: + msg318897
2018-06-07 04:42:45fdrakesetmessages: + msg318889
2018-06-07 04:02:38fdrakesetmessages: + msg318888
2018-06-07 03:57:53ned.deilysetnosy: + ned.deily
messages: + msg318887
2018-06-07 03:53:49fdrakesetmessages: + msg318886
2018-06-07 03:32:18itersetmessages: + msg318884
2018-06-07 03:27:15terry.reedysetmessages: + msg318883
versions: - Python 2.7
2018-06-07 03:24:35terry.reedysetmessages: + msg318881
2018-06-07 03:23:32itersetpull_requests: + pull_request7088
2018-06-07 03:20:27terry.reedysetpull_requests: - pull_request6157
2018-06-07 03:19:53itersetpull_requests: - pull_request7086
2018-06-07 03:15:26itersetmessages: + msg318880
versions: + Python 2.7
2018-06-07 03:04:47itersetpull_requests: + pull_request7087
2018-06-07 02:41:15itersetkeywords: + patch
stage: test needed -> patch review
pull_requests: + pull_request7086
2018-06-03 08:00:38serhiy.storchakasetnosy: + serhiy.storchaka

messages: + msg318526
stage: test needed
2018-05-13 21:32:09fdrakesetnosy: + fdrake
messages: + msg316480
2018-04-26 21:04:14r.david.murraysetmessages: + msg315819
2018-04-21 01:17:50itersetmessages: + msg315546
2018-04-21 00:31:15terry.reedysetversions: - Python 2.7
nosy: + r.david.murray, terry.reedy

messages: + msg315544

type: behavior -> enhancement
2018-04-13 18:37:43itersettype: behavior
2018-04-13 18:36:43itercreate