Classification
Title: Improve ElementPath
Type: enhancement
Stage: resolved
Components: Library (Lib), XML
Versions: Python 3.7

Process
Status: closed
Resolution: fixed
Dependencies: (none)
Superseder: (none)
Assigned To: (none)
Nosy List: eli.bendersky, scoder, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2017-09-30 08:16 by scoder, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL       Status   Linked
PR 3835   merged   scoder, 2017-09-30 08:31
Messages (7)
msg303400 - Author: Stefan Behnel (scoder) (Python committer) Date: 2017-09-30 08:16
* Allow whitespace around predicate parts, i.e. "[a = 'text']" instead of requiring the less readable "[a='text']".

* Add support for text comparison of the current node, like "[.='text']".

Both currently raise "invalid path" exceptions. PR coming.
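
The two additions can be sketched as follows. This is a minimal illustration, assuming Python 3.7 or later (the version this issue targets); the sample document and variable names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this illustration.
doc = ET.XML(
    "<catalog>"
    "<item><name>widget</name></item>"
    "<item><name>gadget</name></item>"
    "<item><name>widget</name></item>"
    "</catalog>"
)

# Compact predicate form, which was accepted before this change as well.
compact = doc.findall("item[name='widget']")

# The same predicate with whitespace around its parts.
spaced = doc.findall("item[name = 'widget']")

# Text comparison against the current node itself.
names = doc.findall(".//name[.='widget']")

print(len(compact), len(spaced), len(names))
```

On versions without this change, the second and third queries are the ones that get rejected.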
msg303402 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2017-09-30 08:54
Is this feature already implemented in lxml? Is it a part of some wider standard?
msg303403 - Author: Stefan Behnel (scoder) (Python committer) Date: 2017-09-30 09:03
Well, there's XPath for a standard:
https://www.w3.org/TR/xpath/

ElementPath deviates from it in its namespace syntax (it allows "{ns}tag" where XPath requires "p:tag" prefixes), but that's about it. All other differences are basically needless limitations of ElementPath.

In fact, I had noticed these two limitations in lxml, so I removed them there for the next release. And since ElementPath in ElementTree is still mostly the same as ElementPath in lxml, here's the same change for ET.
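
The namespace difference mentioned above can be illustrated with a short sketch. The namespace URI, the "p" prefix, and the sample document are hypothetical; the `namespaces` argument of `findall()` is the existing ElementTree way to resolve prefixes.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com/ns"  # hypothetical namespace URI
doc = ET.XML(f'<root xmlns="{NS}"><tag>one</tag><tag>two</tag></root>')

# ElementPath's own form: the namespace URI in braces before the tag name.
braced = doc.findall(f"{{{NS}}}tag")

# XPath-style prefix form, resolved through an explicit prefix mapping.
prefixed = doc.findall("p:tag", namespaces={"p": NS})

print([elem.text for elem in braced])
print([elem.text for elem in prefixed])
```

Both queries select the same two child elements; only the spelling of the namespace differs.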
msg303404 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2017-09-30 09:39
I think the break in the loop for [.='text'] is not correct.

>>> from xml.etree import ElementTree as ET
>>> e = ET.XML('<root><a><b>text</b></a><a><b></b></a><a><b>text</b></a></root>')
>>> list(e.findall('.//a[b="text"]'))
[<Element 'a' at 0x7ffadb305d58>, <Element 'a' at 0x7ffadb305f58>]
>>> list(e.findall('.//a[.="text"]'))
[<Element 'a' at 0x7ffadb305d58>]

I expect findall() to find all matching elements, not just the first one. Both of the above queries should return the same result.
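
The expectation can be written down as a small check. This is only a sketch using the document from this message, assuming a Python version that includes the final fix (3.7 or later); it is not necessarily the test that was added to the pull request.

```python
from xml.etree import ElementTree as ET

e = ET.XML('<root><a><b>text</b></a><a><b></b></a><a><b>text</b></a></root>')

# Select <a> elements that have a <b> child whose text is "text".
by_child = e.findall('.//a[b="text"]')

# Select <a> elements whose own complete text content is "text".
by_self = e.findall('.//a[.="text"]')

# Both queries should yield the first and the third <a>, in document order.
print(len(by_child), len(by_self))
```

For this document the two queries agree because each matching <a> contains nothing but the text of its single <b> child.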
msg303405 - Author: Stefan Behnel (scoder) (Python committer) Date: 2017-09-30 09:50
Thanks for noticing. I added a test and fixed it.
msg303408 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2017-09-30 13:35
New changeset 101a5e84acbab9d880e150195f23185dfb5449a9 by Serhiy Storchaka (scoder) in branch 'master':
bpo-31648: Improve ElementPath (#3835)
https://github.com/python/cpython/commit/101a5e84acbab9d880e150195f23185dfb5449a9
msg303409 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2017-09-30 13:36
Thank you for your contribution Stefan! Good improvement.
History
Date                 User              Action  Args
2022-04-11 14:58:53  admin             set     github: 75829
2017-09-30 13:36:57  serhiy.storchaka  set     status: open -> closed; resolution: fixed; messages: + msg303409; stage: patch review -> resolved
2017-09-30 13:35:26  serhiy.storchaka  set     messages: + msg303408
2017-09-30 09:50:16  scoder            set     messages: + msg303405
2017-09-30 09:39:00  serhiy.storchaka  set     messages: + msg303404
2017-09-30 09:03:24  scoder            set     messages: + msg303403
2017-09-30 08:54:02  serhiy.storchaka  set     messages: + msg303402
2017-09-30 08:31:52  scoder            set     keywords: + patch; stage: patch review; pull_requests: + pull_request3816
2017-09-30 08:16:42  scoder            create