Author karlcow
Recipients karlcow
Date 2020-10-21.02:54:16
Message-id <1603248856.3.0.186912878202.issue42104@roundup.psfhosted.org>
Content
XPath 1.0 provides the function contains():

> Function: boolean contains(string, string)
> The contains function returns true if the first argument string contains the second argument string, and otherwise returns false.

Source: https://www.w3.org/TR/1999/REC-xpath-19991116/#function-contains
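
As a rough Python model of that definition (the helper name `xpath_contains` is hypothetical and not part of any library), contains() behaves like a plain substring test. That means it also matches a value such as "doc2", which matters when the attribute holds several space-separated values:

```python
def xpath_contains(haystack: str, needle: str) -> bool:
    """Model of XPath 1.0 contains(): a plain substring test."""
    return needle in haystack


print(xpath_contains("doc test", "doc"))   # True
print(xpath_contains("doc2 test", "doc"))  # True: substring match, not whole-value match
print(xpath_contains("test", "doc"))       # False
```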


```
<body>
   <p class="doc">One attribute: doc</p>
   <p class="doc test">Two Attributes: doc test</p>
   <p class="test">One Attribute: test</p>
</body>
```

With lxml, we can currently select on an exact attribute value:

```
>>> from lxml import etree
>>> root = etree.fromstring("""<body>
...    <p class="doc">One attribute: doc</p>
...    <p class="doc test">Two Attributes: doc test</p>
...    <p class="test">One Attribute: test</p>
... </body>
... """)
>>> elts = root.xpath("//p[@class='doc']")
>>> elts, etree.tostring(elts[0])
([<Element p at 0x102670900>], b'<p class="doc">One attribute: doc</p>\n   ')
```


One way to extract the two elements whose class attribute contains the value "doc" is XPath's contains():


```
>>> root.xpath("//p[contains(@class, 'doc')]")
[<Element p at 0x1026708c0>, <Element p at 0x102670780>]
>>> [etree.tostring(elt) for elt in root.xpath("//p[contains(@class, 'doc')]")]
[b'<p class="doc">One attribute: doc</p>\n   ', b'<p class="doc test">Two Attributes: doc test</p>\n   ']
```


In Python 3.10, xml.etree offers no easy way to extract all elements whose multi-valued attribute contains a "doc" value, even though such attributes are quite common in HTML.


```
>>> import xml.etree.ElementTree as ET
>>> root = ET.fromstring("""<body>
...    <p class="doc">One attribute: doc</p>
...    <p class="doc test">Two Attributes: doc test</p>
...    <p class="test">One Attribute: test</p>
... </body>"""
... )
>>> root.xpath("//p[contains(@class, 'doc')]")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'xpath'
```
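
For comparison, here is a minimal sketch of a manual workaround that uses only xml.etree. The helper name `elements_with_class` is hypothetical, and the sketch assumes the attribute values are whitespace-separated. Unlike contains(), it matches whole values, so "doc2" would not match "doc":

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<body>
   <p class="doc">One attribute: doc</p>
   <p class="doc test">Two Attributes: doc test</p>
   <p class="test">One Attribute: test</p>
</body>""")


def elements_with_class(root, tag, value):
    """Return `tag` elements whose class attribute lists `value` as one of its values."""
    return [elt for elt in root.iter(tag) if value in elt.get("class", "").split()]


matches = elements_with_class(root, "p", "doc")
print([elt.text for elt in matches])
# ['One attribute: doc', 'Two Attributes: doc test']
```

This works, but it has to be written by hand for each query, which is what a contains() predicate would avoid.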