Message252150
the example you give for <li> is a different case.
<img>, <link> are void elements which are allowed to have no close tag;
<li> without </li> is a browser implementation detail, most browser
autocompletes </li>.
Without the parser calls the handle_endtag(), the client code which uses
HTMLParser won't be able to know whether the a traversal is finished.
Do you have a strong reason why we should include the knowledge of void
elements into the HTMLParser at this line?
https://github.com/python/cpython/blob/bdfb14c688b873567d179881fc5bb67363a6074c/Lib/html/parser.py#L341
if end.endswith('/>') or (end.endswith('>') and tag in VOID_ELEMENTS)
On Wed, Sep 30, 2015 at 7:05 PM, Martin Panter <report@bugs.python.org>
wrote:
>
> Martin Panter added the comment:
>
> My thinking is that the knowledge that <img> does not have a closing tag
> is at a higher level than the current HTMLParser class. It is similar to
> knowing where the following HTML implicitly closes the <li> elements:
>
> <ul><li>Item A<li>Item B</ul>
>
> In both cases I would not expect the HTMLParser to report “virtual” empty
> or closing tags. I don’t think it should report an empty <img/> or closing
> </img> tag just because that is easy to do, because it would be
> inconsistent with other implied HTML tags. But maybe see what other people
> say.
>
> I don’t know your particular use case, but I would suggest if you need to
> parse non-XML HTML <img> tags, use the handle_starttag() method and don’t
> rely on the end tag :)
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue25258>
> _______________________________________
> |
|
Date |
User |
Action |
Args |
2015-10-02 19:18:01 | Chenyun Yang | set | recipients:
+ Chenyun Yang, ezio.melotti, martin.panter, josh.r, xiang.zhang |
2015-10-02 19:18:01 | Chenyun Yang | link | issue25258 messages |
2015-10-02 19:18:00 | Chenyun Yang | create | |
|