Message384256
Ezio,
TL;DR: I tested this in browsers and added two tests for this issue.
Should I create a PR just for the tests?
https://github.com/python/cpython/blame/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Lib/test/test_htmlparser.py#L479-L485
A: comma without spaces
-----------------------
Tests for browsers:
data:text/html,<!doctype html><div class=bar,baz=asd>text</div>
Serializations:
* Firefox, Gecko (86.0a1 (2020-12-28) (64-bit))
* Edge, Blink (Version 89.0.752.0 (Version officielle) Canary (64 bits))
* Safari, WebKit (Release 117 (Safari 14.1, WebKit 16611.1.7.2))
Same serialization in all 3 rendering engines:
<div class="bar,baz=asd">text</div>
Adding:
    def test_comma_between_unquoted_attributes(self):
        # bpo 41748
        self._run_check('<div class=bar,baz=asd>',
                        [('starttag', 'div', [('class', 'bar,baz=asd')])])
❯ ./python.exe -m test -v test_htmlparser
…
test_comma_between_unquoted_attributes (test.test_htmlparser.HTMLParserTestCase) ... ok
…
Ran 47 tests in 0.168s
OK
== Tests result: SUCCESS ==
1 test OK.
Total duration: 369 ms
Tests result: SUCCESS
So this is working as expected for the first test.
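To reproduce case A outside the test suite, here is a minimal standalone sketch. `EventCollector` and `collect_events` are hypothetical names used only for illustration (they are not the helpers from test_htmlparser.py); the sketch only assumes the public `html.parser.HTMLParser` API.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: records parser callbacks as tuples."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("starttag", tag, attrs))

    def handle_data(self, data):
        self.events.append(("data", data))


def collect_events(source):
    """Feed `source` to a fresh parser and return the recorded events."""
    parser = EventCollector()
    parser.feed(source)
    parser.close()  # flush anything the parser is still buffering
    return parser.events


# Case A: the comma is part of a single unquoted attribute value.
print(collect_events("<div class=bar,baz=asd>"))
```

If the parser agrees with the browsers, this should print a single start tag whose `class` value is `bar,baz=asd`.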
B: comma with spaces
--------------------
Tests for browsers:
data:text/html,<!doctype html><div class=bar ,baz=asd>text</div>
Serializations:
* Firefox, Gecko (86.0a1 (2020-12-28) (64-bit))
* Edge, Blink (Version 89.0.752.0 (Version officielle) Canary (64 bits))
* Safari, WebKit (Release 117 (Safari 14.1, WebKit 16611.1.7.2))
Same serialization in all 3 rendering engines:
<div class="bar" ,baz="asd">text</div>
Adding:
    def test_comma_with_space_between_unquoted_attributes(self):
        # bpo 41748
        self._run_check('<div class=bar ,baz=asd>',
                        [('starttag', 'div', [
                            ('class', 'bar'),
                            (',baz', 'asd')])])
❯ ./python.exe -m test -v test_htmlparser
This is failing.
======================================================================
FAIL: test_comma_with_space_between_unquoted_attributes (test.test_htmlparser.HTMLParserTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/karl/code/cpython/Lib/test/test_htmlparser.py", line 493, in test_comma_with_space_between_unquoted_attributes
    self._run_check('<div class=bar ,baz=asd>',
  File "/Users/karl/code/cpython/Lib/test/test_htmlparser.py", line 95, in _run_check
    self.fail("received events did not match expected events" +
AssertionError: received events did not match expected events
Source:
'<div class=bar ,baz=asd>'
Expected:
[('starttag', 'div', [('class', 'bar'), (',baz', 'asd')])]
Received:
[('data', '<div class=bar ,baz=asd>')]
----------------------------------------------------------------------
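Case B can be checked the same way without running the whole test suite. This is a minimal sketch: `Recorder`, `record`, and `browser_like` are hypothetical names, and `browser_like` simply encodes the browser serialization reported above. The output is deliberately not asserted, since it depends on the Python build being tested.

```python
from html.parser import HTMLParser


class Recorder(HTMLParser):
    """Hypothetical helper: records start tags and text data."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("starttag", tag, attrs))

    def handle_data(self, data):
        self.events.append(("data", data))


def record(source):
    """Parse `source` and return the list of recorded events."""
    parser = Recorder()
    parser.feed(source)
    parser.close()  # force the parser to emit any buffered input
    return parser.events


source = "<div class=bar ,baz=asd>"
# What the three browser engines produce for this input.
browser_like = [("starttag", "div", [("class", "bar"), (",baz", "asd")])]

events = record(source)
print("received:", events)
print("matches browsers:", events == browser_like)
```

On the build used above, the received events are a single `data` event containing the whole tag, so the comparison prints `False`.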
I started looking into the code of parser.py, which I'm not familiar with (yet).
https://github.com/python/cpython/blob/63298930fb531ba2bb4f23bc3b915dbf1e17e9e1/Lib/html/parser.py#L42-L52
Do you have a suggestion to fix it?

Date | User | Action | Args
2021-01-03 08:22:14 | karlcow | set | recipients: + karlcow, vstinner, ezio.melotti, nowasky.jr
2021-01-03 08:22:14 | karlcow | set | messageid: <1609662134.54.0.633908039041.issue41748@roundup.psfhosted.org>
2021-01-03 08:22:14 | karlcow | link | issue41748 messages
2021-01-03 08:22:14 | karlcow | create |