Title: sgmllib.parse_endtag() is not respecting quoted text
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 2.7
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To:
Nosy List: Michael.Brooks, ezio.melotti, iritkatriel
Priority: normal
Keywords:

Created on 2010-12-01 21:21 by Michael.Brooks, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files: attachment uploaded by Michael.Brooks, 2010-12-01 21:21
Messages (3)
msg123016 - Author: Michael Brooks (Michael.Brooks) Date: 2010-12-01 21:21
The attached example is a very simple use of sgmllib that tries to parse:
<input value="><a href=http://bug>link</a>">

The bug is that sgmllib parses this href as a separate tag. Browsers, on the other hand, treat it as part of the input's value attribute.

Also keep in mind that escaping of quote marks in HTML does not work as it does in Python. \" is not an escaped " character, so <input value="\"><a href=http://bug>link</a>"> is still quoted text and the href should not be parsed.

Thank you
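
To make the expected behavior concrete, here is a minimal sketch of a quote-aware scan for the end of a tag, compared with a naive search for the first ">". The helper name find_tag_end is hypothetical and is not part of sgmllib; the sketch also assumes that any quote character inside a tag opens a quoted attribute value, which is a simplification of real HTML tokenizing.

```python
def find_tag_end(html, start=0):
    """Return the index of the '>' that closes the tag beginning at `start`.

    Any '>' inside a single- or double-quoted attribute value is skipped.
    Returns -1 if no closing '>' is found. (Hypothetical helper, not sgmllib.)
    """
    quote = None  # the quote character we are currently inside, if any
    for i in range(start, len(html)):
        ch = html[i]
        if quote:
            # Inside a quoted value: only the matching quote ends it.
            # A backslash has no special meaning in HTML.
            if ch == quote:
                quote = None
        elif ch in ('"', "'"):
            quote = ch
        elif ch == '>':
            return i
    return -1


doc = '<input value="><a href=http://bug>link</a>">'

naive_end = doc.find('>')        # stops at the '>' inside the quoted value
quoted_end = find_tag_end(doc)   # stops at the '>' that really ends the tag

print(doc[:naive_end + 1])
print(doc[:quoted_end + 1])
```

With the naive search the tag is cut off inside the quoted value, which is the kind of split described in this report; the quote-aware scan treats the whole string as one input tag.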
msg123017 - Author: Michael Brooks (Michael.Brooks) Date: 2010-12-01 21:34
Oops, I made a mistake in my bug report.
In <input value="\"><a href=http://bug>link</a>"> the quote is not escaped, and therefore the href should be parsed in this case, but not in the attached example.
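
The two cases can be compared side by side with a short sketch. It uses html.parser.HTMLParser from the Python 3 standard library as a stand-in, on the assumption that sgmllib itself is not available to run; the TagCollector class and start_tags function are names made up for this illustration.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record every start tag and its attributes (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, attrs))


def start_tags(markup):
    """Parse `markup` and return the list of (tag, attrs) start tags seen."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags


# Case 1: the '>' and the <a ...> are inside the quoted value.
quoted = '<input value="><a href=http://bug>link</a>">'
# Case 2: the backslash does not escape the quote, so the value is just "\".
backslash = '<input value="\\"><a href=http://bug>link</a>">'

print(start_tags(quoted))
print(start_tags(backslash))
```

Under these assumptions the first input should produce a single input start tag whose value contains the whole link markup, while the second should produce an input tag followed by an a tag with the href.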
msg391951 - Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-04-26 17:17
sgmllib was removed in Python 3.
Date                 User              Action  Args
2022-04-11 14:57:09  admin             set     github: 54808
2021-04-26 17:17:35  iritkatriel       set     status: open -> closed; nosy: + iritkatriel; messages: + msg391951; resolution: out of date; stage: resolved
2014-11-18 15:03:14  serhiy.storchaka  set     nosy: + ezio.melotti; components: + Library (Lib), - None; versions: + Python 2.7, - Python 2.6
2010-12-01 21:34:29  Michael.Brooks    set     messages: + msg123017
2010-12-01 21:21:45  Michael.Brooks    create