Issue1063229
Created on 2004-11-09 16:54 by ezust, last changed 2009-04-22 16:03 by ajaksu2.
|
msg23075 |
Author: Alan Ezust (ezust) |
Date: 2004-11-09 16:54 |
|
When SGMLParser encounters a badly formed web page, it
raises sgmllib.SGMLParseError with a cryptic message:
[bin] > python sgmlparsertest.py
Pythonlib's error message: expected name token
I think the exception should carry the line and offset, and
maybe even the text it had problems with, in its args, and
include them in the printed message.
I tried to print this information myself using parser.getpos():
My extra information: error at line 1 offset 2
<head>
^
[bin]
However, getpos() returns values that do not correspond to the
location of the error. How can I determine the location at runtime?
A test case that reproduces the problem is attached.
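To illustrate the request above, here is a minimal sketch of an exception that carries the line, offset, and offending text in its args and includes them in its message. The class name `PositionedParseError` and its attributes are hypothetical and are not part of sgmllib; this only shows the shape of the idea, not how the library implements it.

```python
class PositionedParseError(Exception):
    """Hypothetical parse error that records where the problem occurred."""

    def __init__(self, message, lineno, offset, text):
        # Passing every value to Exception.__init__ populates self.args,
        # so callers can unpack the position without parsing the message.
        super().__init__(message, lineno, offset, text)
        self.message = message
        self.lineno = lineno
        self.offset = offset
        self.text = text

    def __str__(self):
        # Render the position and the offending text alongside the message.
        return "%s at line %d, offset %d: %r" % (
            self.message, self.lineno, self.offset, self.text)
```

A caller could then catch the exception and read `err.lineno`, `err.offset`, and `err.text` directly, or simply print `err` to get all three in one message.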
|
|
msg23076 |
Author: Alan Ezust (ezust) |
Date: 2004-11-09 16:55 |
|
import sgmllib, urllib
from sgmllib import SGMLParser

if __name__ == "__main__":
    url = "http://www.cs.uvic.ca/~gshoja/"
    parser = SGMLParser()
    data = urllib.urlopen(url).read()
    try:
        parser.feed(data)
    except sgmllib.SGMLParseError, ex:
        print "Pythonlib's error message: " + str(ex)
        # getpos() returns a 1-based line number and a 0-based column offset.
        line, offset = parser.getpos()
        lines = parser.rawdata.split("\n")
        print "My extra information: error at line %d offset %d" % (line, offset)
        # Convert the 1-based line number to a list index.
        print lines[line - 1]
        # Pad so the caret lands under the 0-based column.
        print "%*s" % (offset + 1, "^")
    parser = None
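The caret display attempted in the test case can be sketched as a small standalone helper. The function name `caret_context` is hypothetical. It assumes the position follows the `getpos()` convention of a 1-based line number and a 0-based column offset, and it does not address whether the reported position actually matches the error.

```python
def caret_context(rawdata, lineno, offset):
    """Return the source line at `lineno` with a caret under column `offset`.

    Assumes `lineno` is 1-based and `offset` is 0-based.
    """
    lines = rawdata.split("\n")
    # Convert the 1-based line number to a 0-based list index.
    source_line = lines[lineno - 1]
    # Pad with `offset` spaces so the caret sits under that column.
    return source_line + "\n" + " " * offset + "^"
```

For example, `caret_context(data, 2, 1)` returns the second line of `data` followed by a line with the caret under its second character.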
|
|
msg86303 |
Author: Daniel Diniz (ajaksu2) |
Date: 2009-04-22 16:03 |
|
Closing: the message now includes the problematic text. The
output in both 2.5 and trunk is:
Pythonlib's error message: expected name token at '<!<img src="image/at'
|
|
| Date | User | Action | Args |
| 2009-04-22 16:03:37 | ajaksu2 | set | status: open -> closed; nosy: + ajaksu2; messages: + msg86303; resolution: out of date; stage: test needed -> committed/rejected |
| 2009-02-14 21:57:43 | ajaksu2 | set | stage: test needed; type: feature request; versions: + Python 2.7, - Python 2.3 |
| 2004-11-09 16:54:37 | ezust | create | |