classification
Title: HTMLParser.py - more robust SCRIPT tag parsing
Type: behavior Stage: committed/rejected
Components: Library (Lib) Versions: Python 3.3, Python 3.2, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ezio.melotti Nosy List: Hunanyan, Matt.Basta, eric.araujo, ezio.melotti, fantoozler, friday, georg.brandl, gsf, momat, orsenthil, pitrou, python-dev, r.david.murray, yotam
Priority: normal Keywords: needs review, patch

Created on 2003-01-19 14:07 by fantoozler, last changed 2011-11-01 12:18 by ezio.melotti. This issue is now closed.

Files
File name Uploaded Description
patch-02.txt fantoozler, 2003-01-19 14:09
patch-test-cdata.txt georg.brandl, 2008-07-20 11:32 test suite update
minify.py cpalmer, 2008-12-02 02:06
lt-in-script-example.tgz yotam, 2010-09-30 21:50 Example of HTMLParser failure
HTMLParser.diff yotam, 2010-09-30 21:52 Patch with suggested fix
endtag-space.html yotam, 2011-01-02 20:46 Example of HTMLParser failure
dollar-extra.html yotam, 2011-01-02 20:48 Example of HTMLParser failure
ltscr-endtag-dollarext.diff yotam, 2011-01-02 20:50 Patch with suggested fix
cdata_patch.diff friday, 2011-03-08 11:00
cdata_patch.diff friday, 2011-03-08 11:28
hp_fix.diff Matt.Basta, 2011-07-27 03:24 Patch containing updated version of Alexander's patch and tests
issue670664.diff ezio.melotti, 2011-10-30 11:40
Messages (39)
msg42474 - (view) Author: j paulson (fantoozler) Date: 2003-01-19 14:07
http://www.ebay.com contains a script element of the form

<SCRIPT>
...
   vbscript += "</SCR"+"IPT> \n";
...
</SCRIPT>

which is not enclosed in "<!-- ... -->" comments.  The parser 
choked on that line, reporting a malformed end tag.

The changes are:

  interesting_cdata is now a dict mapping each start tag to
    an re matching the end tag, a "<!--" or \Z

  HTMLParser.set_cdata_mode takes an extra argument, 
    the start tag
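
The change described above can be sketched roughly as follows. This is a minimal illustration and not the contents of patch-02.txt: the helper name `build_interesting_cdata` and the exact regular expressions are assumptions, and the comment opener is taken to be "<!--".

```python
import re


def build_interesting_cdata(tags=("script", "style")):
    """Map each CDATA start tag to a regex for whatever may end its content.

    Hypothetical helper: every pattern matches the element's own end tag,
    a comment opener, or the end of the buffer.
    """
    return {
        tag: re.compile(r"</%s\s*>|<!--|\Z" % re.escape(tag), re.IGNORECASE)
        for tag in tags
    }


interesting_cdata = build_interesting_cdata()

# The split "</SCR"+"IPT>" string is skipped; only the real end tag matches.
body = 'vbscript += "</SCR"+"IPT> \\n";\n</SCRIPT>'
match = interesting_cdata["script"].search(body)
print(match.group())
```

A parser using such a table would look up the pattern by the tag that switched it into CDATA mode, which is why set_cdata_mode needs the start tag as an argument.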
msg42475 - (view) Author: j paulson (fantoozler) Date: 2003-01-25 03:58
Found regression test, used it, found error, fixed it.
msg42476 - (view) Author: Fred L. Drake, Jr. (fdrake) (Python committer) Date: 2003-01-28 22:24
From python-dev:

John Paulson wrote:
>     [...]  A side-effect of this is that
>     any "<!--" .. "-->" within a script/style will
>     be parsed as a comment.  If that behavior is
>     incorrect, the regex can be modified.

Jerry Williams wrote:
Does this mean that the following won't work:

  <SCRIPT language="JavaScript">
    <!-- //
    some-javascript-code
    // -->
  </SCRIPT>

That could be a problem, since this is commonly used
to support browsers that don't understand <SCRIPT>.

See:
http://mail.python.org/pipermail/python-dev/2003-January/032482.html
msg42477 - (view) Author: j paulson (fantoozler) Date: 2003-01-28 22:35
You will get a sequence of:
  handle_starttag("script")
  handle_comment("some-javascript-code")
  handle_endtag("script")

whereas before the sequence was:
  handle_starttag("script")
  handle_data("<!-- ... some-javascript-code ... //-->")
  handle_endtag("script")
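
To check which of the two sequences a given interpreter produces, the handler calls can be recorded with a small subclass. This is a sketch for Python 3's `html.parser` module; the class name `EventLogger` is made up for illustration, and the result depends on the Python version, so no expected output is shown.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Record every handler call as a (kind, value) tuple."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("starttag", tag))

    def handle_endtag(self, tag):
        self.events.append(("endtag", tag))

    def handle_comment(self, data):
        self.events.append(("comment", data))

    def handle_data(self, data):
        self.events.append(("data", data))


logger = EventLogger()
logger.feed(
    '<SCRIPT language="JavaScript">\n'
    "<!-- //\n"
    "some-javascript-code\n"
    "// -->\n"
    "</SCRIPT>"
)
logger.close()

for event in logger.events:
    print(event)
```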
msg42478 - (view) Author: Fred L. Drake, Jr. (fdrake) (Python committer) Date: 2004-07-14 19:28
Removed older version of the patch.
msg70083 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-07-20 11:32
Adding test suite patch from #674449; closed that as a duplicate.
msg76723 - (view) Author: Chris Palmer (cpalmer) Date: 2008-12-02 02:05
Here is an additional test case. I have a super simple HTML "minifier"
that burps when given this test file:

========
$ cat test.html 
'foo <sc'+'ript>'
========

The explosion is:

========
$ ./minify.py test.html 
Warning: malformed start tag
'foo Traceback (most recent call last):
  File "./minify.py", line 84, in <module>
    m.feed(f.read())
  File "/usr/local/lib/python2.5/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/local/lib/python2.5/HTMLParser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "/usr/local/lib/python2.5/HTMLParser.py", line 226, in parse_starttag
    endpos = self.check_for_whole_start_tag(i)
  File "/usr/local/lib/python2.5/HTMLParser.py", line 302, in check_for_whole_start_tag
    raise AssertionError("we should not get here!")
AssertionError: we should not get here!
========
msg83254 - (view) Author: Gabriel Sean Farrell (gsf) Date: 2009-03-06 20:31
Now that BeautifulSoup uses HTMLParser, more people are seeing these
errors. See
http://groups.google.com/group/beautifulsoup/msg/d5a7540620538d14 and
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=516824
msg88864 - (view) Author: Paweł Widera (momat) Date: 2009-06-04 06:33
A simple workaround for BeautifulSoup is the following wrapper. It
sanitizes the JavaScript code before passing it to the parser by joining
the split strings, so that "</scr"+"ipt>" becomes "</script>".

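import copy
import re

from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3.x

def bs(input):
    # Remove each "+" that joins two adjacent double-quoted string literals.
    pattern = re.compile(r'"\+"')
    match = lambda x: ""
    massage = copy.copy(BeautifulSoup.MARKUP_MASSAGE)
    massage.extend([(pattern, match)])
    return BeautifulSoup(input, markupMassage=massage)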
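
The joining step on its own can be tried without BeautifulSoup. The sketch below is a variation rather than the wrapper above: the names `SPLIT_STRING` and `join_split_strings` are hypothetical, and the pattern is widened (an assumption) to accept single quotes and whitespace around the "+".

```python
import re

# Hypothetical pattern: a closing quote, "+", and a matching opening quote,
# with optional whitespace around the plus sign.
SPLIT_STRING = re.compile(r"""(["'])\s*\+\s*\1""")


def join_split_strings(source):
    """Join adjacent string literals that were split with '+'."""
    return SPLIT_STRING.sub("", source)


print(join_split_strings('document.write("</scr"+"ipt>");'))
```

Note that this rewrites every concatenation of two literals, not only split end tags, so it is only suitable as a pre-parse clean-up when the script text itself is not going to be executed.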
msg117762 - (view) Author: Yotam Medini (yotam) Date: 2010-09-30 21:50
HTMLParser.py fails when, inside 
  <script> ... </script>
it is fooled by JavaScript containing less-than ('<') conditional expressions.
In the attached example:

 $ tar tvzf lt-in-script-example.tgz | cut -c24-
     796 2010-09-30 16:52 h2t.py
   23678 2010-09-30 16:39 t.html

here's what happens:

 $ python h2t.py t.html /tmp/t.txt
 HTMLParser: /home/yotam/src/wog/HTMLParser.bug/HTMLParser.py
 Traceback (most recent call last):
   File "h2t.py", line 31, in <module>
     text = html2text(f_html.read())
   File "h2t.py", line 23, in html2text
     te = TextExtractor(html)
   File "h2t.py", line 15, in __init__
     self.feed(html)
   File "/home/yotam/src/wog/HTMLParser.bug/HTMLParser.py", line 108, in feed
     self.goahead(0)
   File "/home/yotam/src/wog/HTMLParser.bug/HTMLParser.py", line 148, in goahead
     k = self.parse_starttag(i)
   File "/home/yotam/src/wog/HTMLParser.bug/HTMLParser.py", line 229, in parse_starttag
     endpos = self.check_for_whole_start_tag(i)
   File "/home/yotam/src/wog/HTMLParser.bug/HTMLParser.py", line 304, in check_for_whole_start_tag
     self.error("malformed start tag")
   File "/home/yotam/src/wog/HTMLParser.bug/HTMLParser.py", line 115, in error
     raise HTMLParseError(message, self.getpos())
 HTMLParser.HTMLParseError: malformed start tag, at line 396, column 332


I have a suggested patch 
   HTMLParser.diff
fixing this problem, soon to be attached.

-- yotam
msg117763 - (view) Author: Yotam Medini (yotam) Date: 2010-09-30 21:52
The attached suggested patch fixes the problems shown in msg117762.
msg120265 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-11-02 22:10
Would it be reasonable to add knowledge to html.parser to make it recognize script elements as CDATA and handle it correctly (that is let “<” pass)?
msg125096 - (view) Author: Yotam Medini (yotam) Date: 2011-01-02 20:50
Suggested fix for the attached cases:
  lt-in-script-example.tgz
  endtag-space.html
  dollar-extra.html
msg125154 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-01-03 04:05
If you provide some tests augmenting the existing tests in
test_htmlparser.py and also ensure that no existing test breaks, it
would be easier to review the patch. I do see some changes made
to the regex and parsing, so tests would definitely help.
msg130319 - (view) Author: Alexander (friday) Date: 2011-03-08 11:00
This is a small patch for the related bug issue9577, which is actually not related to this bug.
msg130326 - (view) Author: Alexander (friday) Date: 2011-03-08 11:28
And this patch fixes both bugs in a more elegant way.
msg130702 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-03-12 22:46
Thanks for the patch, however it would be better if you could get a clone of the CPython repo and make a patch against it.
The patch should also include tests.

You can check http://docs.python.org/devguide/ for more information.
msg141204 - (view) Author: Matt Basta (Matt.Basta) Date: 2011-07-27 03:24
The number of problems produced by this bug can be greatly reduced by adding a relatively small check to the parser. Currently, <script> and <style> tags call set_cdata_mode(), which sets self.interesting to HTMLParser.interesting_cdata. This is bad because it searches for ANY closing tag, rather than a closing tag which matches the opening tag.

Alexander's fix solved about half the problem, but it didn't handle ending tags as text. I've fixed this and added some tests.

This is my first patch, so if there's a better way that I could be submitting this, input would be appreciated.
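
The difference between the two searches can be illustrated with a pair of patterns. Both are assumptions made for this sketch and are not copied from hp_fix.diff: `ANY_CLOSING_TAG` stands in for a pattern that stops at any "</", and `closing_tag_for` is a hypothetical helper that accepts only the end tag of the element that opened CDATA mode.

```python
import re

# Stand-in for a search that stops at the first "</", whatever follows it.
ANY_CLOSING_TAG = re.compile(r"</")


def closing_tag_for(elem):
    """Hypothetical helper: match only the end tag of the given element."""
    return re.compile(r"</\s*%s\s*>" % re.escape(elem), re.IGNORECASE)


content = "<p>foo</p> '</scr'+'ipt>' <span>bar</span></script>"

generic_end = ANY_CLOSING_TAG.search(content).start()
specific_end = closing_tag_for("script").search(content).start()

print(repr(content[:generic_end]))   # what the generic search treats as script text
print(repr(content[:specific_end]))  # what the tag-specific search treats as script text
```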
msg141210 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-07-27 06:52
I left a review of your patch on rietveld, including a description of what I think is going on there (the patch lacks some context and it's not easy to figure out how everything works there).
I also did some tests with and without the patch:

>>> from HTMLParser import HTMLParser as HP
>>> class MyHP(HP):
...   def handle_data(self, data): print 'data: %r' % data
... 
>>> myhp = MyHP()

# without the patch:
>>> myhp.feed('<script>foobar</script>')
data: 'foobar'  # this looks ok
>>> myhp.feed('<script><p>foo</p></script>')
data: '<p>foo'  # where's the </p>?
>>> myhp.feed('<script><p>foo</p><span>bar</span></script>')
data: '<p>foo' # some tags missing, 2 chunks received
data: 'bar'
>>> myhp.feed("<script><p>foo</p> '</scr'+'ipt>' <span>bar</span></script>")
data: '<p>foo'
data: " '"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/lib/python2.7/HTMLParser.py", line 150, in goahead
    k = self.parse_endtag(i)
  File "/usr/lib/python2.7/HTMLParser.py", line 317, in parse_endtag
    self.error("bad end tag: %r" % (rawdata[i:j],))
  File "/usr/lib/python2.7/HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
HTMLParser.HTMLParseError: bad end tag: "</scr'+'ipt>", at line 1, column 247


# with the patch:
>>> myhp.feed('<script>foobar</script>')
data: 'foobar'  # ok
>>> myhp.feed('<script><p>foo</p></script>')
data: '<p>foo' # all the content is there, but why 2 chunks?
data: '</p>'
>>> myhp.feed('<script><p>foo</p><span>bar</span></script>')
data: '<p>foo' # same as previous
data: '</p>'
data: '<span>bar'
data: '</span>'
>>> myhp.feed("<script><p>foo</p> '</scr'+'ipt>' <span>bar</span></script>")  
data: '<p>foo' # same
data: '</p>'
data: " '"
data: "</scr'+'ipt>"
data: "' <span>bar"
data: '</span>'

So my question is: is it normal that the data is passed to handle_data in chunks?
AFAIU HTML parsers should see CDATA as a single chunk of bytes they don't care about, so the fact that further parsing happens on the content of script/style seems wrong to me.
If I'm reading the code correctly that's because the "interesting" regex is set to look for a closing tag ('</') -- maybe assuming that the CDATA section doesn't contain any other tag (usually true in case of <style>, often false for <script>).
Changing the regex to explicitly look for the matching closing tag might be better (but it would still fail for e.g. <script> document.write('<script>alert("foo")</script>')</script> -- but some browsers will fail with this too).
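
Independently of how many chunks handle_data receives, a subclass can buffer everything between the start and end tag and join it afterwards. The sketch below targets Python 3's `html.parser`; the class name `ScriptCollector` and its attributes are hypothetical, and the number of handle_data calls may differ between Python versions.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the full text of each <script> element, however it is chunked."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._chunks = None  # None while outside a <script> element

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._chunks = []

    def handle_data(self, data):
        if self._chunks is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._chunks is not None:
            self.scripts.append("".join(self._chunks))
            self._chunks = None


collector = ScriptCollector()
collector.feed("<script><p>foo</p> '</scr'+'ipt>' <span>bar</span></script>")
collector.close()
print(collector.scripts)
```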
msg141232 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-07-27 15:12
Ezio wrote:
  >>> myhp.feed('<script><p>foo</p></script>')
  data: '<p>foo'  # where's the </p>?

http://www.w3.org/TR/html4/types#type-cdata says:
  Although the STYLE and SCRIPT elements use CDATA for their data
  model, for these elements, CDATA must be handled differently by user
  agents. Markup and entities must be treated as raw text and passed to
  the application as is. The first occurrence of the character sequence
  "</" (end-tag open delimiter) is treated as terminating the end of
  the element's content. In valid documents, this would be the end tag
  for the element.

So I think the example is invalid (should escape the <), and that HTMLParser is not buggy.
msg141237 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-07-27 15:44
It's not buggy, but it is also not helpful.  This kind of thing is what we introduced the 'strict' parameter for.  And indeed I believe we've fixed some of these cases thereby.  So any additional fixes should go into non-strict mode in Python3.
msg141242 - (view) Author: Matt Basta (Matt.Basta) Date: 2011-07-27 16:53
> So I think the example is invalid (should escape the <), and that HTMLParser is not buggy.

On the other hand, the HTML5 spec clearly dictates otherwise:

http://www.w3.org/TR/html5/syntax.html#cdata-rcdata-restrictions
The text in raw text and RCDATA elements must not contain any occurrences of the string "</" (U+003C LESS-THAN SIGN, U+002F SOLIDUS) followed by characters that case-insensitively match the tag name of the element followed by one of U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), U+0020 SPACE, U+003E GREATER-THAN SIGN (>), or U+002F SOLIDUS (/).
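
Read literally, the restriction quoted above can be written as a regular expression. This is only a sketch of that one sentence of the spec, not of a full HTML5 tokenizer, and the function name `html5_end_of_raw_text` is made up for illustration.

```python
import re


def html5_end_of_raw_text(tag):
    """Match "</" + tag name (any case) + tab, LF, FF, CR, space, ">" or "/"."""
    return re.compile(r"</%s[\t\n\f\r >/]" % re.escape(tag), re.IGNORECASE)


pattern = html5_end_of_raw_text("script")
for candidate in ["</script>", "</SCRIPT >", "</script/", "</scr", "</scripts>", "</span>"]:
    print(candidate, bool(pattern.search(candidate)))
```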


Additionally, no browsers (perhaps unless they are in quirks mode) currently obey the HTML4 variant of the rule. This is due in large part to the need to include strings such as "</scr" + "ipt>" within a script tag itself. This behavior can be observed firsthand by loading this snippet in a browser:

<script><span></span>This should not be visible.</script>
msg141248 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-07-27 17:28
Yes, but we don't claim to support HTML5 yet.

The best way to support HTML5 is probably a topic for python-dev.
msg141260 - (view) Author: Matt Basta (Matt.Basta) Date: 2011-07-27 18:37
> Yes, but we don't claim to support HTML5 yet.

There's also no claim in the docs or the source that HTMLParser specifically adheres to HTML4, either.

Ideally, the parser should strive for parity with the functionality of major web browsers, as they are the de-facto standard for HTML parser behavior. All of the browsers on my machine, for instance, will even parse the following snippet with the behavior described in the HTML5 spec:


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<script><span></span>This should not be visible.</script>


Even in pre-HTML5 browsers, this is the way that HTML gets parsed. For the heck of it, I downloaded an old copy of Firefox 2.0 and ran the above snippet. The behavior is consistent.

While I would otherwise agree that keeping to the HTML4 spec is the right thing to do, this is a quirk of the spec that is not only ignored by browsers (as can be seen in FX2) and changed in a future version of the spec, but is causing problems for a good number of developers.

It could be argued that the patch is a far more elegant solution for Beautiful Soup developers than the workaround in msg88864.
msg141266 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-07-27 19:13
I thought HTML4 conformance was documented somewhere, but I could be wrong.

HTML5, from what I understand (I haven't read the spec), is explicitly or implicitly following "what browsers really do" exactly because nobody conformed to HTML4, so arguing that "a later spec changed the rules" isn't really relevant in this case :)

We made the change the way we did (strict option) out of backward compatibility concerns, so I still think this topic needs to be discussed on python-dev.  I think the argument that python should handle what most browsers handle is a strong one (I myself would have been in favor of just making this stuff work, absent backward compatibility concerns).  The question in my mind is what's the best way to get there from here?
msg141273 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-07-27 20:03
IIRC we have been following what browsers do in other cases already.
There were also some discussions about supporting HTML5 (see e.g. #7311 and #11113) and the strict vs non-strict mode introduced in Python3.

Note that changing the way things are parsed is generally not backward-compatible, but you might argue that the new behavior is useful enough to break some hackish code that was trying to work around the limitations of HTMLParser.
msg141291 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-07-28 13:43
HTML5 being a spec that builds on HTML 4.01 and real-world ways to deal with non-compliant input, I don’t object to fixes that follow the HTML5 spec.  Regarding backward compatibility, we can break it if we decide that the behavior we’re changing was a bug.  I think it’s far more useful to support BeautifulSoup than to retain a non-useful behavior forever.
msg141297 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-07-28 14:16
Unless someone else has picked it up, BeautifulSoup is no longer an issue since its author has abandoned it.  That doesn't change the fact that IMO it would be nice for our library to handle input generously.
msg141345 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-07-29 12:35
I also think this is a bug that should be fixed. Not being able to parse real-world HTML is a nuisance.

I agree with Ezio's review comments about the custom regex.
msg141372 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-07-29 13:33
It sounds like the early consensus on python-dev is that html5 support is a good thing.  I'm happy with that.  I presume that means the 'strict' keyword in 3.x becomes strict-per-html5, and possibly useless :)
msg141428 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-07-30 07:09
As I said somewhere else, the only use case I can think of where the 'strict' flag is useful is validation, but AFAIK even in "strict mode" it's possible to parse non-valid documents, so I agree it's pretty useless.

Moving to HTML5 and offering something able to parse real-world HTML seems the way to go IMHO.
msg141531 - (view) Author: Alexander (friday) Date: 2011-08-01 20:07
> It sounds like the early consensus on python-dev is that html5 support is a good thing. 

Yeah... But wait another 8 years until these guys decide that there are enough tests and other cool stuff.
msg141532 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-01 20:20
> Yeah... But wait another 8 years untill these guys decides that
> there is enough  tests and other cool stuff.

Which guys are you talking about?
Granted, this issue has been around for a loooong time... but now that we have a patch that seems ok (and has tests), we should be able to finally push this and include it in the next feature release, hopefully.
msg141533 - (view) Author: Matt Basta (Matt.Basta) Date: 2011-08-01 20:50
Seeing as everyone seems pretty satisfied with the 2.7 version, I'd be happy to put together a patch for 3 as well.

To confirm, though, this fix is NOT going behind the strict parameter, correct?
msg146632 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-10-30 11:40
Attached a new patch with a few more tests and minor refactoring.
msg146709 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-10-31 16:35
-    def set_cdata_mode(self):
+    def set_cdata_mode(self, elem):
Looks like an incompatible behavior change.  Is it only an internal method that will never affect users’ code (even subclasses)?
msg146717 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-10-31 17:13
I think it's internal.  While that's not explicitly mentioned in the source, the method is not documented and I don't think people have overridden it in subclasses.  All it does is change the regex used to parse the data, and if someone needs to change this, it's probably easier to just change the regex.
OTOH this method is called just in one place, so in theory it's possible to set self.cdata_elem there before the set_cdata_mode() call, but that might make the code more fragile.
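
The signature change under discussion can be pictured with a toy state holder. This is a sketch and not the code from issue670664.diff: the class name `CdataState`, the constant `INTERESTING_NORMAL`, and the regular expressions are assumptions chosen to mirror the description in this thread.

```python
import re

# Assumed "normal mode" pattern: stop at the next "&" or "<".
INTERESTING_NORMAL = re.compile(r"[&<]")


class CdataState:
    """Toy model of the parser state discussed above; not the real HTMLParser."""

    def __init__(self):
        self.interesting = INTERESTING_NORMAL
        self.cdata_elem = None

    def set_cdata_mode(self, elem):
        # Remember which element opened CDATA mode and look only for its end tag.
        self.cdata_elem = elem.lower()
        self.interesting = re.compile(
            r"</\s*%s\s*>" % re.escape(self.cdata_elem), re.IGNORECASE
        )

    def clear_cdata_mode(self):
        # Back to normal parsing once the matching end tag has been seen.
        self.interesting = INTERESTING_NORMAL
        self.cdata_elem = None


state = CdataState()
state.set_cdata_mode("STYLE")
print(state.cdata_elem, state.interesting.search("p{}</p></style>").group())
state.clear_cdata_mode()
print(state.cdata_elem)
```

Passing the element name into set_cdata_mode keeps the element and its regex in sync in one place; setting `cdata_elem` at the call site instead would work too, but leaves two steps that must always be done together.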
msg146770 - (view) Author: Roundup Robot (python-dev) Date: 2011-11-01 12:14
New changeset 0a5eb57d5876 by Ezio Melotti in branch '2.7':
#670664: Fix HTMLParser to correctly handle the content of ``<script>...</script>`` and ``<style>...</style>``.
http://hg.python.org/cpython/rev/0a5eb57d5876

New changeset a6f2244b251f by Ezio Melotti in branch '3.2':
#670664: Fix HTMLParser to correctly handle the content of ``<script>...</script>`` and ``<style>...</style>``.
http://hg.python.org/cpython/rev/a6f2244b251f

New changeset b40752e227fa by Ezio Melotti in branch 'default':
#670664: merge with 3.2.
http://hg.python.org/cpython/rev/b40752e227fa
msg146771 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-11-01 12:18
Fixed, thanks to everyone who contributed to this over the years!
History
Date User Action Args
2012-01-04 15:02:17 ezio.melotti link issue13711 superseder
2011-11-01 12:18:10 ezio.melotti set status: open -> closed
    resolution: fixed
    messages: + msg146771
    stage: commit review -> committed/rejected
2011-11-01 12:14:35 python-dev set nosy: + python-dev
    messages: + msg146770
2011-10-31 17:13:21 ezio.melotti set messages: + msg146717
2011-10-31 16:35:06 eric.araujo set messages: + msg146709
2011-10-30 11:40:27 ezio.melotti set files: + issue670664.diff
    versions: + Python 2.7
    messages: + msg146632
    keywords: + needs review
    stage: patch review -> commit review
2011-10-29 08:10:51 ezio.melotti set assignee: ezio.melotti
2011-10-03 13:01:45 fdrake set nosy: - fdrake
2011-08-08 15:49:37 cpalmer set nosy: - cpalmer
2011-08-01 20:50:30 Matt.Basta set messages: + msg141533
2011-08-01 20:20:04 pitrou set messages: + msg141532
2011-08-01 20:07:50 friday set messages: + msg141531
2011-07-30 07:09:15 ezio.melotti set messages: + msg141428
2011-07-29 13:33:20 r.david.murray set messages: + msg141372
2011-07-29 12:35:06 pitrou set nosy: + pitrou
    messages: + msg141345
    assignee: fdrake -> (no value)
    stage: patch review
2011-07-28 14:16:12 r.david.murray set messages: + msg141297
2011-07-28 13:43:24 eric.araujo set messages: + msg141291
2011-07-27 20:03:23 ezio.melotti set messages: + msg141273
2011-07-27 19:13:09 r.david.murray set messages: + msg141266
2011-07-27 18:37:45 Matt.Basta set messages: + msg141260
2011-07-27 17:28:49 r.david.murray set messages: + msg141248
2011-07-27 16:53:52 Matt.Basta set messages: + msg141242
2011-07-27 15:44:49 r.david.murray set messages: + msg141237
    versions: - Python 2.7
2011-07-27 15:12:07 eric.araujo set messages: + msg141232
    versions: + Python 3.3, - Python 3.1
2011-07-27 06:52:14 ezio.melotti set messages: + msg141210
2011-07-27 03:24:51 Matt.Basta set files: + hp_fix.diff
    nosy: + Matt.Basta
    messages: + msg141204
2011-03-12 22:46:59 ezio.melotti set nosy: fdrake, georg.brandl, yotam, orsenthil, fantoozler, gsf, cpalmer, ezio.melotti, eric.araujo, r.david.murray, momat, Hunanyan, friday
    messages: + msg130702
2011-03-08 11:28:24 friday set files: + cdata_patch.diff
    nosy: fdrake, georg.brandl, yotam, orsenthil, fantoozler, gsf, cpalmer, ezio.melotti, eric.araujo, r.david.murray, momat, Hunanyan, friday
    messages: + msg130326
2011-03-08 11:00:15 friday set files: + cdata_patch.diff
    nosy: + friday
    messages: + msg130319
2011-02-14 13:54:01 r.david.murray set nosy: + r.david.murray
2011-01-03 04:05:17 orsenthil set nosy: + orsenthil
    messages: + msg125154
2011-01-02 20:50:12 yotam set files: + ltscr-endtag-dollarext.diff
    nosy: fdrake, georg.brandl, yotam, fantoozler, gsf, cpalmer, ezio.melotti, eric.araujo, momat, Hunanyan
    messages: + msg125096
2011-01-02 20:48:36 yotam set files: + dollar-extra.html
    nosy: fdrake, georg.brandl, yotam, fantoozler, gsf, cpalmer, ezio.melotti, eric.araujo, momat, Hunanyan
2011-01-02 20:46:24 yotam set files: + endtag-space.html
    nosy: fdrake, georg.brandl, yotam, fantoozler, gsf, cpalmer, ezio.melotti, eric.araujo, momat, Hunanyan
2010-11-02 22:10:15 eric.araujo set nosy: + eric.araujo
    messages: + msg120265
2010-09-30 21:52:24 yotam set files: + HTMLParser.diff
    messages: + msg117763
2010-09-30 21:50:04 yotam set files: + lt-in-script-example.tgz
    nosy: + yotam
    messages: + msg117762
2010-08-17 22:58:36 BreamoreBoy set versions: + Python 3.1
2010-08-13 11:51:15 r.david.murray set nosy: + Hunanyan
2010-08-12 22:32:32 r.david.murray link issue9577 superseder
2010-03-01 12:32:13 r.david.murray link issue1752919 superseder
2009-06-06 21:35:52 ezio.melotti set nosy: + ezio.melotti
    versions: + Python 2.7, Python 3.2, - Python 2.5
2009-06-04 06:34:08 momat set messages: + msg88864
2009-06-04 06:01:31 momat set nosy: + momat
2009-03-06 20:31:14 gsf set nosy: + gsf
    messages: + msg83254
2008-12-02 02:07:42 cpalmer set type: behavior
2008-12-02 02:06:02 cpalmer set files: + minify.py
    nosy: + cpalmer
    messages: + msg76723
    versions: + Python 2.5, - Python 2.3
2008-07-20 11:32:05 georg.brandl set files: + patch-test-cdata.txt
    nosy: + georg.brandl
    messages: + msg70083
2003-01-19 14:07:07 fantoozler create