xml.sax.parser() doesn't terminate when given a filename
Components: XML Versions: Python 3.0
Created on 2008-03-28 10:15 by mark

msg64625 - (view) Author: Mark Summerfield (mark) * Date: 2008-03-28 10:15
The tiny program at the end of this message runs under Python 2.5 and
3.0a3. Under 2 it gives the following output:

: python test.xml
('+', u'document')
('+', u'outer')
('+', u'inner')
('-', u'inner')
('-', u'outer')
('-', u'document')

Under 3 it does not terminate:
: python3 test.xml
+ document
+ outer
+ inner
- inner
- outer
- document
Traceback (most recent call last):
  File "", line 19, in <module>
  File "/home/mark/opt/python30a3/lib/python3.0/xml/sax/",
line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/home/mark/opt/python30a3/lib/python3.0/xml/sax/",
line 124, in parse
    buffer =
  File "/home/mark/opt/python30a3/lib/python3.0/", line 774, in read
    current =

The xml.sax.parser() function seems to work fine if you give it an open
file object and close the file after the call. The documentation says
you can also give it a filename, but if you do that the parser does not
terminate in Python 3, although it works fine in Python 2.

import sys
import xml.sax
BUG = True
class SaxHandler(xml.sax.handler.ContentHandler):
    def startElement(self, name, attributes):
        print("+", name)
    def endElement(self, name):
        print("-", name)
handler = SaxHandler()
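parser = xml.sax.make_parser()
parser.setContentHandler(handler)
if BUG:
    # Passing a filename: does not terminate under Python 3.
    parser.parse(sys.argv[1])
else:
    # Passing an open file object and closing it afterwards works.
    fh = open(sys.argv[1], encoding="utf8")
    parser.parse(fh)
    fh.close()
# end of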

Here is the test file:

<?xml version="1.0" encoding="UTF-8"?>
msg64626 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-03-28 10:31
I had to disable three unit tests in test_sax. We didn't notice the
problem before because the tests weren't actually run. The three tests
are marked clearly with XXX and FIXME.
msg72102 - (view) Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2008-08-28 18:10
ISTM that this release blocker can be solved by changing line 122 from:
        while buffer != "":
to:
        while buffer != b"":
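For anyone wondering why the old condition never becomes false: in Python 3 a bytes object never compares equal to a str, even when both are empty. The snippet below is only a sketch of that comparison. The two helper names are made up for illustration and are not part of xml.sax.

```python
def old_condition(buffer):
    # Hypothetical helper mirroring the old loop test.
    # For bytes input this is always True in Python 3, so the loop never ends.
    return buffer != ""


def proposed_condition(buffer):
    # Hypothetical helper mirroring the proposed loop test.
    # Becomes False once read() returns empty bytes.
    return buffer != b""


print(old_condition(b""))       # True: empty bytes is not equal to empty str
print(proposed_condition(b""))  # False: the loop would stop here
```

Note that the proposed comparison has the mirror-image problem if the stream ever yields str instead of bytes.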
msg72103 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-08-28 18:34
I've a better idea:

    while buffer: 

It's faster and works for both empty bytes and str.

The patch fixes the issue and re-enables three unit tests.
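A minimal sketch of the idea, not the actual code from the patch: a chunked read loop that relies on truthiness terminates for both byte and character streams. The function name count_chunks and the tiny bufsize are assumptions chosen for illustration.

```python
import io


def count_chunks(stream, bufsize=4):
    """Read stream in bufsize pieces until read() returns an empty value."""
    chunks = 0
    buffer = stream.read(bufsize)
    while buffer:  # false for both b"" and ""
        chunks += 1
        buffer = stream.read(bufsize)
    return chunks


# The same loop handles a byte stream and a character stream.
print(count_chunks(io.BytesIO(b"<a><b/></a>")))
print(count_chunks(io.StringIO("<a><b/></a>")))
```

The truthiness test avoids committing to either bytes or str, which is why it covers both kinds of source.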
msg72253 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-09-01 14:34
The patch looks great. (I love enabling disabled tests!)
msg72288 - (view) Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2008-09-01 20:00
Looks like this is a duplicate of issue3590, so this patch fixes two
release blockers ;)
msg72461 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2008-09-04 02:20
Benjamin will commit this.
msg72463 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-09-04 02:23
Applied in r66203.
