Title: xmllib unable to parse in UTF8 format
msg266322 - (view) Author: Enrico (enrico.scame) Date: 2016-05-25 09:09
The xmllib.XMLParser seems to be unable to parse
an XML file that contains Cyrillic characters.

   File "xmllib.pyc", line 172, in feed
   File "xmllib.pyc", line 268, in goahead
   File "xmllib.pyc", line 798, in syntax_error
 Error: Syntax error at line 8: illegal character in content
msg266339 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-25 12:36
Could you please provide a minimal reproducer? A minimal script and minimal data that expose the issue.
msg266344 - (view) Author: Enrico (enrico.scame) Date: 2016-05-25 13:14
I have attached the file. It is in the python23\lib folder.

The strings in the XML file are in Cyrillic.

My code:
import xmllib

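class Parser(xmllib.XMLParser):
    # a simple styling engine

    def __init__(self):
        # base-class setup (assumed; not visible in this excerpt)
        xmllib.XMLParser.__init__(self)
        self.cursupervisore = None
        self.curdata = ''

        # starttag_superv and endtag_superv are not shown in this excerpt
        self.elements = {'Superv': (self.starttag_superv, self.endtag_superv)}

    def load(self, file):
        # feed the file to the parser one line at a time
        while 1:
            s = file.readline()
            if not s:
                break
            self.feed(s)

def read_plant_tree(filexml):
    c = Parser()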
msg266479 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-27 06:02
See also issue222587. It seems this was the reason why the xmllib module was deprecated.

Use the xml package for parsing XML (xml.etree.ElementTree, xml.dom.minidom, xml.sax, etc.).
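
For illustration, here is a rough sketch of how the callback-style Parser above might be ported to xml.sax. It is only a sketch under assumptions: the sample document, the Plant root element, and the SupervHandler class are made up for this example (only the Superv element name comes from the script above), and it targets Python 3.

```python
import xml.sax


class SupervHandler(xml.sax.ContentHandler):
    """Collects the text of each <Superv> element (hypothetical handler)."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.in_superv = False
        self.curdata = ''
        self.supervisors = []

    def startElement(self, name, attrs):
        # Plays the role of starttag_superv in the xmllib version.
        if name == 'Superv':
            self.in_superv = True
            self.curdata = ''

    def characters(self, content):
        # May be called several times per text node, so accumulate.
        if self.in_superv:
            self.curdata += content

    def endElement(self, name):
        # Plays the role of endtag_superv in the xmllib version.
        if name == 'Superv':
            self.supervisors.append(self.curdata)
            self.in_superv = False


# Made-up sample data: UTF-8 encoded XML with Cyrillic text.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<Plant><Superv>Привет</Superv><Superv>Мир</Superv></Plant>'
).encode('utf-8')

handler = SupervHandler()
xml.sax.parseString(SAMPLE, handler)
```

After parsing, handler.supervisors holds the decoded text of both Superv elements as ordinary strings. For a real file, xml.sax.parse(filename, handler) can be used in place of parseString.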
