Message 2422 - Python tracker

Author	sjoerd
Date	2000-11-24.09:47:45

Content
The problem here is the character reference x. xmllib predates Python's Unicode support, so it cannot handle any character that is not representable in 8 bits. In practice it only supports iso-8859-1 (latin1), and not even the utf-8 encoding of latin1. It also doesn't do the right thing for character references outside the ASCII range, although it will accept character references in the range 0-255 (decimal).
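
To illustrate the 8-bit limit described above, here is a minimal sketch in modern Python 3. It does not use xmllib; the helper names `charref_to_codepoint` and `fits_in_8_bits` are hypothetical and exist only for this example. It shows that a reference in the range 0-255 maps to a single latin1 byte, that the same character takes two bytes in utf-8, and that a reference above 255 has no latin1 representation at all.

```python
import re

# Hypothetical helper pattern (not part of xmllib): matches a numeric
# character reference in decimal ("&#233;") or hexadecimal ("&#xE9;") form.
_CHARREF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def charref_to_codepoint(ref):
    """Return the integer code point named by a numeric character reference."""
    m = _CHARREF.fullmatch(ref)
    if m is None:
        raise ValueError("not a numeric character reference: %r" % ref)
    hex_digits, dec_digits = m.groups()
    return int(hex_digits, 16) if hex_digits else int(dec_digits)


def fits_in_8_bits(ref):
    """True if the referenced character can be stored in a single byte."""
    return charref_to_codepoint(ref) <= 0xFF


# A character inside the 0-255 range: one byte in latin1, two in utf-8.
e_acute = chr(charref_to_codepoint("&#233;"))
latin1_bytes = e_acute.encode("latin-1")
utf8_bytes = e_acute.encode("utf-8")

# A character outside the range cannot be encoded as latin1 at all.
bullet = chr(charref_to_codepoint("&#x2022;"))
try:
    bullet.encode("latin-1")
    bullet_is_latin1 = True
except UnicodeEncodeError:
    bullet_is_latin1 = False
```

A parser that stores text as 8-bit latin1 strings can therefore pass `&#233;` through but has nowhere to put `&#x2022;`, and it would misread the two-byte utf-8 form of the first character as two separate latin1 characters.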

It is too much work to fix this.

I will make available a rewrite of xmllib that has full Unicode support and, what's more, is a validating XML parser. The main problem with this rewrite is that it is pretty slow: it uses many large regular expressions, and compiling them is time-consuming. Mail me if you want a copy.
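
As a rough illustration of the compile cost mentioned above, the sketch below builds one moderately large expression and measures how long compiling it takes. The pattern is an assumption made up for this example (a simplified start-tag matcher); the rewrite's actual expressions are not shown in this message. The usual way to limit the cost is to compile each expression once at import time and reuse the compiled object.

```python
import re
import time

# Illustrative pieces only; a simplified, ASCII-only notion of an XML name.
NAME = r"[A-Za-z_:][-A-Za-z0-9_:.]*"
ATTRIBUTE = r"\s+" + NAME + r"\s*=\s*(?:\"[^\"<]*\"|'[^'<]*')"
START_TAG_SOURCE = r"<(" + NAME + r")((?:" + ATTRIBUTE + r")*)\s*(/?)>"

# Compiled once, at import time, and reused for every match afterwards.
START_TAG = re.compile(START_TAG_SOURCE)


def time_compile(source):
    """Return the seconds needed to compile `source` with an empty re cache."""
    re.purge()  # drop cached patterns so the compile really happens
    start = time.perf_counter()
    re.compile(source)
    return time.perf_counter() - start


elapsed = time_compile(START_TAG_SOURCE)
match = START_TAG.match("<a href=\"x\" id='y'/>")
```

With many such expressions, each larger than this one, the one-time compile step adds up, which is consistent with the slowdown described above.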

History
Date	User	Action	Args
2007-08-23 13:52:12	admin	link	issue222587 messages
2007-08-23 13:52:12	admin	create