Message 62909 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	ygale
Recipients	ygale
Date	2008-02-24.14:53:28
SpamBayes Score	0.022665307
Marked as misclassified	No
Message-id	<1203864810.15.0.0554571432412.issue2174@psf.upfronthosting.co.za>
In-reply-to

Content
So I think there are two possibilities: 1. Use a special value for getSourceEnconding(), like "unicode", to indicate that this is a unicode character stream and not a byte stream. 2. Provide yet another method in the XMLReader interface: sourceIsCharacterStream(), returning a bool. There is a more drastic option: 3. Since expat doesn't support this stuff anyway, and perhaps not too many people have written parsers that do support it, dumb down the InputSource interface. Specifically, deprecate setCharacterStream(), getCharacterStream(), setEncoding() and getEncoding(), none of which are used by expat. Parsers should read the XML from the byte stream and use that to determine the encoding. That may upset some implementors of XML libraries though. They would each have to go to some trouble to provide their own proprietary and possibly incompatible mechanisms for this, if they need it. Perhaps a compromise fourth path would be to have subclasses of InputSource for the two cases of character stream and byte stream.

So I think there are two possibilities:

1. Use a special value for getSourceEnconding(),
like "unicode", to indicate that this is a
unicode character stream and not a byte stream.

2. Provide yet another method in the XMLReader
interface: sourceIsCharacterStream(), returning
a bool.

There is a more drastic option:

3. Since expat doesn't support this stuff
anyway, and perhaps not too many people
have written parsers that do support it,
dumb down the InputSource interface.

Specifically, deprecate setCharacterStream(),
getCharacterStream(), setEncoding() and
getEncoding(), none of which are used by
expat. Parsers should read the XML from
the byte stream and use that to determine
the encoding.

That may upset some implementors of XML
libraries though. They would each have to go
to some trouble to provide their own
proprietary and possibly incompatible
mechanisms for this, if they need it.

Perhaps a compromise fourth path would
be to have subclasses of InputSource for
the two cases of character stream and
byte stream.

History
Date	User	Action	Args
2008-02-24 14:53:30	ygale	set	spambayes_score: 0.0226653 -> 0.022665307 recipients: + ygale
2008-02-24 14:53:30	ygale	set	spambayes_score: 0.0226653 -> 0.0226653 messageid: <1203864810.15.0.0554571432412.issue2174@psf.upfronthosting.co.za>
2008-02-24 14:53:29	ygale	link	issue2174 messages
2008-02-24 14:53:28	ygale	create