
Author ygale
Recipients ygale
Date 2008-02-24.14:53:28
Message-id <1203864810.15.0.0554571432412.issue2174@psf.upfronthosting.co.za>
In-reply-to
Content
So I think there are two possibilities:

1. Use a special value for getEncoding(),
like "unicode", to indicate that this is a
unicode character stream and not a byte stream.

2. Provide yet another method in the XMLReader
interface: sourceIsCharacterStream(), returning
a bool.
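
To make possibilities 1 and 2 concrete, here is a minimal sketch.
It is not an existing xml.sax API: the "unicode" sentinel, the
FlaggedInputSource class and sourceIsCharacterStream() are all
hypothetical names, and putting both conventions on an InputSource
subclass is an assumption made only to keep the example short.

```python
import io
from xml.sax.xmlreader import InputSource

# Hypothetical sentinel for possibility 1; not defined anywhere in xml.sax.
UNICODE_STREAM = "unicode"


class FlaggedInputSource(InputSource):
    """Illustrative only: neither convention exists in the standard library."""

    def getEncoding(self):
        # Possibility 1: report the sentinel whenever a character stream is set.
        if self.getCharacterStream() is not None:
            return UNICODE_STREAM
        return super().getEncoding()

    def sourceIsCharacterStream(self):
        # Possibility 2: an explicit boolean query.
        return self.getCharacterStream() is not None


text_source = FlaggedInputSource()
text_source.setCharacterStream(io.StringIO("<doc/>"))

byte_source = FlaggedInputSource()
byte_source.setByteStream(io.BytesIO(b"<doc/>"))
byte_source.setEncoding("utf-8")
```

With possibility 1 a parser has to compare a string against a magic
value; with possibility 2 it asks a yes/no question and leaves
getEncoding() meaning what it always meant.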

There is a more drastic option:

3. Since expat doesn't support this stuff
anyway, and perhaps not too many people
have written parsers that do support it,
dumb down the InputSource interface.

Specifically, deprecate setCharacterStream(),
getCharacterStream(), setEncoding() and
getEncoding(), none of which are used by
expat. Parsers should read the XML from
the byte stream and use that to determine
the encoding.
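
As a rough illustration of reading the encoding from the byte
stream itself, the sketch below hands the default SAX parser an
InputSource that has only a byte stream and no setEncoding() call.
TextCollector and the sample document are made up for the example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


# 0xE9 is "é" in ISO-8859-1. The encoding is named only in the XML declaration.
raw = b'<?xml version="1.0" encoding="iso-8859-1"?><doc>caf\xe9</doc>'

source = InputSource()
source.setByteStream(io.BytesIO(raw))  # note: no setEncoding() call

handler = TextCollector()
xml.sax.parse(source, handler)
text = "".join(handler.chunks)
```

Here the parser decodes the text using the declaration in the
document, without any hint from the InputSource.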

That may upset some implementors of XML
libraries though. They would each have to go
to some trouble to provide their own
proprietary and possibly incompatible
mechanisms for this, if they need it.

Perhaps a compromise fourth path would
be to have subclasses of InputSource for
the two cases of character stream and
byte stream.
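
A minimal sketch of that fourth path, assuming two subclasses with
made-up names (ByteStreamSource, CharacterStreamSource) and a
made-up helper describe(). Nothing like this exists in xml.sax; it
only shows how a parser could dispatch on the type.

```python
import io
from xml.sax.xmlreader import InputSource


class ByteStreamSource(InputSource):
    """Hypothetical: carries raw bytes plus an optional encoding hint."""

    def __init__(self, stream, encoding=None, system_id=None):
        super().__init__(system_id)
        self.setByteStream(stream)
        if encoding is not None:
            self.setEncoding(encoding)


class CharacterStreamSource(InputSource):
    """Hypothetical: carries already-decoded text, so no encoding applies."""

    def __init__(self, stream, system_id=None):
        super().__init__(system_id)
        self.setCharacterStream(stream)


def describe(source):
    # A parser could branch on the type instead of probing the getters.
    if isinstance(source, CharacterStreamSource):
        return "characters"
    if isinstance(source, ByteStreamSource):
        return "bytes"
    return "unknown"


char_kind = describe(CharacterStreamSource(io.StringIO("<doc/>")))
byte_kind = describe(ByteStreamSource(io.BytesIO(b"<doc/>"), encoding="utf-8"))
plain_kind = describe(InputSource())
```

Existing code that passes a plain InputSource would fall into the
"unknown" branch, so a parser would still need a fallback for it.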