This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Author Richard.Urwin
Recipients BreamoreBoy, Richard.Urwin, amaury.forgeotdarc, bugok, effbot, flox, nnorwitz, rurwin
Date 2010-07-26.13:24:26
I can't produce an automated test, for want of time, but here is a demonstrator.

Grab the example XHTML from or use any small ASCII-encoded XML file. Save it as "file.xml" in the same folder as attached here.

Execute bug-test.xml

file.xml is read and then written in UTF-16. The output file is then read and dumped to stdout as a byte-stream.
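
The attached script is not shown here, so the following is only a minimal sketch of the read-then-write step described above, assuming `xml.etree.ElementTree` is the library under test. The helper name `roundtrip_utf16` and the tiny stand-in document are hypothetical, and a current Python may well produce correct output where the version this report targets did not.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def roundtrip_utf16(src_path, dst_path):
    """Parse src_path, rewrite it to dst_path as UTF-16, return dst's raw bytes.

    Hypothetical helper: a stand-in for the attached demonstrator.
    """
    tree = ET.parse(src_path)
    tree.write(dst_path, encoding="utf-16")
    with open(dst_path, "rb") as f:
        return f.read()


# Self-contained demo: a tiny ASCII-encoded stand-in for file.xml.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "file.xml")
    dst = os.path.join(tmp, "output.xml")
    with open(src, "w", encoding="ascii") as f:
        f.write("<root><child>text</child></root>")
    raw = roundtrip_utf16(src, dst)

# Dump the output to stdout as a stream of byte values.
print(" ".join(str(b) for b in raw))
```

The printed numbers can then be inspected by eye against the expectations listed in this message.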

1. To be correct UTF-16, the output should start with the byte-order mark 255 254 (0xFF 0xFE, the little-endian BOM), and that byte pair should never occur in the rest of the file.

2. The rest of the output (including the first line) should alternate zeros with ASCII character codes.

3. The file output.xml should be loadable in a UTF-16-capable text editor (e.g. jEdit), be recognised as UTF-16, and be identical in content to file.xml.
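
The three expectations above could be checked mechanically rather than by eye. The sketch below is one possible reading of them, not the missing automated test: the function name `check_utf16_le` and the sample strings are hypothetical, little-endian output is assumed (hence 255 254), and "identical in terms of content" is approximated by comparing the decoded text with a string supplied by the caller.

```python
import codecs
from typing import List


def check_utf16_le(raw: bytes, expected_text: str) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    bom = codecs.BOM_UTF16_LE  # the two bytes 255 254

    # 1. The BOM must open the file and must not recur in any later code unit.
    if not raw.startswith(bom):
        problems.append("missing BOM at start")
    body = raw[2:]
    units = [body[i:i + 2] for i in range(0, len(body), 2)]
    if bom in units:
        problems.append("BOM repeated after the start")

    # 2. For ASCII-only content, each code unit is an ASCII byte then a zero.
    if any(len(u) != 2 or u[0] > 127 or u[1] != 0 for u in units):
        problems.append("body is not ASCII bytes alternating with zeros")

    # 3. The bytes must decode as UTF-16 and match the expected content.
    try:
        text = raw.decode("utf-16")
    except UnicodeDecodeError:
        problems.append("not decodable as UTF-16")
    else:
        if text != expected_text:
            problems.append("decoded content differs from expected")
    return problems


sample = "<root>text</root>\n"
good = codecs.BOM_UTF16_LE + sample.encode("utf-16-le")
# A deliberately broken stream with a second BOM in the middle.
bad = (codecs.BOM_UTF16_LE + "<root>".encode("utf-16-le")
       + codecs.BOM_UTF16_LE + "text</root>\n".encode("utf-16-le"))

print(check_utf16_le(good, sample))
print(check_utf16_le(bad, sample))
```

Check 2 only makes sense for ASCII-only input, which is what the instructions above call for; a document with non-ASCII characters would need that check relaxed.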

Date                 User           Action  Args
2010-07-26 13:24:30  Richard.Urwin  set     recipients: + Richard.Urwin, effbot, nnorwitz, amaury.forgeotdarc, bugok, rurwin, flox, BreamoreBoy
2010-07-26 13:24:29  Richard.Urwin  set     messageid: <>
2010-07-26 13:24:27  Richard.Urwin  link    issue1767933 messages
2010-07-26 13:24:26  Richard.Urwin  create