classification
Title: xml.sax.saxutils.XMLGenerator should write to io.RawIOBase.
Type: Stage:
Components: Library (Lib), XML Versions: Python 3.0
process
Status: closed Resolution: duplicate
Dependencies: Superseder: xml.sax.saxutils.XMLGenerator cannot output UTF-16
View: 1470548
Assigned To: Nosy List: craigh, kawai, pitrou, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2009-01-19 10:35 by kawai, last changed 2012-10-24 09:45 by serhiy.storchaka. This issue is now closed.

Files
File name          Uploaded
xmlgen2.patch      craigh, 2009-07-22 22:53
xmlgen-doc.patch   craigh, 2009-07-22 22:54
Messages (8)
msg80155 - (view) Author: HiroakiKawai (kawai) Date: 2009-01-19 10:35
xml.sax.saxutils.XMLGenerator._write tests its argument with
isinstance(text, str), but this is problematic in Python 3.0.
XMLGenerator accepts an encoding argument and the produced file is
encoded with that encoding, i.e., the XML is a byte sequence. So IMHO,
the object passed to the XMLGenerator constructor should be a subclass
of io.RawIOBase.
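For illustration, a minimal sketch of the behaviour being asked for: the generator writes to a binary stream and the bytes match the declared encoding. This assumes a Python version in which XMLGenerator accepts binary streams (not the case in 3.0, as reported above); the element name and text are made up for the example.

```python
import io
from xml.sax.saxutils import XMLGenerator

# Binary sink; the generator is expected to do the encoding itself.
buf = io.BytesIO()
gen = XMLGenerator(buf, encoding="iso-8859-1")

gen.startDocument()
gen.startElement("doc", {})
gen.characters("caf\u00e9")  # non-ASCII text, so the encoding matters
gen.endElement("doc")
gen.endDocument()

data = buf.getvalue()  # bytes, encoded as declared in the XML declaration
```

The point is that `data` is bytes and that the declaration and the payload agree on the encoding.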
msg90823 - (view) Author: Craig Holmquist (craigh) Date: 2009-07-22 21:43
To clarify the specific problem:

- If the file object passed to XMLGenerator is opened in binary mode,
XMLGenerator raises TypeError as soon as it tries to write to it.
- If the passed file object is opened in text mode, XMLGenerator writes
the prescribed encoding to the XML declaration, but everything is
actually written using the file object's own encoding.
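The second case can be sketched as follows. This is only an illustration, assuming the generator uses a text stream as-is (which is the behaviour described in the list above); the stream setup and sample text are made up.

```python
import io
from xml.sax.saxutils import XMLGenerator

raw = io.BytesIO()
# Text-mode stream with its own encoding (UTF-8 here).
text_out = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

# The generator is told a different encoding than the stream uses.
gen = XMLGenerator(text_out, encoding="iso-8859-1")
gen.startDocument()
gen.startElement("doc", {})
gen.characters("caf\u00e9")
gen.endElement("doc")
gen.endDocument()
text_out.flush()

# The declaration names iso-8859-1, but the bytes come from the
# stream's own encoder.
data = raw.getvalue()
```

Under that assumption, `data` declares iso-8859-1 while the non-ASCII character is stored as a UTF-8 byte pair, so the declaration and the content disagree.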
msg90826 - (view) Author: Craig Holmquist (craigh) Date: 2009-07-22 22:02
Patch attached.  This patch doesn't actually restrict the output object
to RawIOBase (that wouldn't work well, since files opened as binary are
actually derived from BufferedIOBase).  Instead, it just assumes the
output object has a 'write' method that accepts a single bytes argument.
Also, XMLGenerator no longer needs to check if the input is str or unicode.
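A rough sketch of that duck-typed approach, not the patch itself: the text is encoded first and handed to whatever object has a `write` method taking bytes. The names `write_encoded` and `ListSink` are hypothetical, and the `xmlcharrefreplace` error handler is an assumption of this sketch.

```python
def write_encoded(out, text, encoding="iso-8859-1"):
    """Hypothetical helper: encode text, then write bytes to out.

    The only requirement on out is a write() method accepting bytes.
    """
    # Characters the encoding cannot represent become numeric
    # character references instead of raising UnicodeEncodeError.
    out.write(text.encode(encoding, "xmlcharrefreplace"))


class ListSink:
    """Minimal object outside the io hierarchy, with just write()."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        if not isinstance(data, bytes):
            raise TypeError("bytes expected")
        self.chunks.append(data)


sink = ListSink()
write_encoded(sink, "caf\u00e9 \u20ac")
result = b"".join(sink.chunks)
```

Because nothing here checks for RawIOBase or BufferedIOBase, the same helper works with files opened in binary mode, io.BytesIO, or a socket wrapper.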
msg90829 - (view) Author: Craig Holmquist (craigh) Date: 2009-07-22 22:35
Actually, that patch may not work so well either... out defaults to
sys.stdout, but that can't accept bytes.
msg90831 - (view) Author: Craig Holmquist (craigh) Date: 2009-07-22 22:53
This new patch removes the "default to stdout" behavior.
msg90832 - (view) Author: Craig Holmquist (craigh) Date: 2009-07-22 22:54
Patch for documentation.
msg91310 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-08-05 10:59
You shouldn't remove the defaulting behaviour for `out`, but use
`sys.stdout.buffer` instead.
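A minimal sketch of that defaulting, assuming a hypothetical helper named `resolve_out` (not part of the patch or of saxutils):

```python
import io
import sys


def resolve_out(out=None):
    """Hypothetical helper: choose the byte stream to write to."""
    if out is None:
        # sys.stdout is a text stream; its .buffer attribute is the
        # underlying binary stream, which accepts bytes.
        return sys.stdout.buffer
    return out


sink = io.BytesIO()
resolve_out(sink).write(b"<doc/>")
```

Note that `sys.stdout.buffer` is missing when stdout has been replaced by an object such as io.StringIO, so real code may need a fallback for that case.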

Bonus points if you add a test so that this kind of bug doesn't go
unnoticed again.

PS: it's ironic that the default encoding here is iso-8859-1. This piece
of code is really getting old.
msg165510 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-07-15 07:38
This issue will be fixed by the patch for issue1470548.
History
Date                 User              Action  Args
2012-10-24 09:45:17  serhiy.storchaka  set     status: open -> closed
2012-08-05 11:14:07  serhiy.storchaka  set     superseder: xml.sax.saxutils.XMLGenerator cannot output UTF-16
                                               resolution: duplicate
2012-07-15 07:38:28  serhiy.storchaka  set     nosy: + serhiy.storchaka
                                               messages: + msg165510
2009-08-05 10:59:32  pitrou            set     nosy: + pitrou
                                               messages: + msg91310
2009-07-22 22:56:08  craigh            set     files: - xmlgen.patch
2009-07-22 22:54:23  craigh            set     files: + xmlgen-doc.patch
                                               messages: + msg90832
2009-07-22 22:53:26  craigh            set     files: + xmlgen2.patch
                                               messages: + msg90831
2009-07-22 22:35:26  craigh            set     messages: + msg90829
2009-07-22 22:02:25  craigh            set     files: + xmlgen.patch
                                               keywords: + patch
                                               messages: + msg90826
2009-07-22 21:43:31  craigh            set     nosy: + craigh
                                               messages: + msg90823
2009-01-19 10:35:57  kawai             create