msg50009 - (view) |
Author: Nikolai Grigoriev (ngrig) |
Date: 2006-04-14 20:21 |
This is a patch to bug #1470540. It enables
xml.sax.saxutils.XMLGenerator to work correctly with
UTF-16 (and other encodings not derived from US-ASCII).
The proposed changes are as follows:
- in XMLGenerator.__init__(), create a StreamWriter
instead of a plain stream;
- in XMLGenerator._write(), convert everything to
Unicode before writing;
- in XMLGenerator.endDocument(), flush the StreamWriter.
The patch is applicable to xml/sax/saxutils.py in the
stable release (2.4.3), as well as to
xmlcore/sax/saxutils.py in the current release (2.5).
The smoke test is attached to the bug description in
the Bug Manager.
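For illustration only, the three steps listed above can be sketched in Python 3 rather than in the 2.4/2.5 code the patch targets. The helper name write_chunks is hypothetical and is not taken from the patch, and byte-string input is assumed to be plain ASCII.

```python
import codecs
import io


def write_chunks(chunks, encoding="utf-16"):
    """Write str/bytes chunks through a codecs StreamWriter; return the bytes."""
    raw = io.BytesIO()
    # Step 1: wrap the plain stream in a StreamWriter for the target encoding.
    writer = codecs.getwriter(encoding)(raw)
    for chunk in chunks:
        # Step 2: convert everything to text before writing.
        # Assumption: byte-string input contains only ASCII.
        if isinstance(chunk, bytes):
            chunk = chunk.decode("ascii")
        writer.write(chunk)
    # Step 3: flush at the end of the document (delegated to the wrapped stream).
    writer.flush()
    return raw.getvalue()
```

Because a single writer encodes the whole stream, the UTF-16 byte order mark is emitted once at the start instead of once per write.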
Regards,
Nikolai Grigoriev
|
msg66684 - (view) |
Author: Georg Brandl (georg.brandl) * |
Date: 2008-05-11 22:03 |
Won't this present backwards-compatibility problems if non-ASCII str
content is written?
|
msg114654 - (view) |
Author: Mark Lawrence (BreamoreBoy) * |
Date: 2010-08-22 09:30 |
There are no unit tests or doc changes with the patch. Can anyone answer Georg's question in msg66684?
|
msg161764 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-05-28 10:43 |
See also issue1767933.
It would be better to use io.TextIOWrapper instead of codecs.StreamWriter, because the latter is slower and has numerous flaws.
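A minimal sketch of the io.TextIOWrapper approach in Python 3. The helper name encode_with_textiowrapper is hypothetical, and this is not the code from any of the attached patches.

```python
import io


def encode_with_textiowrapper(chunks, encoding="utf-16"):
    """Encode text chunks by writing them through io.TextIOWrapper."""
    raw = io.BytesIO()
    # newline="\n" disables platform-specific newline translation.
    text = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    for chunk in chunks:
        text.write(chunk)
    # TextIOWrapper buffers internally, so flush before reading the bytes.
    text.flush()
    data = raw.getvalue()
    # Detach so the wrapper does not close `raw` when it is garbage collected.
    text.detach()
    return data
```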
|
msg161767 - (view) |
Author: Walter Dörwald (doerwalter) * |
Date: 2012-05-28 11:07 |
An alternative would be to use an incremental encoder instead of a StreamWriter (which is what TextIOWrapper does internally).
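A small sketch of the incremental-encoder idea, with hypothetical helper names. It contrasts one stateful encoder with encoding each chunk independently, which repeats the UTF-16 byte order mark for every chunk.

```python
import codecs


def encode_incrementally(chunks, encoding="utf-16"):
    """Encode chunks with one stateful encoder, so the BOM appears once."""
    encoder = codecs.getincrementalencoder(encoding)()
    parts = [encoder.encode(chunk) for chunk in chunks]
    # Signal end of input so the encoder can emit any pending state.
    parts.append(encoder.encode("", final=True))
    return b"".join(parts)


def encode_naively(chunks, encoding="utf-16"):
    """Encode each chunk on its own; every chunk gets its own BOM."""
    return b"".join(chunk.encode(encoding) for chunk in chunks)
```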
|
msg161933 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-05-30 07:57 |
Oh, I see that XMLGenerator is completely outdated. It has not even been ported to Python 3. See the function _write:
    def _write(self, text):
        if isinstance(text, str):
            self._out.write(text)
        else:
            self._out.write(text.encode(self._encoding, _error_handling))
In Python 2 there was a choice between byte strings and unicode strings. But in Python 3 the encoding step never happens.
XMLGenerator does not distinguish between binary and text streams.
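To illustrate why the distinction matters in Python 3, here is a short sketch with a hypothetical helper, accepts: a text stream rejects bytes and a binary stream rejects str, so a writer that always passes str through unencoded only works with text streams.

```python
import io


def accepts(stream, data):
    """Return True if stream.write(data) succeeds, False on TypeError."""
    try:
        stream.write(data)
    except TypeError:
        return False
    return True
```

For example, accepts(io.BytesIO(), "<doc/>") is False, while accepts(io.StringIO(), "<doc/>") is True.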
Here is a patch that fixes XMLGenerator in Python 3. Unfortunately, some loss of backward compatibility is unavoidable. I tried to keep the code working for the most common cases, but some code which "worked" before may break (I also had to correct some tests).
|
msg162851 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-06-15 07:20 |
The patch is updated to reflect Martin's comments. I hope the old behavior is now preserved in the cases most used in practice. The tests are converted to work with bytes instead of strings.
|
msg163740 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-06-24 07:20 |
It would be nice to fix this bug before the 3.3.0b1 release clone is forked.
|
msg165509 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-07-15 07:08 |
Here is an updated patch with more careful handling of closing (as for issue1767933) and added comments.
|
msg172205 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-10-06 15:10 |
Ping.
|
msg175472 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-11-12 20:44 |
If nobody has any objections, why not apply this patch?
|
msg178326 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-12-27 20:45 |
If no one objects I will commit this next year.
|
msg178369 - (view) |
Author: Georg Brandl (georg.brandl) * |
Date: 2012-12-28 07:26 |
I'd like Antoine to have a look at all that io stuff. It looks quite bloated.
In your except clause, you're not calling self._close.
|
msg179942 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2013-01-14 13:35 |
Patch updated. Fixed the error that Georg found. Restored testing of XMLGenerator with StringIO, as Antoine pointed out. XMLGenerator is now tested with StringIO, BytesIO and a user-defined writer. Added tests for encoding.
|
msg180297 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2013-01-20 15:32 |
Patch updated. I got rid of __del__ to prevent hanging on reference cycles, as Antoine suggested on IRC. Added a test to check that XMLGenerator doesn't close the file passed as an argument.
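The behaviour these tests target can be sketched against a current Python 3 as follows. The helper name emit_document is hypothetical, and this is not the test code from the patch: the same generator code is run against a binary and a text stream, and the stream is checked to still be open afterwards.

```python
import io
from xml.sax.saxutils import XMLGenerator


def emit_document(out, encoding):
    """Write a tiny document to `out` with XMLGenerator."""
    gen = XMLGenerator(out, encoding=encoding)
    gen.startDocument()
    gen.startElement("doc", {})
    gen.characters("hi")
    gen.endElement("doc")
    gen.endDocument()


binary_out = io.BytesIO()
emit_document(binary_out, "utf-16")  # bytes encoded as UTF-16

text_out = io.StringIO()
emit_document(text_out, "utf-16")    # text stream receives str unchanged
```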
|
msg181797 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2013-02-10 12:38 |
New changeset 010b455de0e0 by Serhiy Storchaka in branch '2.7':
Issue #1470548: XMLGenerator now works with UTF-16 and UTF-32 encodings.
http://hg.python.org/cpython/rev/010b455de0e0
New changeset 66f92f76b2ce by Serhiy Storchaka in branch '3.2':
Issue #1470548: XMLGenerator now works with binary output streams.
http://hg.python.org/cpython/rev/66f92f76b2ce
New changeset 03b878d636cf by Serhiy Storchaka in branch '3.3':
Issue #1470548: XMLGenerator now works with binary output streams.
http://hg.python.org/cpython/rev/03b878d636cf
New changeset 12d75ca12ae7 by Serhiy Storchaka in branch 'default':
Issue #1470548: XMLGenerator now works with binary output streams.
http://hg.python.org/cpython/rev/12d75ca12ae7
|
msg182819 - (view) |
Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * |
Date: 2013-02-23 20:50 |
The change in the 2.7 branch breaks some software, including a Django test (produce_xml_fragment from https://github.com/django/django/blob/1.4.5/tests/regressiontests/test_utils/tests.py).
The problem does not seem to occur with Python 3.2, 3.3 and 3.4.
Before 010b455de0e0:
>>> from StringIO import StringIO
>>> from xml.sax.saxutils import XMLGenerator
>>> stream = StringIO()
>>> xml = XMLGenerator(stream, encoding='utf-8')
>>> xml.startElement("foo", {"aaa": "1.0", "bbb": "2.0"})
>>> xml.characters("Hello")
>>> xml.endElement("foo")
>>> xml.startElement("bar", {"ccc": "3.0", "ddd": "4.0"})
>>> xml.endElement("bar")
>>> stream.getvalue()
'<foo aaa="1.0" bbb="2.0">Hello</foo><bar ccc="3.0" ddd="4.0"></bar>'
>>>
After 010b455de0e0:
>>> from StringIO import StringIO
>>> from xml.sax.saxutils import XMLGenerator
>>> stream = StringIO()
>>> xml = XMLGenerator(stream, encoding='utf-8')
>>> xml.startElement("foo", {"aaa": "1.0", "bbb": "2.0"})
>>> xml.characters("Hello")
>>> xml.endElement("foo")
>>> xml.startElement("bar", {"ccc": "3.0", "ddd": "4.0"})
>>> xml.endElement("bar")
>>> stream.getvalue()
''
>>>
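One plausible reading of the empty result, judging only by the title of the follow-up changeset d707e3345a74 ("Do not buffer XMLGenerator output"), is that the fragment was still sitting in a text-layer buffer, because endDocument() is never called when only a fragment is produced. The Python 3 sketch below shows that general buffering effect with io.TextIOWrapper. It is an illustration under that assumption, not the 2.7 code, and the helper name visible_before_flush is hypothetical.

```python
import io


def visible_before_flush(write_through):
    """Return (bytes visible before flush, bytes visible after flush)."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="utf-8", write_through=write_through)
    text.write("<foo></foo>")
    seen = raw.getvalue()       # what a reader of `raw` sees right away
    text.flush()
    complete = raw.getvalue()   # what it sees once the wrapper is flushed
    text.detach()               # keep `raw` open when the wrapper goes away
    return seen, complete
```

With write_through=False the short write stays in the wrapper until flush(); with write_through=True it reaches the underlying stream immediately.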
|
msg182861 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2013-02-24 09:08 |
Thank you for the report. Here is a patch which fixes this bug.
|
msg182892 - (view) |
Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * |
Date: 2013-02-24 20:52 |
This patch works for me.
|
msg182930 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2013-02-25 11:32 |
New changeset d707e3345a74 by Serhiy Storchaka in branch '2.7':
Issue #1470548: Do not buffer XMLGenerator output.
http://hg.python.org/cpython/rev/d707e3345a74
|
msg182931 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2013-02-25 11:49 |
New changeset 1c03e499cdc2 by Serhiy Storchaka in branch '3.2':
Issue #1470548: Add test for fragment producing with XMLGenerator.
http://hg.python.org/cpython/rev/1c03e499cdc2
New changeset 5a4b3094903f by Serhiy Storchaka in branch '3.3':
Issue #1470548: Add test for fragment producing with XMLGenerator.
http://hg.python.org/cpython/rev/5a4b3094903f
New changeset 810d70fb17a2 by Serhiy Storchaka in branch 'default':
Issue #1470548: Add test for fragment producing with XMLGenerator.
http://hg.python.org/cpython/rev/810d70fb17a2
|
msg185644 - (view) |
Author: Sebastian Ortiz Vasquez (neoecos) |
Date: 2013-03-31 19:33 |
I have been working with this in order to generate an RSS feed using web2py.
I found that the XMLGenerator characters() method does not check whether its argument is a unicode or a str object, and it does not decode according to the encoding parameter of the XMLGenerator.
I changed the method to check whether the content is a unicode object and, if it is not, to try to convert it using the desired encoding.
Recall that _write (UnbufferedTextIOWrapper) receives a unicode object as its parameter.
    def characters(self, content):
        if isinstance(content, unicode):
            self._write(escape(content))
        else:
            self._write(escape(unicode(content, self._encoding)))
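For readers on Python 3, where the unicode type no longer exists, the same idea can be sketched as follows. The helper names to_text and escaped_characters are hypothetical and are not part of xml.sax.saxutils. The sketch assumes byte input is encoded in the generator's declared encoding.

```python
from xml.sax.saxutils import escape


def to_text(content, encoding):
    """Decode bytes with the given encoding; pass str through unchanged."""
    if isinstance(content, bytes):
        return content.decode(encoding)
    return content


def escaped_characters(content, encoding="utf-8"):
    """Return the escaped text that a characters() call would write."""
    return escape(to_text(content, encoding))
```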
|
msg185682 - (view) |
Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * |
Date: 2013-03-31 21:51 |
Sebastian Ortiz Vasquez: Please file a new issue and attach a patch (in unified format) instead of a whole Python module.
|
|
Date | User | Action | Args
2022-04-11 14:56:16 | admin | set | github: 43215
2013-03-31 22:03:43 | Arfrever | set | versions: + Python 3.2, Python 3.3, Python 3.4
2013-03-31 21:51:15 | Arfrever | set | messages: + msg185682; title: Bugfix for #1470540 (XMLGenerator cannot output UTF-16 or UTF-8) -> xml.sax.saxutils.XMLGenerator cannot output UTF-16
2013-03-31 19:33:14 | neoecos | set | files: + saxutils.py; nosy: + neoecos; versions: - Python 3.2, Python 3.3, Python 3.4; messages: + msg185644; title: Bugfix for #1470540 (XMLGenerator cannot output UTF-16) -> Bugfix for #1470540 (XMLGenerator cannot output UTF-16 or UTF-8)
2013-02-25 11:50:36 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; stage: resolved
2013-02-25 11:49:19 | python-dev | set | messages: + msg182931
2013-02-25 11:32:14 | python-dev | set | messages: + msg182930
2013-02-24 20:52:51 | Arfrever | set | messages: + msg182892
2013-02-24 09:08:15 | serhiy.storchaka | set | files: + XMLGenerator_fragment-2.7.patch; messages: + msg182861
2013-02-23 20:50:30 | Arfrever | set | status: closed -> open; priority: normal -> release blocker; nosy: + Arfrever, benjamin.peterson, larry; messages: + msg182819; resolution: fixed -> (no value); stage: resolved -> (no value)
2013-02-10 15:23:06 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved
2013-02-10 12:38:00 | python-dev | set | nosy: + python-dev; messages: + msg181797
2013-01-20 15:32:51 | serhiy.storchaka | set | files: + XMLGenerator-5.patch; messages: + msg180297
2013-01-14 13:36:14 | serhiy.storchaka | set | stage: needs patch -> patch review
2013-01-14 13:35:33 | serhiy.storchaka | set | keywords: - easy; files: + XMLGenerator-4.patch; messages: + msg179942
2012-12-30 18:40:40 | serhiy.storchaka | set | stage: patch review -> needs patch
2012-12-28 07:26:14 | georg.brandl | set | nosy: + pitrou; messages: + msg178369
2012-12-27 20:47:56 | serhiy.storchaka | set | assignee: serhiy.storchaka
2012-12-27 20:45:56 | serhiy.storchaka | set | messages: + msg178326
2012-11-12 20:44:06 | serhiy.storchaka | set | messages: + msg175472
2012-10-24 09:02:24 | serhiy.storchaka | set | stage: patch review
2012-10-20 20:09:40 | serhiy.storchaka | set | keywords: + needs review; stage: test needed -> (no value); versions: + Python 3.4, - Python 3.1
2012-10-06 15:10:51 | serhiy.storchaka | set | messages: + msg172205
2012-08-05 11:14:07 | serhiy.storchaka | link | issue4997 superseder
2012-07-20 06:58:46 | eli.bendersky | set | nosy: - eli.bendersky
2012-07-15 07:08:12 | serhiy.storchaka | set | files: + XMLGenerator-3.patch; nosy: + eli.bendersky; messages: + msg165509
2012-06-24 07:20:37 | serhiy.storchaka | set | messages: + msg163740
2012-06-15 07:20:50 | serhiy.storchaka | set | files: + XMLGenerator-2.patch; messages: + msg162851
2012-05-30 07:58:37 | serhiy.storchaka | set | nosy: + loewis
2012-05-30 07:57:37 | serhiy.storchaka | set | files: + XMLGenerator.patch; messages: + msg161933
2012-05-28 11:07:58 | doerwalter | set | nosy: + doerwalter; messages: + msg161767
2012-05-28 10:43:25 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg161764; versions: + Python 3.3
2010-08-22 09:30:57 | BreamoreBoy | set | nosy: + BreamoreBoy; messages: + msg114654; versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6
2009-04-05 13:45:12 | georg.brandl | link | issue1470540 superseder
2009-04-05 13:45:12 | georg.brandl | unlink | issue1470540 dependencies
2009-03-21 02:02:41 | ajaksu2 | set | stage: test needed; type: behavior; versions: + Python 2.6, - Python 2.5
2009-03-21 02:02:11 | ajaksu2 | link | issue1470540 dependencies
2008-05-11 22:03:08 | georg.brandl | set | nosy: + georg.brandl; messages: + msg66684
2008-01-21 13:57:10 | akuchling | set | keywords: + easy
2006-04-14 20:21:23 | ngrig | create |