
Author moriyoshi
Recipients moriyoshi
Date 2009-10-15.06:21:28
Message-id <1255587690.83.0.0369111720877.issue7139@psf.upfronthosting.co.za>
Content
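ElementTree does not correctly serialize end-of-line characters (#xA, 
#xD) in attribute values.  According to the specification [1], the parser 
converts literal end-of-line characters in an attribute value to #x20.  
An end-of-line character that is still present after parsing must 
therefore have been written as a character reference (&#10; or &#13;) in 
the original document, and it must be serialized as a character reference 
again.  If it is written out literally, the next parse turns it into a 
space and the value changes.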

[1] http://www.w3.org/TR/xml11/#AVNormalize   
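
The normalization rule in [1] can be observed on the parsing side. The following is a minimal sketch, assuming Python 3 and the standard xml.etree.ElementTree parser; the element name, attribute name, and sample value are made up for illustration and are not taken from the report.

```python
import xml.etree.ElementTree as ET

# A literal newline and a literal tab sit next to the character
# references &#10; and &#13; inside the same attribute value.
doc = '<foo attr="a\nb\tc&#10;d&#13;e"/>'

value = ET.fromstring(doc).get("attr")

# Literal whitespace is normalized to a space (#x20) by the parser,
# while the characters written as references survive as real LF and CR.
print(repr(value))
```

Only the characters that were written as references reach the application unchanged, which is why a serializer has to emit them as references again.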

### sample code

from xml.etree.ElementTree import ElementTree
from cStringIO import StringIO

# builder = ElementTree(file=StringIO("<foo>\x0d</foo>"))
# out = StringIO()
# builder.write(out)
# print out.getvalue()

out = StringIO()
ElementTree(file=StringIO(
'''<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE foo [
<!ELEMENT foo (#PCDATA)>
<!ATTLIST foo attr CDATA "">
]>
<foo attr="   test
&#13;test&#32; test&#10;a  ">&#10;</foo>
''')).write(out)
# should be: <foo attr="   test &#13;test  test&#10;a  ">\x0a</foo>
print out.getvalue()

out = StringIO()
ElementTree(file=StringIO(
'''<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE foo [
<!ELEMENT foo (#PCDATA)>
<!ATTLIST foo attr NMTOKENS "">
]>
<foo attr="   test
&#13;test&#32; test&#10;a  ">&#10;</foo>
''')).write(out)
# should be: <foo attr="test &#13;test test&#10;a">\x0a</foo>
print out.getvalue()
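
The expected output in the comments above amounts to escaping CR and LF as character references when an attribute value is written. Here is a minimal sketch of that idea, assuming Python 3. The helper name `escape_attr_value` is hypothetical and this is not ElementTree's actual serializer code. Escaping the tab character as well is an extra assumption, made because the parser normalizes it the same way.

```python
import xml.etree.ElementTree as ET


def escape_attr_value(text):
    """Escape text for a double-quoted XML attribute value (hypothetical helper)."""
    # Escape '&' first so the references added below are not escaped again.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    # Whitespace that a parser would turn into #x20 if written literally.
    text = text.replace("\r", "&#13;")
    text = text.replace("\n", "&#10;")
    text = text.replace("\t", "&#9;")
    return text


# The attribute value from the CDATA sample, as it looks after parsing.
original = "   test \rtest  test\na  "
escaped = escape_attr_value(original)

# Re-parsing the escaped form should give back the same value.
roundtrip = ET.fromstring('<foo attr="%s"/>' % escaped).get("attr")

print(escaped)
print(roundtrip == original)
```

With this escaping, the value survives a serialize/parse round trip; written literally, the CR and LF would come back as spaces.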
History
Date User Action Args
2009-10-15 06:21:30 moriyoshi set recipients: + moriyoshi
2009-10-15 06:21:30 moriyoshi set messageid: <1255587690.83.0.0369111720877.issue7139@psf.upfronthosting.co.za>
2009-10-15 06:21:29 moriyoshi link issue7139 messages
2009-10-15 06:21:28 moriyoshi create