Message 87676 - Python tracker



Author	Tomalak
Recipients	Tomalak, ajaksu2, sechi_francesco
Date	2009-05-13.11:44:33
SpamBayes Score	0.0
Marked as misclassified	No
Message-id	<1242215077.11.0.489340985564.issue5752@psf.upfronthosting.co.za>
In-reply-to

Content

Francesco,

> if you want to encode the newline character, 
> this should be done by both parseString and 
> setAttribute methods. Otherwise, the 
> behaviour is not symmetric.

I believe you still don't see the issue. The behaviour is not symmetric
*now*. You store a '\n' in an attribute value with setAttribute(), save
the document to XML, load it again and out comes a space where the '\n'
should have been.

The point is that parseString() behaves correctly, but serializing does
not. There is only one side to fix, because only one side is broken.
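The asymmetry can be checked with a short script. This is only a sketch: the helper name roundtrip and the sample values are made up for illustration, and what the serializer emits for the '\n' depends on the Python version in use, so no output is claimed for that step.

```python
from xml.dom.minidom import Document, parseString


def roundtrip(value):
    """Store value in an attribute, serialize, parse again, return both results."""
    doc = Document()
    root = doc.createElement("root")
    root.setAttribute("a", value)
    doc.appendChild(root)
    xml_text = doc.toxml()
    reparsed = parseString(xml_text)
    return xml_text, reparsed.documentElement.getAttribute("a")


# Serializing side: inspect what was written and what comes back.
xml_text, value = roundtrip("x\ny")
print(repr(xml_text))
print(repr(value))

# Parsing side: a character reference becomes a real newline in the DOM,
# while a literal newline in an attribute value is normalized to a space.
from_reference = parseString('<root a="x&#10;y"/>')
from_literal = parseString('<root a="x\ny"/>')
print(repr(from_reference.documentElement.getAttribute("a")))
print(repr(from_literal.documentElement.getAttribute("a")))
```

If the first pair of prints shows a raw newline inside the attribute and a space after reloading, the serializer is the broken side described above.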

> If you want to encode the newline in different 
> manner, you should develop a patch that
> introduces this kind of encoding in both 
> parseString and setAttribute methods.

It would be pointless to do the encoding in setAttribute(). The valid
ways to XML-encode a '\n' character are '&#xA;', '&#x0A;' or '&#10;'.
Doing so in setAttribute() would produce doubly encoded output, like
this: '&amp;#10;'. This is even more wrong.
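A quick way to see the double encoding, as a minimal sketch (the element and attribute names are arbitrary): the DOM treats whatever is passed to setAttribute() as literal text, so the ampersand of a pre-encoded reference is escaped again on output.

```python
from xml.dom.minidom import Document, parseString

doc = Document()
root = doc.createElement("root")
# The caller "helpfully" pre-encodes the newline as a character reference.
root.setAttribute("a", "x&#10;y")
doc.appendChild(root)

xml_text = root.toxml()
print(xml_text)  # the '&' is escaped, so the reference is no longer a reference

# After reloading, the attribute holds the six literal characters of the
# reference text, not a newline.
reloaded = parseString(xml_text).documentElement.getAttribute("a")
print(repr(reloaded))
```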

However, if parseString() encounters a '&#10;' in the input, it correctly
translates this to '\n' in the DOM. As I said, there is nothing to fix
in parsing; this exercise is about getting minidom to actually *output*
a '&#10;' where appropriate. :-)
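To illustrate what output-side escaping could look like, here is a sketch only. escape_attr is a hypothetical helper written for this example, not minidom's code and not a proposed patch; it escapes the markup characters and then writes tab, newline and carriage return as numeric character references.

```python
from xml.dom.minidom import parseString


def escape_attr(value):
    """Hypothetical helper: escape a string for use inside a double-quoted attribute."""
    # '&' must be handled first so the references added below are not re-escaped.
    value = value.replace("&", "&amp;").replace("<", "&lt;").replace('"', "&quot;")
    # Write whitespace that the parser would otherwise normalize as references.
    for ch in "\t\n\r":
        value = value.replace(ch, "&#%d;" % ord(ch))
    return value


original = 'line1\nline2 & "quoted"'
xml_text = '<root a="%s"/>' % escape_attr(original)
print(xml_text)

# The parser turns the reference back into a real newline, so the value survives.
restored = parseString(xml_text).documentElement.getAttribute("a")
print(restored == original)
```

With escaping done only at serialization time, the DOM keeps holding the plain '\n' and parseString() needs no change.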

History
Date	User	Action	Args
2009-05-13 11:44:37	Tomalak	set	recipients: + Tomalak, ajaksu2, sechi_francesco
2009-05-13 11:44:37	Tomalak	set	messageid: <1242215077.11.0.489340985564.issue5752@psf.upfronthosting.co.za>
2009-05-13 11:44:34	Tomalak	link	issue5752 messages
2009-05-13 11:44:33	Tomalak	create