Author ezio.melotti
Recipients devon, effbot, ezio.melotti, moriyoshi
Date 2009-11-02.22:06:33
Message-id <1257199596.62.0.384083201988.issue7139@psf.upfronthosting.co.za>
In-reply-to
Content
If I understood correctly, the correct behavior while reading is:
  * literal newlines (\n or \r) and tabs (\t) should be collapsed and
converted to a space
  * newlines (&#xA; or &#xD;) and tabs (&#x9;) as entities should be
converted to the literal equivalents (\n, \r and \t)

(See http://www.w3.org/TR/2000/WD-xml-c14n-20000119.html#charescaping)

This should be ok in both xml.dom.minidom and etree.
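
A minimal sketch of the reading behavior described above, assuming the
discussion is about attribute values (which is where the XML parser
normalizes literal whitespace). The document strings and variable names
are made up for illustration:

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Literal newline and tab inside an attribute value.
literal_doc = '<root attr="a\nb\tc"/>'
# The same characters written as character references.
entity_doc = '<root attr="a&#xA;b&#x9;c"/>'

et_literal = ET.fromstring(literal_doc).get("attr")
et_entity = ET.fromstring(entity_doc).get("attr")

dom_literal = minidom.parseString(literal_doc).documentElement.getAttribute("attr")
dom_entity = minidom.parseString(entity_doc).documentElement.getAttribute("attr")

# Literal whitespace comes back as spaces; character references
# come back as the real \n and \t characters.
print(repr(et_literal), repr(dom_literal))
print(repr(et_entity), repr(dom_entity))
```

Both parsers sit on top of expat, so they are expected to agree here.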


While writing, on the other hand, if literal newlines and tabs are
written as they are (\n, \r and \t), they can't be read back during the
parsing phase because they are collapsed and converted to a space. They
should therefore be converted to entities (&#xA;, &#xD; and &#x9;)
automatically, but this could be incompatible with the current behavior
(i.e. a \n, \r or \t that is now written literally and collapsed to a
space during parsing would then become significant).
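
To illustrate the round-trip problem, here is a rough sketch. The helper
escape_attr_whitespace is hypothetical (it is not part of minidom or
etree) and only shows what "converted to entities automatically" could
look like for an attribute value; the documents are built by hand so the
result does not depend on what a given serializer version emits:

```python
import xml.etree.ElementTree as ET

def escape_attr_whitespace(value):
    """Hypothetical helper: escape an attribute value so that \\r, \\n
    and \\t survive parsing instead of being normalized to spaces."""
    # Escape markup characters first, then the whitespace characters.
    value = value.replace("&", "&amp;").replace("<", "&lt;").replace('"', "&quot;")
    return (value.replace("\r", "&#xD;")
                 .replace("\n", "&#xA;")
                 .replace("\t", "&#x9;"))

original = "line1\r\nline2\tend"

# Written as-is: the whitespace is lost when the document is parsed.
raw_doc = '<root attr="%s"/>' % original
lossy = ET.fromstring(raw_doc).get("attr")

# Written as character references: the value survives the round trip.
escaped_doc = '<root attr="%s"/>' % escape_attr_whitespace(original)
faithful = ET.fromstring(escaped_doc).get("attr")

print(repr(lossy))
print(repr(faithful))
```

The compatibility concern is visible here too: a document produced with
such escaping would parse to a different attribute value than the same
data written literally.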

Moriyoshi, can you confirm that what I said is correct and the problem
is similar to the one described in #5752?
I also closed #6492 as a duplicate of this.