classification
Title: xml.sax.saxutils.escape doesn't escape multiple characters safely
Type: enhancement Stage:
Components: Documentation, XML Versions: Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: docs@python, martin.panter, serhiy.storchaka, tylerjohnhughes, xiang.zhang
Priority: normal Keywords: patch

Created on 2016-06-30 23:28 by tylerjohnhughes, last changed 2016-07-03 14:17 by xiang.zhang.

Files
File name Uploaded Description
escapetest.py tylerjohnhughes, 2016-06-30 23:28 Behavior Example
issue27429.patch xiang.zhang, 2016-07-03 12:17
sax_escape_doc.patch xiang.zhang, 2016-07-03 14:17
Messages (5)
msg269634 - Author: tylerjohnhughes (tylerjohnhughes) * Date: 2016-06-30 23:28
The escape function appears to apply the entities dict in multiple passes over the whole string, one replacement per pass, rather than traversing the source string once and replacing each match from the entities dict. This produces invalid escape sequences when a replacement value contains a character that is itself one of the keys being replaced. I've attached a file to reproduce the behavior.
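The behavior described above can be reproduced with a short snippet. This is only a sketch and not the contents of the attached escapetest.py; the input string and the `";"` entity are assumptions chosen to trigger the problem.

```python
from xml.sax.saxutils import escape

# Assumed example input, not taken from the attached file.
# escape() first replaces "&", ">" and "<", then applies each entry
# of the entities dict to the already-escaped string.
result = escape("<;", {";": "&#59;"})

# The ";" that terminates "&lt;" is itself rewritten by the later pass,
# so the output no longer contains a valid "&lt;" reference.
print(result)  # &lt&#59;&#59;
```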
msg269763 - Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-03 12:17
I think this is a bug. Replacements should not override one another when escaping or unescaping. I've uploaded a patch to fix this.
msg269766 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-03 12:29
The purpose of xml.sax.saxutils.escape() is to escape characters that can't be used directly in XML: "&", "<", etc. Quotes are escaped in attributes. It shouldn't be used for replacing ";", because this character is itself used in escapes.

This is not a bug. If the function is used correctly, it works as expected.
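For comparison, a minimal sketch of the kind of use described in this message: passing a quote entity so that text can be placed inside a double-quoted attribute. The sample string is an assumption made for illustration.

```python
from xml.sax.saxutils import escape

# Assumed sample text destined for a double-quoted attribute value.
text = 'a "b" <c>'

# "&", "<" and ">" are always escaped; the quote is escaped only
# because it is passed explicitly in the entities dict.
attr_value = escape(text, {'"': "&quot;"})
print(attr_value)  # a &quot;b&quot; &lt;c&gt;
```

Here the key `"` does not appear in any replacement value, so the passes cannot interfere with each other.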
msg269767 - Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-03 12:38
I thought of that too. But the documentation doesn't say that some characters must not be used in the entities dict, so I think the implementation should behave correctly even when unexpected characters are passed in. If you'd rather not change the implementation, I think we should at least state the restriction in the documentation.
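One way an implementation could avoid the interference is to replace everything in a single pass. The following is only a sketch under that assumption; it is not the content of issue27429.patch, and `escape_single_pass` is a hypothetical name.

```python
import re

def escape_single_pass(data, entities=None):
    """Hypothetical single-pass variant of escape() (illustration only)."""
    # Built-in replacements, extended with the caller's entities.
    table = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
    table.update(entities or {})
    # Try longer keys first so a long key is not shadowed by its prefix.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Each match is replaced once; replacement text is never rescanned.
    return pattern.sub(lambda m: table[m.group(0)], data)

print(escape_single_pass("<;", {";": "&#59;"}))  # &lt;&#59;
```

Because the replacement text is never rescanned, the `;` that ends `&lt;` is left alone. The trade-off is that the result no longer depends on the order of the replacements, which differs from how the current function treats `&` first.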
msg269768 - Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-03 14:17
I've put a note in escape's doc.
History
Date User Action Args
2016-07-03 14:17:00 xiang.zhang set files: + sax_escape_doc.patch; messages: + msg269768
2016-07-03 12:43:18 serhiy.storchaka set assignee: docs@python; type: behavior -> enhancement; components: + Documentation; nosy: + docs@python
2016-07-03 12:38:36 xiang.zhang set messages: + msg269767
2016-07-03 12:29:28 serhiy.storchaka set messages: + msg269766
2016-07-03 12:17:43 xiang.zhang set files: + issue27429.patch; nosy: + martin.panter, serhiy.storchaka; messages: + msg269763; keywords: + patch
2016-07-01 10:14:24 xiang.zhang set nosy: + xiang.zhang
2016-06-30 23:28:04 tylerjohnhughes create