Message 135447 - Python tracker



Author	cdqzzy
Recipients	cdqzzy, ezio.melotti, lemburg, terry.reedy, vstinner
Date	2011-05-07.11:32:46
SpamBayes Score	4.578171e-09
Marked as misclassified	No
Message-id	<1304767967.63.0.73820705744.issue12016@psf.upfronthosting.co.za>
In-reply-to

Content

I do not have any documentation on this subject. However, I found that GNU iconv(1) behaves the way I propose. My reading of its source code suggests that iconv(1) treats all encodings equally, which I think should also be true for Python.
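To make the behavior under discussion concrete, here is a minimal sketch. It assumes Python 3.3 or later, where the release notes state that the CJK decoders skip only the first byte of an invalid sequence. The sample bytes and the helper name decode_lenient are illustrative assumptions, not taken from this message.

```python
# Assumed sample input: 0xFF cannot start a gb2312 sequence,
# and it is followed by an ASCII newline.
data = b"\xff\n"


def decode_lenient(raw, encoding="gb2312"):
    """Hypothetical helper: decode once with 'replace' and once with 'ignore'."""
    return raw.decode(encoding, "replace"), raw.decode(encoding, "ignore")


replaced, ignored = decode_lenient(data)

# On Python 3.3+ the newline should survive in both results;
# only the invalid 0xFF byte is replaced or dropped.
print(repr(replaced))
print(repr(ignored))
```

On interpreters older than 3.3 the trailing newline may be swallowed together with the invalid byte, which is the behavior this issue objects to.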

As for security concerns, I do not think the change to the decoding function itself would introduce any security vulnerabilities. If a security issue arises because of the proposed change, there must be improper code outside of Python, which is out of Python's control. That said, the proposed change is unlikely to introduce a new security vulnerability, as all it does in effect is retain a few ASCII characters in the output instead of removing them. In the WordPress issue, if we suppose that WordPress were written in Python and that the attacker used gb2312-encoded strings instead of gbk, then my proposed change would by chance fix the issue, as the backslash would be retained when the string is decoded.
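The gbk versus gb2312 scenario can be sketched as follows. This is only an illustration under stated assumptions: the byte pair 0xBF 0x5C is a commonly cited example of a lead byte followed by an escaping backslash and does not come from this message, and the results shown assume Python 3.3 or later.

```python
# Assumed attacker input: 0xBF followed by an ASCII backslash (0x5C),
# such as an escaping layer might produce by prefixing a quote with "\".
payload = b"\xbf\\"

# Under gbk the two bytes form one valid double-byte character,
# so the backslash is absorbed into that character.
as_gbk = payload.decode("gbk", "ignore")

# Under gb2312, 0x5C is not a valid trail byte. With the proposed behavior
# only the invalid lead byte is dropped and the backslash is kept.
as_gb2312 = payload.decode("gb2312", "ignore")

print("backslash kept under gbk:   ", "\\" in as_gbk)
print("backslash kept under gb2312:", "\\" in as_gb2312)
```

If the decoder instead discarded both bytes under gb2312, the backslash would vanish from the output, which is the case the paragraph above argues against.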

History
Date	User	Action	Args
2011-05-07 11:32:47	cdqzzy	set	recipients: + cdqzzy, lemburg, terry.reedy, vstinner, ezio.melotti
2011-05-07 11:32:47	cdqzzy	set	messageid: <1304767967.63.0.73820705744.issue12016@psf.upfronthosting.co.za>
2011-05-07 11:32:46	cdqzzy	link	issue12016 messages
2011-05-07 11:32:46	cdqzzy	create