Message 297512 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	martin.panter
Recipients	Nam.Nguyen, martin.panter, serhiy.storchaka, vstinner
Date	2017-07-02.11:44:04
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1498995844.64.0.369462620102.issue30713@psf.upfronthosting.co.za>
In-reply-to

Content
It might help if you explained why you want to make these changes. Otherwise I have to guess. Is a compromise between strictly rejecting all non-URL characters (not just control characters), versus leaving it up to user applications to validate their URLs? I guess it could partially prevent some newline injection problems like Issue 29606 (FTP) and Issue 30458 (HTTP). But how do we know it closes more security holes than it opens? I don’t understand the focus on these three functions. They are undocumented and more-or-less deprecated (Issue 27485). Why not focus on the “urlsplit” and “urlparse” functions first? Some of the changes seem to go too far, e.g. in the splithost("//hostname/u\nrl") test case, the hostname is fine, but it is not recognized. This would partially conflict the patch in Issue 13359, with proposes to percent-encode newlines after passing through “splithost”. And it would make the URL look like a relative URL, which is a potential security hole and reminds me of the open redirect bug report (Issue 23505).

It might help if you explained why you want to make these changes. Otherwise I have to guess. Is a compromise between strictly rejecting all non-URL characters (not just control characters), versus leaving it up to user applications to validate their URLs?

I guess it could partially prevent some newline injection problems like Issue 29606 (FTP) and Issue 30458 (HTTP). But how do we know it closes more security holes than it opens?

I don’t understand the focus on these three functions. They are undocumented and more-or-less deprecated (Issue 27485). Why not focus on the “urlsplit” and “urlparse” functions first?

Some of the changes seem to go too far, e.g. in the splithost("//hostname/u\nrl") test case, the hostname is fine, but it is not recognized. This would partially conflict the patch in Issue 13359, with proposes to percent-encode newlines after passing through “splithost”. And it would make the URL look like a relative URL, which is a potential security hole and reminds me of the open redirect bug report (Issue 23505).

History
Date	User	Action	Args
2017-07-02 11:44:04	martin.panter	set	recipients: + martin.panter, vstinner, Nam.Nguyen, serhiy.storchaka
2017-07-02 11:44:04	martin.panter	set	messageid: <1498995844.64.0.369462620102.issue30713@psf.upfronthosting.co.za>
2017-07-02 11:44:04	martin.panter	link	issue30713 messages
2017-07-02 11:44:04	martin.panter	create