Message 103430 - Python tracker


Author	r.david.murray
Recipients	Keegan.Carruthers-Smith, benjamin.peterson, eric.araujo, jjlee, ndim, orsenthil, pitrou, r.david.murray, sergiomb2, tlocke
Date	2010-04-17.20:44:50
Message-id	<1271537093.15.0.408520449645.issue2987@psf.upfronthosting.co.za>

Content
I don't know how deep you want to get into detecting invalid URIs, but with the new patch this one causes a parsing error that is probably worth dealing with:

  http://abc[xyz]jkl

Maybe a reasonable set of checks would be (in hostname) that if the part of the netloc after the @ contains a ']' or a '[', then it must start with a '[' and either end with a ']' or contain a ']:'.
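
To make the proposed rule concrete, here is a rough sketch of it as a standalone function. The name netloc_brackets_ok is made up for illustration and is not part of the patch or of urlparse; it takes the netloc string directly, and it assumes "after the @" means after the last '@'.

```python
def netloc_brackets_ok(netloc):
    """Hypothetical helper: apply the bracket rule described above.

    Assumption: the host[:port] part is whatever follows the last '@'.
    """
    hostport = netloc.rpartition('@')[2]
    if '[' in hostport or ']' in hostport:
        # Brackets are only acceptable as "[...]" or "[...]:port".
        return hostport.startswith('[') and (
            hostport.endswith(']') or ']:' in hostport)
    # No brackets at all: nothing for this rule to check.
    return True


for netloc in ('[::1]', '[::1]:80', 'user@[::1]:80', 'abc[xyz]jkl'):
    print(netloc, netloc_brackets_ok(netloc))
```

This is only the shape of the check, not full validation: it says nothing about what is between the brackets.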

I can also mess up your new checks with something like this:

  http://foo[bar@baz]

or even:

  http://foo[bar@baz:33]

Those don't fail, though; they just faithfully produce the nonsensical results implicit in the invalid URLs.  I think the above check logic in hostname would catch them, but it wouldn't catch this one:

  http://foo[bar@[bar]:33]

That may be OK, though, since as you noted earlier we aren't doing full URI validation.

Oh, and I notice that your test only covers the 'fast' path code; it doesn't exercise the general URI logic.
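
Something along these lines might cover both paths. This is a sketch, not the test from the patch: it assumes the fast path is the one taken for the 'http' scheme, so a second scheme such as 'ftp' goes through the general code, and it is written against Python 3's urllib.parse rather than the urlparse module under discussion. The helper name split_host_port is invented for the example.

```python
from urllib.parse import urlsplit


def split_host_port(scheme):
    """Hypothetical helper: parse the same IPv6 literal URL under a given scheme."""
    parts = urlsplit(scheme + '://[::1]:5432/foo/')
    return parts.hostname, parts.port


# 'http' is assumed to take the fast path; 'ftp' stands in for any other scheme.
results = {scheme: split_host_port(scheme) for scheme in ('http', 'ftp')}

for scheme, (host, port) in results.items():
    print(scheme, host, port)
```

Both schemes should give the same hostname and port; a difference would point at one of the two code paths.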

(Sorry I didn't review this issue earlier.)

History
Date	User	Action	Args
2010-04-17 20:44:53	r.david.murray	set	recipients: + r.david.murray, jjlee, orsenthil, pitrou, benjamin.peterson, ndim, eric.araujo, sergiomb2, tlocke, Keegan.Carruthers-Smith
2010-04-17 20:44:53	r.david.murray	set	messageid: <1271537093.15.0.408520449645.issue2987@psf.upfronthosting.co.za>
2010-04-17 20:44:51	r.david.murray	link	issue2987 messages
2010-04-17 20:44:50	r.david.murray	create