Author martin.panter
Recipients Björn.Lindqvist, martin.panter, orsenthil, r.david.murray
Date 2016-07-31.02:37:10
Message-id <1469932632.75.0.192070134574.issue27657@psf.upfronthosting.co.za>
In-reply-to
Content
The main backward compatibility consideration would be Issue 754016, but I don’t agree with the changes made for it, and would support reverting them. The original bug reporter wanted urlparse("1.2.3.4:80", "http") to be treated as the URL http://1.2.3.4:80, but the IP address was being parsed as a scheme, so the default “http” scheme was ignored.

The original fix (r83701) affected any URL that had a digit 0–9 immediately after the “scheme:” prefix. In such URLs, the scheme component was no longer parsed. A test case for “path:80” was added, and a demonstration of not parsing any scheme from www.cwi.nl:80/%7Eguido/Python.html was added in the documentation.
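
As a rough illustration of that first rule, here is a minimal sketch written from the description above. It is not the code from r83701; the function name, the SCHEME_CHARS set and the (scheme, rest) return shape are assumptions made for the example.

```python
import string

# Assumption: characters allowed in a scheme-like prefix (letters, digits, "+-.").
SCHEME_CHARS = set(string.ascii_letters + string.digits + "+-.")
DIGITS = set(string.digits)

def split_scheme_first_digit(url, default=""):
    """Hypothetical model of the first rule: do not parse a scheme when
    the character right after the colon is a digit 0-9."""
    prefix, colon, rest = url.partition(":")
    if not colon or not prefix or not set(prefix) <= SCHEME_CHARS:
        return default, url      # nothing that looks like "scheme:"
    if rest[:1] in DIGITS:
        return default, url      # treated like host:port, default scheme applies
    return prefix.lower(), rest

print(split_scheme_first_digit("1.2.3.4:80", "http"))        # ('http', '1.2.3.4:80')
print(split_scheme_first_digit("path:80"))                   # ('', 'path:80')
print(split_scheme_first_digit("mailto:1337@example.org"))   # ('', 'mailto:1337@example.org')
```

The last call shows the cost: a valid mailto URL loses its scheme just because the address starts with a digit.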

Later, the logic was altered to test whether the text after the colon looked like an integer (revision 495d12196487, Issue 11467). This restored proper parsing of clsid:85bbd92o-42a0-1o69-a2e4-08002b30309d and mailto:1337@example.org, although another URL given, javascript:123, remains misparsed. The documentation was subsequently adjusted in Issue 16932 to just demonstrate www.cwi.nl/%7Eguido/Python.html being parsed as a path.
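
The integer test can be modelled in the same way. Again this is only a sketch of the rule as described, not the code from revision 495d12196487; the function name and the use of int() on everything after the colon are assumptions for illustration.

```python
import string

# Assumption: characters allowed in a scheme-like prefix (letters, digits, "+-.").
SCHEME_CHARS = set(string.ascii_letters + string.digits + "+-.")

def split_scheme_int_test(url, default=""):
    """Hypothetical model of the second rule: do not parse a scheme when
    the text after the colon can be read as an integer."""
    prefix, colon, rest = url.partition(":")
    if not colon or not prefix or not set(prefix) <= SCHEME_CHARS:
        return default, url      # nothing that looks like "scheme:"
    try:
        int(rest)                # succeeds for "80", "123", and also "+31641044153"
    except ValueError:
        return prefix.lower(), rest
    return default, url          # looks like a port number, no scheme parsed

print(split_scheme_int_test("mailto:1337@example.org"))   # ('mailto', '1337@example.org')
print(split_scheme_int_test("javascript:123"))            # ('', 'javascript:123')
print(split_scheme_int_test("tel:+31641044153"))          # ('', 'tel:+31641044153')
```

In this model a signed number also passes the int() test, which is one way a URL like tel:+31641044153 could end up without its scheme.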

The logic was watered down to its current form by revision 9f6b7576c08c, Issue 14072. Now it tests for a non-digit anywhere after the scheme, so that tel:+31641044153 is again parsed properly. But it was pointed out that tel:1234 remains misparsed.
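
For comparison, a minimal sketch of the current rule as described above. It is not the code from revision 9f6b7576c08c; the function name is hypothetical, and treating an empty remainder as a valid scheme is an assumption.

```python
import string

# Assumption: characters allowed in a scheme-like prefix (letters, digits, "+-.").
SCHEME_CHARS = set(string.ascii_letters + string.digits + "+-.")
DIGITS = set(string.digits)

def split_scheme_non_digit(url, default=""):
    """Hypothetical model of the current rule: parse a scheme only when
    something other than a digit appears after the colon."""
    prefix, colon, rest = url.partition(":")
    if not colon or not prefix or not set(prefix) <= SCHEME_CHARS:
        return default, url      # nothing that looks like "scheme:"
    # Assumption: an empty remainder ("scheme:") still counts as a scheme.
    if not rest or any(c not in DIGITS for c in rest):
        return prefix.lower(), rest
    return default, url          # digits only, treated like host:port

print(split_scheme_non_digit("tel:+31641044153"))     # ('tel', '+31641044153')
print(split_scheme_non_digit("tel:1234"))             # ('', 'tel:1234')
print(split_scheme_non_digit("1.2.3.4:80", "http"))   # ('http', '1.2.3.4:80')
```

The "+" rescues the first tel: URL, but any URL whose remainder is all digits still loses its scheme.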

What’s the next step in the watering-down process? All the attempts so far break valid URLs in favour of special-casing inputs that are not valid URLs.