Author martin.panter
Recipients chris.jerdonek, martin.panter, xtreak
Date 2018-07-30.13:10:24
Message-id <1532956224.82.0.56676864532.issue34276@psf.upfronthosting.co.za>
In-reply-to
Content
This may be a very old regression (from 2002) caused by Issue 591713 and Mercurial rev. 554f975073a0. The original check for the double slash, added in 0d6bd391acd8, “escapes” a path beginning with a double slash by prefixing it with two more slashes (an empty “netloc”). That original behaviour should round-trip Chris’s problem URLs.
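
To make the failing case concrete, here is a minimal sketch of how “urlsplit” parses a “file:” URL with four leading slashes, plus a small helper for checking the round trip. The example URL and the helper name “round_trips” are illustrative assumptions, not taken from Chris’s report. Whether the round trip succeeds depends on the Python version, so no result is shown for it.

```python
from urllib.parse import urlsplit, urlunsplit

def round_trips(url):
    """Return True if splitting and re-joining gives back the same string."""
    return urlunsplit(urlsplit(url)) == url

# Illustrative URL (an assumption): empty netloc, path starting with "//".
parts = urlsplit("file:////foo")
print(repr(parts.netloc))  # ''
print(repr(parts.path))    # '//foo'

# The outcome here depends on the Python version in use.
print(round_trips("file:////foo"))
```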

I think the logic in “urlunsplit” should always add the extra double slash for the netloc, regardless of path, at least if a scheme is present and it is registered in “uses_netloc”. This should fix Chris’s instance of the bug, since “file:” is registered. There is already a patch in Issue 1722348 which should do this (although it includes other changes as well).

The double slash should also be escaped if no scheme is present. (The empty scheme string is already in “uses_netloc”.) This might satisfy Issue 23505.
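
The two rules above could be sketched roughly as follows. This is only an illustration under stated assumptions: “unsplit_with_netloc” is a hypothetical stand-in for “urlunsplit”, written from scratch here, and it is not the patch from Issue 1722348.

```python
from urllib.parse import urlsplit, uses_netloc

def unsplit_with_netloc(parts):
    """Hypothetical re-join that follows the rules proposed in this message."""
    scheme, netloc, path, query, fragment = parts
    url = path
    known = scheme in uses_netloc  # the empty scheme "" is registered too
    # Rule 1: a registered, non-empty scheme always gets the "//" prefix.
    # Rule 2: with no scheme, only a path starting with "//" is escaped.
    if netloc or (known and (scheme or path.startswith("//"))):
        if url and not url.startswith("/"):
            url = "/" + url
        url = "//" + netloc + url
    if scheme:
        url = scheme + ":" + url
    if query:
        url += "?" + query
    if fragment:
        url += "#" + fragment
    return url

print(unsplit_with_netloc(urlsplit("file:////foo")))  # file:////foo
print(unsplit_with_netloc(urlsplit("////foo")))       # ////foo
print(unsplit_with_netloc(urlsplit("foo/bar")))       # foo/bar
```

With no scheme, a relative path such as “foo/bar” is left alone. Adding the slashes there would turn it into an absolute path.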

IMO it would be better to do the escaping by default, for all schemes unknown to “urllib”, and to blacklist specific schemes like “mailto:” instead. But that would be out of scope for a bug fix.
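
As a rough illustration of that alternative, the decision could be reduced to a predicate like the one below. Both names are hypothetical, and the blacklist holds only “mailto” because that is the one scheme named above.

```python
# Hypothetical blacklist of schemes that never take a netloc.
NO_NETLOC_SCHEMES = frozenset({"mailto"})

def needs_empty_netloc(scheme, path):
    """Escape a leading "//" in the path unless the scheme is blacklisted."""
    return path.startswith("//") and scheme not in NO_NETLOC_SCHEMES

print(needs_empty_netloc("x-unknown", "//foo"))  # True
print(needs_empty_netloc("mailto", "//foo"))     # False
```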
History
Date User Action Args
2018-07-30 13:10:24 martin.panter set recipients: + martin.panter, chris.jerdonek, xtreak
2018-07-30 13:10:24 martin.panter set messageid: <1532956224.82.0.56676864532.issue34276@psf.upfronthosting.co.za>
2018-07-30 13:10:24 martin.panter link issue34276 messages
2018-07-30 13:10:24 martin.panter create