Message237999
There have been a few recent bug reports (Issue 23505, Issue 23636) that may be solved by the has_netloc proposal. So I am posting a patch implementing it. The changes were a bit more involved than I anticipated, but should still be usable.
I reused some of Stian’s tests; however, the results differ slightly in my patch, so that they match the existing behaviour:
* Never sets netloc, query, fragment to None
* Always leaves hostname as None rather than ""
* Retains username, password and port components in netloc
* Converts hostname to lowercase
Unfortunately I discovered that you cannot add non-empty __slots__ to namedtuple() subclasses; see Issue 17295 and Issue 1173475. Therefore, in my patch I have removed __slots__ from the SplitResult etc. classes, so that those classes can gain the has_netloc etc. attributes.
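To illustrate the limitation, here is a minimal sketch that does not use the patch itself. The names Base, WithSlots and WithoutSlots are hypothetical stand-ins for the real result classes; the point is only that a tuple subclass rejects non-empty __slots__, while a subclass without __slots__ gets an instance __dict__ and can carry an extra attribute.

```python
from collections import namedtuple

# Hypothetical stand-in for a urllib.parse result tuple.
Base = namedtuple("Base", "scheme netloc path")

try:
    # Non-empty __slots__ on a tuple subclass is rejected at class creation.
    class WithSlots(Base):
        __slots__ = ("has_netloc",)
    slots_error = None
except TypeError as exc:
    slots_error = exc

# Without __slots__, instances get a __dict__ and can hold extra attributes.
class WithoutSlots(Base):
    pass

result = WithoutSlots("file", "", "/path")
result.has_netloc = True
```

The cost of dropping __slots__ is a per-instance __dict__, which is presumably why the original classes declared empty __slots__ in the first place.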
I chose to make the default has_netloc value based on existing urlunsplit() behaviour:
>>> empty_netloc = ""
>>> SplitResult("mailto", empty_netloc, "chris@example.com", "", "").has_netloc
False
>>> SplitResult("file", empty_netloc, "/path", "", "").has_netloc
True
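For comparison, the existing urlunsplit() behaviour that these defaults mirror can be checked on an unpatched Python. This is only a sketch: emits_netloc_marker is a hypothetical helper written for this illustration, not part of the patch or of urllib.parse.

```python
from urllib.parse import urlunsplit

def emits_netloc_marker(scheme, path):
    """Hypothetical helper: does urlunsplit() write "//" for an empty netloc?"""
    url = urlunsplit((scheme, "", path, "", ""))
    return url.startswith(scheme + "://")

# "mailto" is not treated as a netloc scheme, so no "//" is added.
mailto_url = urlunsplit(("mailto", "", "chris@example.com", "", ""))
# "file" is, so an empty netloc still produces "//" before the path.
file_url = urlunsplit(("file", "", "/path", "", ""))

mailto_marker = emits_netloc_marker("mailto", "chris@example.com")
file_marker = emits_netloc_marker("file", "/path")
```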
I found out that the “urllib.robotparser” module uses a urlunparse(urlparse()) combination to normalize URLs, so it had to be changed. This is a backwards incompatibility of this proposal.
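As a rough illustration of that normalization idiom on an unpatched Python (the normalize name is hypothetical, and this is not the robotparser code itself), the round trip rewrites some URLs rather than reproducing them exactly:

```python
from urllib.parse import urlparse, urlunparse

def normalize(url):
    # Hypothetical wrapper around the round-trip idiom described above.
    return urlunparse(urlparse(url))

# An empty query marker is dropped by the round trip.
no_query = normalize("http://example.com/a?")
# A "file" URL without "//" gains an empty netloc marker.
file_url = normalize("file:/path")
```

Code relying on this rewriting would see different results if the round trip started preserving the original form, which is presumably the incompatibility in question.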
Date | User | Action | Args
2015-03-13 00:41:01 | martin.panter | set | recipients: + martin.panter, orsenthil, demian.brecht, soilandreyes
2015-03-13 00:41:00 | martin.panter | set | messageid: <1426207260.37.0.867917360519.issue22852@psf.upfronthosting.co.za>
2015-03-13 00:41:00 | martin.panter | link | issue22852 messages
2015-03-13 00:41:00 | martin.panter | create |