Message61268
urlparse module is incomplete. there is no convenient high-level function to parse url down into atomic chunks, urldecode query and bring it to array (or dictionary for that case), so that you can modify that dictionary and reassemble it into query again using nothing more than simple array manipulations.
This kind of function is universal and flexible in the same way that low-level API, but in comparison it allows to considerably speed up development process if the speech is about urls
I propose urlparseex(urlstring) function that will dissect the URL into dictionary of appropriate dictionaries or strings and decode all % entities
scheme 0 string
netloc 1 dictionary
username 1.1 string or whatever
password 1.2 string or whatever
server 1.3 hostname string
port 1.4 port integer
path 2 string
params 3 ordered dictionary of path components for the sake of reassembling them later (sorry, I have little pythons in my head to replace "ordered dictionary" with something more appropriate) where respective path part entry is also dictionary of parameters
query 4 dictionary
fragment 5 string
there must be also counterpart urlunparseex(dictionary) to reassemble url and reencode entities
Reasons behind the decision:
- 90% of time you need to decode % entities - this must be made by default (whoever need to let them encoded are in minor and may use other functions)
- atomic recursion format is needed to be able to easily change any url component and reassemble it back
- get simple swiss-army knife for high-level (read - logical) url operations in one module
http://docs.python.org/lib/module-urlparse.html
There is also this proposal below. It is a little bit different, but shows that after four years url handling problems are still actual.
http://sourceforge.net/tracker/index.php?func=detail&aid=600362&group_id=5470&atid=355470
|
|
Date |
User |
Action |
Args |
2008-01-20 09:59:52 | admin | link | issue1643370 messages |
2008-01-20 09:59:52 | admin | create | |
|