This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: mutable urlparse return type
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.4
process
Status: closed
Resolution: works for me
Dependencies:
Superseder:
Assigned To:
Nosy List: eric.araujo, ezio.melotti, mastahyeti, mhcptg, orsenthil, r.david.murray
Priority: normal
Keywords: patch

Created on 2012-08-30 18:17 by mastahyeti, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
urlparse_patch.patch mastahyeti, 2012-08-30 18:17 review
Messages (13)
msg169474 - (view) Author: mastahyeti (mastahyeti) Date: 2012-08-30 18:17
This patch removes the inheritance from namedtuple and attempts to add the necessary methods to make it backwards compatible.

When parsing a URL with urlparse.urlparse, the return type is immutable (a named tuple). This is really inconvenient, because one of the most common (IMO) use cases for urlparse is to parse a URL, make an adjustment, and then unparse it. Currently, something like this is required:

    import urlparse

    url = list(urlparse.urlparse('http://www.example.com/foo/bar?hehe=haha'))
    url[1] = 'python.com'
    new_url = urlparse.urlunparse(url)

I think this is really clunky. Moving to a mutable return type is challenging because (to my knowledge) there are no types that are both mutable and compatible with tuple. This patch removes the inheritance from namedtuple and attempts to add the necessary methods to make it backwards compatible. Does anyone know of a better way to do this? It would be nice if there were a namedlist type that acted like namedtuple but was mutable.

With these updates, urlparse can be used as follows:

    import urlparse

    url = list(urlparse.urlparse('http://www.example.com/foo/bar?hehe=haha'))
    url.netloc = 'www.python.com'
    urlparse.urlunparse(url)

I think this is much better. Let me know if you disagree...

Also, I ran the script through autopep8 because it was messy.

Also, I'm not sure if I'm supposed to duplicate this patch over to Python 3. I can do that if necessary.
msg169475 - (view) Author: mastahyeti (mastahyeti) Date: 2012-08-30 18:21
TYPO!!!

After my patch, urlparse can be used as such:

    import urlparse

    url = urlparse.urlparse('http://www.example.com/foo/bar?hehe=haha')
    url.netloc = 'www.python.com'
    urlparse.urlunparse(url)


The difference is that the result doesn't need to be cast to a list in order to be mutated...
msg169476 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-08-30 18:21
This is a new feature, so it can't go in 2.7.
msg169477 - (view) Author: mastahyeti (mastahyeti) Date: 2012-08-30 18:24
This is my first patch for python. Is there a feature freeze? Does it
need to go in Python3? Thanks.

msg169478 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-08-30 18:25
I think the first step is probably to get consensus on whether this is desirable or not.  That might require a trip to python-ideas, or it might not :)

As for the patch itself, you should definitely *not* include any changes other than the ones you are proposing.  Otherwise reviewing the patch is very difficult.

As Ezio said, as a new feature this could only go into 3.4, so the patch should be against the default branch in the mercurial repository.
msg169479 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-08-30 18:38
On Thu, Aug 30, 2012 at 11:17 AM, mastahyeti <report@bugs.python.org> wrote:
>
> When parsing a url with urlparse.urlparse, the return type is non-mutable (named tuple). This is really inconvenient, because one of the most common (imop) use cases for urlparse is to parse a url, make an adjustment or change and then unparse it. Currently, something like this is required:

Not really. The namedtuple is a convenience, and working with it
may help you generate your target URL in a more meaningful way.
Also remember that we moved to namedtuple after concluding that it
is a more meaningful representation of the parsed result.  So, my
vote for this proposal is -1. And if you need to discuss strategies
for using it, you can ask over at python-help or related lists.
msg169482 - (view) Author: mastahyeti (mastahyeti) Date: 2012-08-30 18:58
Senthil,
Can you give an example of how namedtuple would be more convenient? It
is definitely more convenient than an ordinary tuple, but it's
inconvenient having its items not be assignable. As I showed in my
example above, it is usable as-is, but it is clunky. As David says
above, this obviously needs to be moved to another list for discussion
of whether the current behavior is desirable.

msg169483 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-08-30 19:03
Actually, Senthil is right.  What you want is the _replace method of namedtuple to satisfy your use case.
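
A minimal sketch of the _replace approach mentioned above, applied to the example URL from the original report. It uses the Python 3 module name urllib.parse (the thread's examples use the Python 2 urlparse module); the variable names are illustrative only.

```python
# Python 3 spelling; in the Python 2 examples above the module is `urlparse`.
from urllib.parse import urlparse

parts = urlparse('http://www.example.com/foo/bar?hehe=haha')

# _replace returns a new ParseResult with the given fields changed;
# `parts` itself is left untouched.
updated = parts._replace(netloc='www.python.com')

# geturl() reassembles the components into a URL string.
new_url = updated.geturl()
print(new_url)
```

No conversion to a list is needed, and the result is still a ParseResult, so attribute access such as updated.netloc keeps working.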
msg169485 - (view) Author: mastahyeti (mastahyeti) Date: 2012-08-30 19:10
I can live with that, it just seems that ordinary item assignment is
more pythonic....

msg169487 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-08-30 19:16
Not in this case.  We are treating the URL as an immutable object, so the Pythonic thing to do is create a new object of the same type with the change applied.  Similar to "abcd".replace('a', 'z') returning a new string.
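
The parallel with strings can be sketched as follows. This is an illustration only, written against Python 3's urllib.parse; the variable names are made up for the example.

```python
from urllib.parse import urlparse

# str.replace returns a new string; the original is unchanged.
text = "abcd"
new_text = text.replace('a', 'z')

# ParseResult._replace likewise returns a new object of the same type.
url = urlparse('http://www.example.com/foo/bar')
secure_url = url._replace(scheme='https')

# Direct attribute assignment is rejected, because the result is a tuple.
try:
    url.netloc = 'www.python.com'
    assignment_error = None
except AttributeError as exc:
    assignment_error = exc
```

In both cases the original value (text, url) can still be used after the change, which is the behavior being described here.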
msg169488 - (view) Author: mastahyeti (mastahyeti) Date: 2012-08-30 19:22
Hrmm. Okay. I concede.

msg230715 - (view) Author: Matthew Hall (mhcptg) Date: 2014-11-05 21:57
I don't think having to call a method with a weird secret underscored name to update a value in a URL named tuple is very elegant. Neither is creating a handful of pointless objects to make one simple validator function like the one I had to code today. I would urge some reconsideration of this, like a way to get back a named yet mutable object when needed, instead of trying to force everybody to do this one way which isn't always that great.

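import urlparse  # Python 2 module; in Python 3 this is urllib.parse

# NOTE: URI_SCHEMES is not defined in this snippet; it is assumed to be a
# collection of accepted scheme names defined elsewhere in the application.
def validate_url(url):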
    parts = urlparse.urlparse(url.strip())
    # scheme, netloc, path, params, query, fragment
    # XXX: preserve backward compatibility w/ old code
    if not parts.scheme:
        parts = parts._replace(scheme='http', netloc=parts.path.strip('/'), path='')

    # remove params, query, and fragment
    # params is nearly never used anywhere
    # (NOTE: it does NOT mean the stuff after '?')
    # it actually means this http://domain/page.py;param1=foo?query1=bar
    # query and fragment are used but aren't helpful for our application
    parts = parts._replace(params='', query='', fragment='')

    if parts.scheme not in URI_SCHEMES:
        raise ValueError('scheme=%s is not valid' % parts.scheme)
    if '.' not in parts.netloc:
        raise ValueError('location=%s does not contain a domain' % parts.netloc)

    if len(parts.path) and not parts.path.startswith('/'):
        raise ValueError('path=%s appears invalid' % parts.path)
    elif not parts.path:
        parts = parts._replace(path='/')

    validated_url = parts.geturl()
    return validated_url, parts
msg230718 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-11-05 22:18
Think of it as immutable like a string is immutable.  The cases are exactly parallel (the string function is of course named 'replace' since it doesn't have to deal with the 'arbitrary attribute names' problem namedtuple does), except that it is much easier to address the parts of a url using the namedtuple.

_replace is not a "weird secret method", it is part of the public API of namedtuple.  I agree that using '_' is unfortunate.  I would have preferred a name pattern like _replace_, to make it clearer that it is *not* a private method.
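
To illustrate the 'arbitrary attribute names' point, here is a small sketch using a hypothetical named tuple (Edit is not part of any library) whose own field is called "replace". Because the method is spelled _replace, the field and the method do not collide.

```python
from collections import namedtuple

# Hypothetical type for illustration: one of its fields is named "replace".
Edit = namedtuple('Edit', ['find', 'replace'])

edit = Edit(find='foo', replace='bar')

# The field is reached as a plain attribute...
field_value = edit.replace

# ...while the documented _replace method builds a modified copy.
changed = edit._replace(replace='baz')
```

Had the method been named replace, a field with that name could not be exposed as an attribute without shadowing it.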
History
Date                 User            Action  Args
2022-04-11 14:57:35  admin           set     github: 60028
2014-11-05 22:18:47  r.david.murray  set     messages: + msg230718
2014-11-05 21:57:40  mhcptg          set     nosy: + mhcptg; messages: + msg230715
2012-08-30 19:58:49  eric.araujo     set     nosy: + eric.araujo
2012-08-30 19:22:39  mastahyeti      set     messages: + msg169488
2012-08-30 19:16:46  r.david.murray  set     messages: + msg169487
2012-08-30 19:10:21  mastahyeti      set     messages: + msg169485
2012-08-30 19:03:40  r.david.murray  set     status: open -> closed; resolution: works for me; messages: + msg169483; stage: resolved
2012-08-30 18:58:07  mastahyeti      set     messages: + msg169482
2012-08-30 18:38:23  orsenthil       set     messages: + msg169479
2012-08-30 18:25:34  r.david.murray  set     nosy: + r.david.murray; messages: + msg169478; stage: needs patch -> (no value)
2012-08-30 18:24:57  mastahyeti      set     messages: + msg169477
2012-08-30 18:22:00  ezio.melotti    set     versions: + Python 3.4, - Python 2.7; nosy: + ezio.melotti, orsenthil; messages: + msg169476; stage: needs patch
2012-08-30 18:21:01  mastahyeti      set     messages: + msg169475
2012-08-30 18:17:44  mastahyeti      create