classification
Title: urlparse "caches" parses regardless of encoding
Type:
Stage:
Components: Unicode
Versions: Python 2.4, Python 2.5

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: alexandre.vassalotti, kkinder, lemburg, palfrey
Priority: normal
Keywords:

Created on 2005-10-04 17:57 by kkinder, last changed 2007-12-13 17:58 by alexandre.vassalotti. This issue is now closed.

Messages (4)
msg26504 - Author: Ken Kinder (kkinder) Date: 2005-10-04 17:57
The issue can be summarized with this code:

>>> urlparse.urlparse(u'http://www.python.org/doc')
(u'http', u'www.python.org', u'/doc', '', '', '')
>>> urlparse.urlparse('http://www.python.org/doc')
(u'http', u'www.python.org', u'/doc', '', '', '')

Once the urlparse library has "cached" a URL, it stores
the resulting value of that cache regardless of
datatype. Notice that in the second use of urlparse, I
passed it a STRING and got back a UNICODE object.

This can be quite confusing when, as a developer, you
think you've already encoded all your objects, you use
urlparse, and all of a sudden you have unicode objects
again, when you expected to have strings.
msg26505 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2007-01-13 19:25
Unassigning: I don't use urlparse, so can't comment.
msg58345 - Author: Tom Parker (palfrey) Date: 2007-12-10 13:35
Also affects Python 2.5.1 (tested on Debian python2.5 package version
2.5.1-5).
msg58541 - Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2007-12-13 17:58
Fixed in r59480.
History
Date                 User                  Action  Args
2007-12-13 17:58:55  alexandre.vassalotti  set     status: open -> closed
                                                   nosy: + alexandre.vassalotti
                                                   resolution: fixed
                                                   messages: + msg58541
2007-12-10 13:35:10  palfrey               set     nosy: + palfrey
                                                   messages: + msg58345
                                                   versions: + Python 2.5
2005-10-04 17:57:39  kkinder               create