Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cookielib doesn't handle URLs with / in parameters #47954

Closed
andyk mannequin opened this issue Aug 27, 2008 · 25 comments
Closed

cookielib doesn't handle URLs with / in parameters #47954

andyk mannequin opened this issue Aug 27, 2008 · 25 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@andyk
Copy link
Mannequin

andyk mannequin commented Aug 27, 2008

BPO 3704
Nosy @gpshead, @orsenthil, @bitdancer
Files
  • cookielib.py: cookielib.py from 2.5.2 with my suggested fix
  • cookielib.diff: a diff generated from the above file
  • cookielib-querystring-fix.patch: patch with the fix and a unit test
  • issue3704-better_request_path.patch: Make request_path give us what we really want
  • issue3704.patch: request_path return value never includes query
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = <Date 2010-07-25.20:00:15.971>
    created_at = <Date 2008-08-27.17:01:40.333>
    labels = ['type-bug', 'library']
    title = "cookielib doesn't handle URLs with / in parameters"
    updated_at = <Date 2010-07-25.20:00:15.970>
    user = 'https://bugs.python.org/andyk'

    bugs.python.org fields:

    activity = <Date 2010-07-25.20:00:15.970>
    actor = 'gregory.p.smith'
    assignee = 'gregory.p.smith'
    closed = True
    closed_date = <Date 2010-07-25.20:00:15.971>
    closer = 'gregory.p.smith'
    components = ['Library (Lib)']
    creation = <Date 2008-08-27.17:01:40.333>
    creator = 'andyk'
    dependencies = []
    files = ['11271', '11282', '11283', '17187', '17285']
    hgrepos = []
    issue_num = 3704
    keywords = ['patch', 'needs review']
    message_count = 25.0
    messages = ['72035', '72066', '72724', '72726', '74824', '74827', '104803', '105407', '105424', '105433', '105458', '105459', '105460', '105500', '105501', '105535', '105549', '110684', '110799', '110806', '110837', '110944', '110984', '111463', '111556']
    nosy_count = 7.0
    nosy_names = ['gregory.p.smith', 'jjlee', 'tseaver', 'orsenthil', 'andyk', 'r.david.murray', 'BreamoreBoy']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = 'patch review'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue3704'
    versions = ['Python 3.1', 'Python 2.7']

    @andyk
    Copy link
    Mannequin Author

    andyk mannequin commented Aug 27, 2008

    cookielib doesn't handle URLs like "http://server/script?
    err=/base/error.html&ok=/base/ok.html", as
    CookieJar::_cookie_from_cookie_tuple uses rfind("/") to strip off the
    end of the URL, returning "http://server/script?
    err=/base/error.html&okc=/base" instead of "http://server/script".

    My suggested fix (attached, line 1465-1468) is to first strip off
    anything after "?" if present, then continue as with existing code.

    @andyk andyk mannequin added the type-bug An unexpected behavior, bug, or error label Aug 27, 2008
    @gpshead
    Copy link
    Member

    gpshead commented Aug 28, 2008

    attached is a patch with the suggested fix along with a unit test.

    @gpshead gpshead self-assigned this Aug 28, 2008
    @orsenthil
    Copy link
    Member

    The patch and tests look fine to me, Gregory. I verified it against the
    trunk. Should not we have it for py3k as well?

    @gpshead
    Copy link
    Member

    gpshead commented Sep 7, 2008

    yep it applies to all releases. anyways, it won't make 2.6/3.0 but it
    can be put into 2.5.3/2.6.1/3.0.1.

    @jjlee
    Copy link
    Mannequin

    jjlee mannequin commented Oct 15, 2008

    Do we have an RFC 3986 URI parser in the stdlib now? It would be better
    to use that if so, but I don't see one. Failing that, an implementation
    of the relevant part of that RFC is only about four lines of code, so
    that would be better than naively looking for "?" (request_path should
    probably be changed at the same time).

    I'll try and add a patch that does that and check what Firefox does to
    see if I agree it's a bug.

    @orsenthil
    Copy link
    Member

    John, bpo-3647 tries relative url parsing and joins to be RFC3986
    compliance.

    @devdanzin devdanzin mannequin added the stdlib Python modules in the Lib dir label Feb 13, 2009
    @tseaver
    Copy link

    tseaver commented May 2, 2010

    I can confirm that the patch applies cleanly to the release26-maint
    branch, and that the updated test fails without the updated
    implementation.

    However, the entire approach seems wrong to me: the patched method
    has just called 'request_path', which already cracked the URL using
    urlparse.urlparse and put it back together again with
    urlparse.urlunparse.

    A better fix would be either to call urlparse.urlparse directly, or
    else to add an argument to 'request_path' which kept it from doing the
    unwanted reassembly.

    Attaching a patch which implements the latter strategy. Greg's new
    test passes with this patch applied in place of the part of his patch
    which touches Lib/cookielib.py.

    @jjlee
    Copy link
    Mannequin

    jjlee mannequin commented May 9, 2010

    It looks to me that it's just request_path that's wrong, so no need to add extra arguments to that function. It should discard the query and fragment (still keeping the "parameters" -- using urlparse.urlsplit instead of urlparse.urlparse would make that simpler).

    request_path is only called in three places:

    • We're agreed that the default cookie path should omit the query (and fragment)
    • Netscape cookies aren't checked for path on setting cookies (.set_ok_path()), so the value of request_path isn't checked in that case
    • Netscape cookies are checked for path on returning cookies, but including the query & fragment will never make a difference to the .startswith check in .path_return_ok()

    Finally, even RFC 2965, which nobody cares about, and which does include the path check on setting the cookie, refers to RFC 2396 for the definition of request-URI, and both RFC 2396 and RFC 3986, which obsoletes it, agree that the path doesn't include the query (nor the fragment).

    Incidentally: the request_path function docstring claims to return the request-URI, but obviously the docstring should say it returns the path component of the request-URI.

    @tseaver
    Copy link

    tseaver commented May 10, 2010

    As long as we don't care about preserving backward compatibility, we
    could indeed just change the behavior of 'request_path'. It isn't
    documented as an API of the cookielib module, but it does have a
    docstring which promises certain semantics.

    @jjlee
    Copy link
    Mannequin

    jjlee mannequin commented May 10, 2010

    I'll upload a patch when I'm back home (bugs.python.org went down yesterday).

    Will turn docstring into comment.

    @jjlee
    Copy link
    Mannequin

    jjlee mannequin commented May 10, 2010

    Just re-read your comment, Tres. Since when do docstrings determine whether a stdlib function is public? If it's documented in the docs, it's public. If not, it's not. This function isn't, so it's not public. It's also not in __all__, FWLTW.

    @jjlee
    Copy link
    Mannequin

    jjlee mannequin commented May 10, 2010

    Didn't bother changing docstring to comment, since that would be inconsistent with rest of module.

    @jjlee
    Copy link
    Mannequin

    jjlee mannequin commented May 10, 2010

    FWIW, the "certain semantics" that request_path "promises" are 1. that it returns the RFC 2965 request-URI (which has never been true -- it returns the path component of the request-URI instead) and 2. that that request-URI is as defined in RFC 2965, and this bug is about fixing the function so that that's true for the case where the URI has a query component. So there's absolutely no reason for not changing the function.

    @bitdancer
    Copy link
    Member

    There is a reason, and that is that it may break existing code in the field relying on the current behavior. This is (unfortunately) true regardless of whether the function is public or private, though the fact that it is ostensibly private is likely to reduce the amount of breakage. The fix needs to be evaluated to try to guess whether breakage is likely, and if it is, it would not be a candidate for backport to 2.6 or 3.1.

    @bitdancer
    Copy link
    Member

    Hmm. I didn't read your comment carefully enough before I replied. I think you are saying that the bug fix is confined to the routine in question and doesn't change even its API, in which case the nature of the function doesn't come in to it at all. The fix still needs to be evaluated for backport, though, since it is a behavior change. (For example, in another issue Senthil fixed parsing of unknown schemes to be RFC compliant, but we may need to back that fix out of 2.6 since it changes behavior and will most likely break currently working 2.6 code).

    @jjlee
    Copy link
    Mannequin

    jjlee mannequin commented May 11, 2010

    What specific breakage do you expect resulting from my patch being backported?

    There is no behaviour change here, except to the minimal extent that all bug fixes involve behaviour change. This seems a clear-cut backport candidate.

    It's not a surprise to me that changing module urlparse to be compliant with RFC 3986 breaks code -- I suggested adding a new module instead for that reason.

    @bitdancer
    Copy link
    Member

    I don't expect anything; I had written that it looked OK to me but apparently I accidentally deleted that text before posting. But I'm not someone who has ever programmed using cookielib so I wouldn't expect my opinion to count for too much.

    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Jul 18, 2010

    Could we please have an update from people who have been involved on this issue as to whether it can be taken forward, closed as to no longer relevant, or whatever.

    @jjlee
    Copy link
    Mannequin

    jjlee mannequin commented Jul 19, 2010

    My patch should be applied.

    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Jul 19, 2010

    Patched the unit test, then ran the test before applying the fix which failed, after applying the fix the test ran successfully. Tested on Windows Vista 32 bit against 2.7 maintainance release. The patches are short and sweet, I see no reason why they can't go forward.

    @gpshead
    Copy link
    Member

    gpshead commented Jul 19, 2010

    jjlee's bpo-3704.patch has been committed to py3k (3.2) in r82985. It could still use backporting to 2.6, 2.7 and 3.1.

    @jjlee
    Copy link
    Mannequin

    jjlee mannequin commented Jul 20, 2010

    Hmm, I never tested with Python 3, though I assume the forward-port was straightforward. The patch was created against (2.x) trunk, so indeed it should be committed there also.

    Deselecting 2.6 since I assume no more maintenance backports will be made to 2.6 aside from security fixes -- right?

    @bitdancer
    Copy link
    Member

    Until 2.6.6 is released bug fixes can still be backported to 2.6 but it is at the committer's option. Most likely this one won't be.

    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Jul 24, 2010

    Can we have the patch committed to 2.7 and 3.1 please.

    @gpshead
    Copy link
    Member

    gpshead commented Jul 25, 2010

    release27-maint r83145
    release31-maint r83146
    release26-maint r83147

    @gpshead gpshead closed this as completed Jul 25, 2010
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants