Redirect is not working correctly in urllib2 #58340

Closed
janik mannequin opened this issue Feb 26, 2012 · 7 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@janik
Mannequin

janik mannequin commented Feb 26, 2012

BPO 14132
Nosy @facundobatista, @gpshead, @orsenthil, @ezio-melotti, @karlcow, @vadmium
Files
  • urllib2_redirect_fix.patch: The possible bug fix
  • urllib2_redirect_fix.2.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-05-16.09:45:40.623>
    created_at = <Date 2012-02-26.15:15:31.398>
    labels = ['type-bug', 'library']
    title = 'Redirect is not working correctly in urllib2'
    updated_at = <Date 2020-01-03.21:29:39.575>
    user = 'https://bugs.python.org/janik'

    bugs.python.org fields:

    activity = <Date 2020-01-03.21:29:39.575>
    actor = 'ned.deily'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-05-16.09:45:40.623>
    closer = 'martin.panter'
    components = ['Library (Lib)']
    creation = <Date 2012-02-26.15:15:31.398>
    creator = 'janik'
    dependencies = []
    files = ['24647', '39501']
    hgrepos = []
    issue_num = 14132
    keywords = ['patch']
    message_count = 7.0
    messages = ['154356', '154357', '183577', '243875', '244080', '265514', '265683']
    nosy_count = 8.0
    nosy_names = ['facundobatista', 'gregory.p.smith', 'orsenthil', 'ezio.melotti', 'karlcow', 'python-dev', 'martin.panter', 'janik']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue14132'
    versions = ['Python 2.7', 'Python 3.5', 'Python 3.6']

    @janik
    Mannequin Author

    janik mannequin commented Feb 26, 2012

    When the server sends only a query string as the redirect URL, urllib2 redirects to an incorrect address.

    The error occurs on the page http://kniznica.uniza.sk/opac. The server sends only the query string part of the URI in the Location header (i.e. ?fs=04D07295D4434730A51C95A9F1727373&fn=main). The path is then incorrectly stripped from the original URL, and urllib2 redirects to http://kniznica.uniza.sk/?fs=04D07295D4434730A51C95A9F1727373&fn=main.

    The error was introduced by the fix for issue bpo-2464. I think the attached patch fixes the error (it works for me).

    @janik janik mannequin added type-bug An unexpected behavior, bug, or error stdlib Python modules in the Lib dir labels Feb 26, 2012
    @janik
    Mannequin Author

    janik mannequin commented Feb 26, 2012

    I forgot to mention that the correct url in the example would be http://kniznica.uniza.sk/opac?fs=04D07295D4434730A51C95A9F1727373&fn=main.
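
For reference, the expected target follows from ordinary relative-reference resolution: a reference that consists only of a query keeps the path of the base URL. A minimal check with `urllib.parse.urljoin` (Python 3 naming), with the session value replaced by a placeholder:

```python
from urllib.parse import urljoin

base = "http://kniznica.uniza.sk/opac"
# Placeholder query value; the real server sends a session-specific token.
location = "?fs=PLACEHOLDER&fn=main"

# A query-only reference is resolved against the base path, not against "/".
resolved = urljoin(base, location)
print(resolved)
```

The path `/opac` is preserved, so `urljoin` on its own is not the source of the wrong redirect.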

    @karlcow
    Mannequin

    karlcow mannequin commented Mar 6, 2013

    → curl -sI http://kniznica.uniza.sk/opac

    HTTP/1.1 302 Moved Temporarily
    Date: Wed, 06 Mar 2013 03:23:06 GMT
    Server: Indy/9.0.50
    Content-Type: text/html
    Location: ?fs=C79F09C9F1304E7AA4FF7C211BEA2B9B&fn=main

    → python3.3

    Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 01:25:11) 
    [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import urllib.parse
    >>> urllib.parse.urlparse("http://kniznica.uniza.sk/opac")
    ParseResult(scheme='http', netloc='kniznica.uniza.sk', path='/opac', params='', query='', fragment='')
    >>> urllib.parse.urlparse("?fs=C79F09C9F1304E7AA4FF7C211BEA2B9B&fn=main")
    ParseResult(scheme='', netloc='', path='', params='', query='fs=C79F09C9F1304E7AA4FF7C211BEA2B9B&fn=main', fragment='')

    Redirection is defined at
    http://hg.python.org/cpython/file/5e294202f93e/Lib/urllib/request.py#l643
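
The parse result above shows an empty `path` for the query-only Location value. The following is a minimal sketch of how forcing an empty path to "/" before joining would produce the URL reported in this issue. `resolve_redirect_old` is a hypothetical helper written for illustration, modelled on the behaviour described here, not a copy of the stdlib code:

```python
from urllib.parse import urljoin, urlparse, urlunparse


def resolve_redirect_old(request_url, location):
    """Hypothetical helper: mimic the pre-fix behaviour described in this issue."""
    parts = list(urlparse(location))
    # Assumption: any empty path is rewritten to "/", even when the
    # Location value is only a query string with no host.
    if not parts[2]:
        parts[2] = "/"
    # "/?query" is now an absolute path, so the original "/opac" is lost.
    return urljoin(request_url, urlunparse(parts))


old = resolve_redirect_old(
    "http://kniznica.uniza.sk/opac",
    "?fs=C79F09C9F1304E7AA4FF7C211BEA2B9B&fn=main",
)
print(old)
```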

    @vadmium
    Member

    vadmium commented May 23, 2015

    The proposed patch looks good to me. A test case would be nice though.

    Also I wonder why the “malformed URL” logic needs to be in urllib.request. Surely it either belongs in urljoin(), or in the underlying http.client. That needs more thought, but either way the current patch is a definite improvement.

    @vadmium
    Member

    vadmium commented May 26, 2015

    urllib2_redirect_fix.2.patch adds a test.

    I was tempted to remove the whole block of code setting the path to “/”, but there is one minor disadvantage: if a redirect points to a so-called “malformed” URL without any path component, like “http://example.net” or “http://example.net?query”, geturl() would return this URL verbatim.
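
The trade-off can be sketched as follows, under the assumption that the retained block rewrites an empty path to "/" only when the redirect target names a host. The thread does not quote the patch itself, so `normalize_redirect` is a hypothetical helper and the condition is an assumption:

```python
from urllib.parse import urljoin, urlparse, urlunparse


def normalize_redirect(request_url, location):
    """Hypothetical helper: add "/" only to host-bearing URLs that lack a path."""
    parts = urlparse(location)
    # Assumed condition: leave query-only references alone so that
    # urljoin() can resolve them against the original path.
    if not parts.path and parts.netloc:
        parts = parts._replace(path="/")
    return urljoin(request_url, urlunparse(parts))


base = "http://kniznica.uniza.sk/opac"
print(normalize_redirect(base, "http://example.net?query"))
print(normalize_redirect(base, "http://example.net"))
print(normalize_redirect(base, "?fs=PLACEHOLDER&fn=main"))
```

With this condition, the two path-less absolute URLs gain a "/" path, while the query-only redirect keeps `/opac`.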

    @vadmium
    Member

    vadmium commented May 14, 2016

    I will try to commit this soon

    @python-dev
    Mannequin

    python-dev mannequin commented May 16, 2016

    New changeset 52a7f580580c by Martin Panter in branch '3.5':
    Issue bpo-14132: Fix redirect handling when target is just a query string
    https://hg.python.org/cpython/rev/52a7f580580c

    New changeset 789a3f87bde1 by Martin Panter in branch '2.7':
    Issue bpo-14132: Fix redirect handling when target is just a query string
    https://hg.python.org/cpython/rev/789a3f87bde1

    New changeset 841a9a3f3cf6 by Martin Panter in branch 'default':
    Issue bpo-14132, Issue bpo-17214: Merge two redirect handling fixes from 3.5
    https://hg.python.org/cpython/rev/841a9a3f3cf6

    @vadmium vadmium closed this as completed May 16, 2016
    @Joony898i Joony898i mannequin added topic-XML and removed stdlib Python modules in the Lib dir labels Jan 3, 2020
    @Joony898i Joony898i mannequin changed the title Redirect is not working correctly in urllib2 SEO Services Development & PHP development Jan 3, 2020
    @Joony898i Joony898i mannequin added performance Performance or resource usage and removed type-bug An unexpected behavior, bug, or error labels Jan 3, 2020
    @orsenthil orsenthil changed the title SEO Services Development & PHP development Redirect is not working correctly in urllib2 Jan 3, 2020
    @ned-deily ned-deily added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error and removed topic-XML performance Performance or resource usage labels Jan 3, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022