http_error_302() crashes with 'HTTP/1.1 400 Bad Request #41628

Closed
pristine777 mannequin opened this issue Feb 27, 2005 · 9 comments


@pristine777
Mannequin

pristine777 mannequin commented Feb 27, 2005

BPO 1153027
Nosy @orsenthil, @vstinner, @devdanzin
Files
  • 302_with_spaces.diff: Replace spaces with "%20"
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/orsenthil'
    closed_at = <Date 2009-05-05.18:42:12.480>
    created_at = <Date 2005-02-27.20:16:58.000>
    labels = ['easy', 'type-bug', 'library']
    title = "http_error_302() crashes with 'HTTP/1.1 400 Bad Request"
    updated_at = <Date 2019-04-10.10:19:19.597>
    user = 'https://bugs.python.org/pristine777'

    bugs.python.org fields:

    activity = <Date 2019-04-10.10:19:19.597>
    actor = 'vstinner'
    assignee = 'orsenthil'
    closed = True
    closed_date = <Date 2009-05-05.18:42:12.480>
    closer = 'orsenthil'
    components = ['Library (Lib)']
    creation = <Date 2005-02-27.20:16:58.000>
    creator = 'pristine777'
    dependencies = []
    files = ['12989']
    hgrepos = []
    issue_num = 1153027
    keywords = ['easy']
    message_count = 9.0
    messages = ['60681', '60682', '60683', '81430', '81474', '81491', '87269', '87271', '339843']
    nosy_count = 8.0
    nosy_names = ['jhylton', 'jepler', 'jjlee', 'pristine777', 'robzed', 'orsenthil', 'vstinner', 'ajaksu2']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue1153027'
    versions = ['Python 2.6', 'Python 2.7']

    @pristine777
    Mannequin Author

    pristine777 mannequin commented Feb 27, 2005

    I was able to reach a website with both IE and
    Firefox, but my Python code kept getting an HTTP 400 Bad
    Request error. To debug, I called set_http_debuglevel(1) as
    in the following code:

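    import urllib2

    hh = urllib2.HTTPHandler()
    hh.set_http_debuglevel(1)  # print the HTTP traffic for debugging
    # self.cj is the cookie jar held by the surrounding class
    opener = urllib2.build_opener(hh, urllib2.HTTPCookieProcessor(self.cj))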

    The printed debug messages show that this crash
    happens when there is a space in the redirected
    location. Here's a cut-and-paste of the relevant debug
    messages (note the line starting with "send", which is
    the request that http_error_302 issues):

    reply: 'HTTP/1.1 302 Moved Temporarily\r\n'
    header: Connection: close
    header: Date: Sun, 27 Feb 2005 19:52:51 GMT
    header: Server: Microsoft-IIS/6.0
    <---other header data-->
    send: 'GET /myEmail/User?asOf=02/26/2005 11:38:12 PM&ddn=87cb51501730
    <---remaining header data-->
    reply: 'HTTP/1.1 400 Bad Request\r\n'
    header: Content-Type: text/html
    header: Date: Sun, 27 Feb 2005 19:56:45 GMT
    header: Connection: close
    header: Content-Length: 20

    To fix this, I first tried to encode the redirected location
    in the function http_error_302() in urllib2 using the
    methods urllib.quote and urllib.urlencode but to no avail
    (they encode other data as well).
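A minimal sketch of why plain quoting over-encodes, written against Python 3, where quote lives in urllib.parse (an assumption; the report itself uses Python 2's urllib.quote). The path is the one from the debug output above, and the SAFE set is a hypothetical choice for illustration, not what urllib2 uses:

```python
from urllib.parse import quote

# Redirect target from the debug log, with its literal spaces.
location = "/myEmail/User?asOf=02/26/2005 11:38:12 PM&ddn=87cb51501730"

# By default only "/" is treated as safe, so the "?", "=", "&" and ":"
# delimiters are escaped along with the spaces.
over_quoted = quote(location)

# Hypothetical safe set: keep URL delimiters and "%" (so existing escapes
# are not escaped a second time) and let everything else be quoted.
SAFE = "/?:@&=+$,;%#"
fixed = quote(location, safe=SAFE)

print(over_quoted)
print(fixed)
```

With the wider safe set only the spaces change, which matches the effect of the replace(' ', '%20') workaround while also covering other characters that are not allowed in a URL.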

    A temporary solution that works is to replace any space
    in the redirected URL with '%20'. Below is a snippet of the
    function http_error_302 in urllib2 with this suggested fix:

    def http_error_302(self, req, fp, code, msg, headers):
        # Some servers (incorrectly) return multiple Location headers
        # (so probably same goes for URI).  Use first header.
        if 'location' in headers:
            newurl = headers.getheaders('location')[0]
        elif 'uri' in headers:
            newurl = headers.getheaders('uri')[0]
        else:
            return
        # <<< TEMP FIX - inserting this line temporarily fixes this problem
        newurl = newurl.replace(' ', '%20')
        newurl = urlparse.urljoin(req.get_full_url(), newurl)
        # <--- remainder of this function -->

    Thanks!

    @pristine777 pristine777 mannequin added stdlib Python modules in the Lib dir labels Feb 27, 2005
    @jepler
    Mannequin

    jepler mannequin commented Mar 1, 2005

    When the server sends the 302 response with 'Location:
    http://example.com/url%20with%20whitespace', urllib2 seems
    to work just fine.

    Based on my reading of RFC 2396, I believe that a URL that
    contains spaces must contain quoted spaces (%20), not literal
    spaces, because space is not an "unreserved character" [2.3] and
    "[d]ata must be escaped if it does not have a representation
    using an unreserved character" [2.4].
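As a rough illustration of that reading, the check below flags a URL that contains characters outside the reserved and unreserved sets of RFC 2396. The helper name needs_escaping is hypothetical, and the character class is a simplification: it accepts "%" and "#" anywhere and does not verify that "%" is followed by two hex digits.

```python
import re

# Alphanumerics, the RFC 2396 "mark" characters, the reserved characters,
# plus "%" (escape introducer) and "#" (fragment delimiter).
_ALLOWED = re.compile(r"^[A-Za-z0-9\-_.!~*'();/?:@&=+$,%#]*$")


def needs_escaping(url):
    """Return True if url contains a character that should have been escaped."""
    return _ALLOWED.match(url) is None


print(needs_escaping("http://example.com/url with whitespace"))      # True
print(needs_escaping("http://example.com/url%20with%20whitespace"))  # False
```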

    @jjlee
    Mannequin

    jjlee mannequin commented May 19, 2005

    Sure, but if Firefox and IE do it, probably we should do the
    same.

    I think cookielib.escape_path(), or something similar
    (perhaps without the case normalisation) is probably the
    right thing to do. That's not part of any documented API; I
    suppose that function or a similar one should be added to
    module urlparse, and used by urllib2 and urllib when
    redirecting.
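One way such a helper could look, sketched in Python 3 with urllib.parse (an assumption; the thread concerns Python 2's urllib2 and urlparse). The name escape_redirect_url and the safe sets are hypothetical; they are modelled loosely on the idea behind cookielib.escape_path() without the case normalisation, and are not the fix that was eventually committed:

```python
from urllib.parse import quote, urljoin, urlsplit, urlunsplit

# Hypothetical safe sets: keep delimiters and existing "%" escapes intact.
_PATH_SAFE = "/%;:@&=+$,!~*'()"
_QUERY_SAFE = _PATH_SAFE + "?"


def escape_redirect_url(base_url, location):
    """Resolve a Location header against base_url and escape unsafe characters."""
    absolute = urljoin(base_url, location)
    parts = urlsplit(absolute)
    path = quote(parts.path, safe=_PATH_SAFE)
    query = quote(parts.query, safe=_QUERY_SAFE)
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


redirected = escape_redirect_url(
    "http://example.com/login",
    "/myEmail/User?asOf=02/26/2005 11:38:12 PM&ddn=87cb51501730",
)
print(redirected)
```

Because "%" is in the safe set, a Location value that the server already escaped correctly passes through unchanged, so the helper is safe to apply to every redirect.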

    @devdanzin
    Mannequin

    devdanzin mannequin commented Feb 9, 2009

    As always with urllib, the fix is trivial but adding a test is hard.

    @robzed
    Mannequin

    robzed mannequin commented Feb 9, 2009

    Appears to be the same as bpo-918368

    @jjlee
    Mannequin

    jjlee mannequin commented Feb 9, 2009

    This bug refers to urllib2. bpo-918368 refers to urllib. It's the
    same problem in each case, though.

    @devdanzin devdanzin mannequin added type-bug An unexpected behavior, bug, or error labels Feb 12, 2009
    @devdanzin devdanzin mannequin added easy labels Apr 22, 2009
    @orsenthil orsenthil self-assigned this May 5, 2009
    @orsenthil
    Member

    fixed in revision 43132 ( smaller 'r' for the roundup to auto-hyperlink). :)

    @orsenthil
    Member

    Sorry, I meant fixed in revision 72351.

    @vstinner
    Member

    > Sorry, I meant fixed in revision 72351.

    Commit in Git:

    commit 690ce9b
    Author: Senthil Kumaran <orsenthil@gmail.com>
    Date: Tue May 5 18:41:13 2009 +0000

    Fix for bpo-1153027, making Py3k changes similar to fix in bpo-918368.
    This will address:
    a) urllib/ in py3k,
    b) urllib in py2x is addressed by bpo-918368.
    c) urllib2 in py2x was already addressed in Revision 43132.
    

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022