wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO #70904

Closed
asottile mannequin opened this issue Apr 8, 2016 · 10 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@asottile
Mannequin

asottile mannequin commented Apr 8, 2016

BPO 26717
Nosy @vadmium, @asottile
Files
  • patch
  • patch
  • patch
  • patch
  • simple_server.py.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-04-17.08:23:34.103>
    created_at = <Date 2016-04-08.20:48:05.576>
    labels = ['type-bug', 'library']
    title = 'wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO'
    updated_at = <Date 2016-04-20.14:34:55.073>
    user = 'https://github.com/asottile'

    bugs.python.org fields:

    activity = <Date 2016-04-20.14:34:55.073>
    actor = 'Anthony Sottile'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-04-17.08:23:34.103>
    closer = 'martin.panter'
    components = ['Library (Lib)']
    creation = <Date 2016-04-08.20:48:05.576>
    creator = 'Anthony Sottile'
    dependencies = []
    files = ['42402', '42403', '42404', '42405', '42531']
    hgrepos = []
    issue_num = 26717
    keywords = ['patch']
    message_count = 10.0
    messages = ['263043', '263044', '263048', '263050', '263054', '263055', '263056', '263596', '263818', '263844']
    nosy_count = 4.0
    nosy_names = ['python-dev', 'martin.panter', 'Anthony Sottile', 'Александр Эри']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue26717'
    versions = ['Python 3.5', 'Python 3.6']

    @asottile
    Mannequin Author

    asottile mannequin commented Apr 8, 2016

    Patch attached with test.

    In summary:

    A request to the url b'/\x80' appears to the application as a request to b'\xc2\x80' -- The issue being the latin1 decoded PATH_INFO is re-encoded as UTF-8 and then decoded as latin1

    (on the wire) b'\x80' -(decode latin1)-> u'\x80' -(encode utf-8)-> b'\xc2\x80' -(decode latin1)-> b'\xc2\x80'

    My patch cuts out the encode(utf-8)->decode(latin1)

    @asottile asottile mannequin added the stdlib Python modules in the Lib dir label Apr 8, 2016
    @asottile
    Mannequin Author

    asottile mannequin commented Apr 8, 2016

    A few typos in my previous comment, pressed enter too quickly, here's an updated comment:

    Patch attached with test.

    In summary:

    A request to the url b'/\x80' appears to the application as a request to b'/\xc2\x80' -- The issue being the latin1 decoded PATH_INFO is re-encoded as UTF-8 and then decoded as latin1

    (on the wire) b'\x80' -(decode latin1)-> u'\x80' -(encode utf-8)-> b'\xc2\x80' -(decode latin1)-> u'\xc2\x80'
    

    My patch cuts out the encode(utf-8)->decode(latin1):

    (on the wire) b'\x80' -(decode latin1) -> u'\x80'
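The chain described above can be reproduced with plain `bytes`/`str` conversions. This is only a sketch of the encoding steps, not the wsgiref code itself; the function names `buggy_path_info` and `direct_path_info` are hypothetical and exist only for illustration.

```python
def buggy_path_info(wire_path: bytes) -> str:
    """Reproduce the extra encode(utf-8) -> decode(latin1) step from the report."""
    text = wire_path.decode("latin-1")      # b'/\x80'     -> '/\x80'
    reencoded = text.encode("utf-8")        # '/\x80'      -> b'/\xc2\x80'
    return reencoded.decode("latin-1")      # b'/\xc2\x80' -> '/\xc2\x80'


def direct_path_info(wire_path: bytes) -> str:
    """Decode the wire bytes as latin1 once, with no UTF-8 detour."""
    return wire_path.decode("latin-1")


if __name__ == "__main__":
    # ascii() makes the non-printable characters visible.
    print(ascii(buggy_path_info(b"/\x80")))
    print(ascii(direct_path_info(b"/\x80")))
```

The buggy variant yields a two-character tail (`\xc2\x80`) where the application expects the single character `\x80`.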
    

    @asottile
    Mannequin Author

    asottile mannequin commented Apr 8, 2016

    Oops, broke b'/%80'.

    Here's a better fix that now takes:

    (on the wire) b'\x80' -(decode latin1)-> u'\x80' -(encode utf-8)-> b'\xc2\x80' -(decode latin1)-> u'\xc2\x80'
    

    to:

    (on the wire) b'\x80' -(decode latin1)-> u'\x80' -(encode latin1) -> b'\x80' -(decode latin1)-> u'\x80'
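A rough sketch of the latin1 round trip, assuming the percent-unquoting step works on bytes via `urllib.parse.unquote_to_bytes`. The attached patch itself is not shown in this thread, so this may not match it; `path_info_roundtrip` is a hypothetical name.

```python
from urllib.parse import unquote_to_bytes


def path_info_roundtrip(raw_path: str) -> str:
    """Unquote a latin1-decoded request path without a UTF-8 detour.

    raw_path is the request path as a str that was decoded from the
    wire bytes using latin1 (so every character is below U+0100).
    """
    wire_bytes = raw_path.encode("latin-1")      # recover the original bytes
    unquoted = unquote_to_bytes(wire_bytes)      # resolve %XX escapes on bytes
    return unquoted.decode("latin-1")            # back to a latin1-decoded str


if __name__ == "__main__":
    # Both the raw byte and its percent-encoded form give the same result.
    print(ascii(path_info_roundtrip("/\x80")))
    print(ascii(path_info_roundtrip("/%80")))
```

Encoding with latin1 before unquoting matters because `unquote_to_bytes` encodes a `str` argument as UTF-8, which would reintroduce the `\xc2` byte.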
    

    @vadmium
    Member

    vadmium commented Apr 8, 2016

    I was going to say your original fix was the reverse of a change in r86146. But you seem to be fixing the problems before I express them :)

    For the fix, I would suggest something like unquote(path, "latin-1"), which would be simpler. I left some other review comments about the tests.
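To illustrate the suggestion, here is a small sketch of `urllib.parse.unquote` with an explicit `"latin-1"` encoding, compared with its default of UTF-8. The wrapper name `wsgi_path_info` is hypothetical; only `unquote` itself comes from the standard library.

```python
from urllib.parse import unquote


def wsgi_path_info(path: str) -> str:
    """Unquote a latin1-decoded request path, keeping one char per byte."""
    return unquote(path, "latin-1")


if __name__ == "__main__":
    # Percent-encoded and raw forms of byte 0x80 both map to '\x80'.
    print(ascii(wsgi_path_info("/%80")))
    print(ascii(wsgi_path_info("/\x80")))
    # With the default encoding (UTF-8, errors="replace") the lone
    # 0x80 byte is not valid and becomes U+FFFD instead.
    print(ascii(unquote("/%80")))
```

Because latin1 maps each byte to exactly one code point, the application can still recover the original bytes afterwards with `.encode("latin-1")`.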

    @vadmium vadmium added the type-bug An unexpected behavior, bug, or error label Apr 8, 2016
    @asottile
    Mannequin Author

    asottile mannequin commented Apr 9, 2016

    Updates after review.

    @vadmium
    Member

    vadmium commented Apr 9, 2016

    Thanks, this version looks pretty good to me.

    @asottile
    Mannequin Author

    asottile mannequin commented Apr 9, 2016

    Forgot to remove the pyver code (leaning a bit too much on pre-commit)

    @python-dev
    Mannequin

    python-dev mannequin commented Apr 17, 2016

    New changeset 1f2cfcd5a83f by Martin Panter in branch '3.5':
    Issue bpo-26717: Stop encoding Latin-1-ized WSGI paths with UTF-8
    https://hg.python.org/cpython/rev/1f2cfcd5a83f

    New changeset 815a4ac67e68 by Martin Panter in branch 'default':
    Issue bpo-26717: Merge wsgiref fix from 3.5
    https://hg.python.org/cpython/rev/815a4ac67e68

    @vadmium vadmium closed this as completed Apr 17, 2016
    @ghost

    ghost commented Apr 20, 2016

    Why does wsgiref use latin1? It must use utf-8.

    @asottile
    Copy link
    Mannequin Author

    asottile mannequin commented Apr 20, 2016

    PEP-3333 states that environ variables are str variables decoded using latin1:
    https://www.python.org/dev/peps/pep-3333/#id19

    Therefore, to get the original bytes, one must encode using latin1.
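On the application side, that recovery step might look like the sketch below. The helper names `original_path_bytes` and `decoded_path` are hypothetical, and the choice of UTF-8 as the default target encoding is an assumption; an application serving cp1252 URLs would pass that instead.

```python
def original_path_bytes(environ: dict) -> bytes:
    """Recover the bytes that were on the wire from a WSGI environ."""
    # PEP 3333 "native strings" hold one code point per original byte,
    # so encoding with latin1 is a lossless inverse.
    return environ["PATH_INFO"].encode("latin-1")


def decoded_path(environ: dict, encoding: str = "utf-8") -> str:
    """Decode the path using the encoding the application expects."""
    return original_path_bytes(environ).decode(encoding, "replace")


if __name__ == "__main__":
    # A UTF-8 path as the server would present it (latin1-decoded).
    environ = {"PATH_INFO": "/caf\xc3\xa9"}
    print(original_path_bytes(environ))
    print(decoded_path(environ))
    # The cp1252 case from the issue title: byte 0x80 is the euro sign.
    print(decoded_path({"PATH_INFO": "/\x80"}, "cp1252"))
```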
    On Apr 20, 2016 3:46 AM, "Александр Эри" <report@bugs.python.org> wrote:

    Александр Эри added the comment:

    Why does wsgiref use latin1? It must use utf-8.

    ----------
    keywords: +patch
    nosy: +Александр Эри
    Added file: http://bugs.python.org/file42531/simple_server.py.diff


    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022