Author steve.dower
Recipients BreamoreBoy, David.Sankel, Drekin, Jonitis, akira, amaury.forgeotdarc, christian.heimes, christoph, davidsarah, dead1ne, escapewindow, ezio.melotti, flox, giampaolo.rodola, hippietrail, lemburg, lilydjwg, mark, martin.panter, mhammond, ncoghlan, ned.deily, paul.moore, piotr.dobrogost, pitrou, santoso.wijaya, smerlin, ssbarnea, steve.dower, stijn, terry.reedy, tim.golden, tzot, v+python, wiz21
Date 2016-08-15.04:52:28
Message-id <1471236748.78.0.102773005894.issue1602@psf.upfronthosting.co.za>
Content
For more context, cgi.parse has code like this:

def parse(fp, ...):
    if fp is None:
        fp = sys.stdin

    encoding = getattr(fp, 'encoding', 'latin-1')

    # later on...

    return urllib.parse.parse_qs(a_str, encoding=encoding, ...)

As an easy hack, I added this after assigning encoding:

    if len(' '.encode(encoding, errors='replace')) > 1:
        encoding = 'latin-1'

I have no idea whether this is a good idea. The current behaviour, mojibake in the parsed result, is certainly worse, since the choice of utf-16-le is made entirely inside the parse() function.
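
To make the effect of the check concrete, here is a minimal standalone sketch. The helper name pick_query_encoding and its fallback parameter are made up for illustration and are not part of cgi. It only assumes that a percent-escape decodes to a single byte, so an encoding that needs more than one byte per ASCII character cannot decode the query string correctly.

```python
from urllib.parse import parse_qs


def pick_query_encoding(encoding, fallback="latin-1"):
    """Return `encoding` unless it encodes a space as more than one byte.

    Hypothetical helper for illustration only; it mirrors the hack above
    and is not part of the cgi module.
    """
    if len(" ".encode(encoding, errors="replace")) > 1:
        return fallback
    return encoding


# "%41" is the single byte for "A". A two-byte-per-character encoding
# cannot turn that lone byte back into "A".
query = "a=%41"
broken = parse_qs(query, encoding="utf-16-le")
fixed = parse_qs(query, encoding=pick_query_encoding("utf-16-le"))
print(broken, fixed)
```

Single-byte and ASCII-compatible encodings such as utf-8 or cp1252 pass through unchanged, so the fallback only applies when the stream reports something like utf-16-le.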