
Author davechallis
Recipients davechallis, ezio.melotti
Date 2013-06-10.14:20:36
Message-id <1370874036.73.0.39673641693.issue18183@psf.upfronthosting.co.za>
In-reply-to
Content
A SystemError is raised when decoding invalid UTF-8 bytes with errors='replace' and then calling lower() on the resulting unicode string.

I also tested this in Python 2.7, where the error does not occur.

Code to reproduce:

x = b'\xe2\xb3\x99\xb3\xd1\x9f\xe0vjGd|\x12\xf2\x84\xac\xae&$\xa4\xae+\xa4sbtf$&fG\xfb\xe6?.\xe2sbv\x14\xcb\x89\x98\xda\xd9\x99\xda\xb9d9\x1bY\x99\xb7\xb3\x1b9\xa2y*B\xa3\xba\xefj&g\xe2\x92Et\x85~\xbf\x8a\xe3\x919\x8bvc\xfb#$$.\xber6D&b.#4\xa4.\x13RtI\x10\xed\x9c\xd0\x98\xb8\x18\x91\x99\\\nC\x13\x8dV\xccL\xf4\x89\x9c\x90'

x = x.decode('utf-8', errors='replace')

x.lower()


Output:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: invalid maximum character passed to PyUnicode_New
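
The steps above can be wrapped in a small helper for checking whether a given interpreter is affected. This is only a sketch: the names decode_and_lower and summarize are hypothetical, and the short sample bytes are an assumption chosen for illustration. Whether a shorter input triggers the same error is not established here, so pass in the full byte string from the report to test for the failure itself.

```python
REPLACEMENT = "\ufffd"  # U+FFFD, the character inserted by errors='replace'


def decode_and_lower(data):
    """Decode bytes as UTF-8 with errors='replace', then try lower().

    Returns (text, lowered, error). If lower() raises SystemError,
    lowered is None and error holds the exception (hypothetical helper).
    """
    text = data.decode("utf-8", errors="replace")
    try:
        return text, text.lower(), None
    except SystemError as exc:
        return text, None, exc


def summarize(text):
    """Report basic facts about a decoded string (hypothetical helper)."""
    return {
        "length": len(text),
        "replacements": text.count(REPLACEMENT),
        "max_code_point": max(map(ord, text), default=0),
    }


# Illustrative input only: one invalid byte between two ASCII letters.
sample = b"A\xffB"
text, lowered, error = decode_and_lower(sample)
print(summarize(text))
print("lower() failed:" if error else "lower() succeeded:", error or repr(lowered))
```

On an interpreter that shows the problem, calling decode_and_lower with the bytes from the report should return the SystemError as the third element instead of aborting the script.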
History
Date                 User         Action  Args
2013-06-10 14:20:36  davechallis  set     recipients: + davechallis, ezio.melotti
2013-06-10 14:20:36  davechallis  set     messageid: <1370874036.73.0.39673641693.issue18183@psf.upfronthosting.co.za>
2013-06-10 14:20:36  davechallis  link    issue18183 messages
2013-06-10 14:20:36  davechallis  create