Author rhansen
Recipients ezio.melotti, lemburg, rhansen
Date 2010-01-04.23:56:30
Message-id <1262649393.34.0.802612836062.issue7615@psf.upfronthosting.co.za>
In-reply-to
Content
I thought about raw_unicode_escape more, and there is a way to escape quotes: use Unicode escape sequences (e.g., ur'\u0027'). I've attached a new patch that does the following:

 * backslash-escapes single quotes when encoding with the unicode_escape codec (the original subject of this bug)
 * replaces single quotes with \u0027 when encoding with the raw_unicode_escape codec (a separate bug not related to the original report, but brought up in comments)
 * replaces backslashes with \u005c when encoding with the raw_unicode_escape codec (a separate bug not related to the original report)
 * fixes a corner-case bug where the UTF-16 surrogate pair decoding logic could read past the end of the provided Py_UNICODE character array (a separate bug not related to the original report)
 * eliminates redundant code in PyUnicode_EncodeRawUnicodeEscape() and unicodeescape_string()
 * general cleanup in unicodeescape_string()
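
The first three items can be illustrated in pure Python. This is only a rough sketch of the intended output, not the patch itself (the patch changes the C codec implementation). Both helper names are hypothetical, and the code assumes Python 3, where the ur'' prefix no longer exists, so the raw variant is checked by decoding with the codec rather than by evaluating a literal.

```python
import ast


def escape_unicode_single_quoted(text: str) -> bytes:
    """Hypothetical helper: unicode_escape plus backslash-escaped single quotes."""
    # The stock codec already doubles backslashes, so every backslash left in
    # its output starts an escape; adding \' for quotes stays unambiguous.
    return text.encode("unicode_escape").replace(b"'", b"\\'")


def escape_raw_single_quoted(text: str) -> bytes:
    """Hypothetical helper: raw_unicode_escape with \\u005c and \\u0027 substitutions."""
    # Substitute on the str before encoding, so the \uXXXX sequences the codec
    # itself emits for non-Latin-1 characters are not rewritten afterwards.
    prepared = text.replace("\\", "\\u005c").replace("'", "\\u0027")
    return prepared.encode("raw_unicode_escape")


sample = "it's a \\ test \u20ac"
quoted = escape_unicode_single_quoted(sample)
raw = escape_raw_single_quoted(sample)

# The unicode_escape variant can be placed between single quotes and parsed back.
assert ast.literal_eval("'" + quoted.decode("ascii") + "'") == sample

# The raw variant round-trips through the existing raw_unicode_escape decoder.
assert raw.decode("raw_unicode_escape") == sample
```

In the raw variant the backslash replacement has to run before the quote replacement; otherwise the backslashes introduced by \u0027 would themselves be rewritten.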

The changes in the patch are non-trivial and have only been lightly tested.
History
Date User Action Args
2010-01-04 23:56:33  rhansen  set     recipients: + rhansen, lemburg, ezio.melotti
2010-01-04 23:56:33  rhansen  set     messageid: <1262649393.34.0.802612836062.issue7615@psf.upfronthosting.co.za>
2010-01-04 23:56:31  rhansen  link    issue7615 messages
2010-01-04 23:56:31  rhansen  create