Author ebarry
Recipients ebarry, ezio.melotti, vstinner
Date 2016-06-21.20:34:18
Message-id <1466541260.02.0.320668077341.issue27364@psf.upfronthosting.co.za>
Content
The attached patch deprecates invalid escape sequences in unicode strings. The goal is to prevent issues such as #27356 (and possibly other similar ones) in the future.

Without the patch:

>>> "hello \world"
'hello \\world'

With the patch:

>>> "hello \world"
DeprecationWarning: invalid escape sequence 'w'
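
The behaviour above can also be checked programmatically. The following is a minimal sketch, not part of the patch: it assumes an interpreter that already includes this change, and the helper name `has_invalid_escape` is hypothetical. It compiles a piece of source text while recording warnings and reports whether any of them mentions an escape sequence. The exact warning category and message wording may differ between Python versions, so the check only looks for the phrase "escape sequence".

```python
import warnings


def has_invalid_escape(source):
    """Return True if compiling `source` emits an escape-sequence warning.

    Hypothetical helper for illustration only; it relies on the interpreter
    warning about invalid escapes at compile time.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones that were already shown once.
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return any("escape sequence" in str(w.message) for w in caught)


# The raw-string prefix here keeps this file itself free of invalid escapes;
# the text handed to compile() is the literal source code "hello \world".
print(has_invalid_escape(r'"hello \world"'))    # invalid escape \w
print(has_invalid_escape(r'"hello \\world"'))   # escaped backslash
print(has_invalid_escape(r'r"hello \world"'))   # raw string literal
```

On an interpreter with the change, only the first call should report a warning; the doubled backslash and the raw string literal are both valid.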

I'll need some help (patch isn't mergeable yet):

test_doctest fails on my machine with the patch (and -W), and I don't know how to fix it. test_ast fails an assertion (!PyErr_Occurred() in PyObject_Call in abstract.c) when -W is on, and I also don't know how to fix it (I don't even know what causes it).

Of course, I went ahead and fixed all instances of invalid escape sequences in the stdlib (that I could find) so that no DeprecationWarning is encountered.
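
As an illustration of what such a fix typically looks like (a sketch under the assumption that the offending literal is a regular expression; the strings below are made up and not taken from the stdlib), an invalid escape such as \d can be written either with a doubled backslash or as a raw string. Both spellings produce the same string and neither triggers the warning:

```python
import re

# Two warning-free spellings of the same pattern text: backslash, "d", "+".
doubled = "\\d+"   # backslash escaped explicitly
raw = r"\d+"       # raw string literal, backslash kept as-is

# Both literals denote the identical three-character string.
assert doubled == raw
assert len(raw) == 3

# The pattern behaves the same either way.
print(re.findall(raw, "issue 27364 and issue 27356"))
```

The raw-string form is usually the less error-prone choice for regular expressions and Windows-style paths, since it needs no per-backslash escaping.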

Lastly, I thought about applying the same deprecation to bytes, but I ran into issues with some invalid escapes such as \u, and _codecs.escape_decode would trigger the warning when passed br"\8" (for example). Ultimately, I decided to leave bytes alone for now, since they are mostly on the lower-level side of things. If there's interest I can add it back.