classification
Title: Warn about octal escapes > 0o377 in re
Type: enhancement Stage: resolved
Components: Library (Lib), Regular Expressions Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: ezio.melotti, mrabarnett, pitrou, python-dev, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2014-09-08 11:07 by serhiy.storchaka, last changed 2014-09-23 20:28 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
re_octal_escape_overflow.patch serhiy.storchaka, 2014-09-08 11:07
re_octal_escape_overflow_raise.patch serhiy.storchaka, 2014-09-11 20:34
Messages (11)
msg226570 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-09-08 11:07
Currently the re module accepts octal escapes from \400 to \777, but ignores the highest bit.

>>> re.search(r'\542', 'abc')
<_sre.SRE_Match object; span=(1, 2), match='b'>

This behavior is surprising and is inconsistent with the regex module, which preserves the highest bit. Such escapes are not portable across different regular expression engines.

I propose adding a warning when an octal escape value is larger than 0o377. Here is a preliminary patch which adds a UserWarning. Or might it be better to emit a DeprecationWarning and then replace it with a ValueError in future releases?
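For readers following along, the match on 'b' above can be explained by simple arithmetic. The sketch below only reproduces that arithmetic; it is an illustration, not the code used inside the re module, and the variable names are made up for this example.

```python
# Illustration only: reproduces the arithmetic of the old masking behavior,
# not the actual code in the re module.
value = 0o542            # 354 in decimal, too large to fit in one byte
masked = value & 0xFF    # dropping the ninth bit leaves 0o142 (98)

# 98 is the code point of 'b', which is why r'\542' matched 'b' in 'abc'.
print(oct(value), oct(masked), chr(masked))
```

In other words, \542 silently behaved like \142.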
msg226798 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-09-11 19:20
I think we should simply raise ValueError in 3.5. There's no reason to accept such invalid escapes.
msg226801 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-09-11 20:34
Well, here is a patch which makes re raise an exception on suspicious octals.
msg226809 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-09-12 07:36
re_octal_escape_overflow_raise.patch: you should write a subfunction so that the error message is not repeated 3 times.

+            if c > 0o377:

Hum, I never use octal. 255 instead of 0o377 would be less surprising :-p By the way, you should also check for negative numbers.

>>> -3 & 0xff
253

Before, "& 0xff" also converted negative numbers to positive in range 0..255.
msg226826 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-09-12 16:29
> By the way, you should also check for negative numbers.

Not in this case. You can't construct a negative number from three octal digits.
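A quick way to check this claim is to enumerate every three-digit octal string and look at the range of values. This is a small sketch written for this discussion, not part of either patch.

```python
from itertools import product

# Every escape of the form \ddd, with each d an octal digit.
values = [int("".join(digits), 8) for digits in product("01234567", repeat=3)]

smallest, largest = min(values), max(values)

# Values above 0o377 are the ones the issue is about: \400 through \777.
overflowing = [v for v in values if v > 0o377]

print(smallest, largest, len(overflowing))
```

The values run from 0 to 0o777, so only the upper bound needs a check.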
msg227036 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-09-18 10:03
Warning or exception? This is a question.
msg227039 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-09-18 12:44
> Warning or exception? This is a question.

Using -Werror, warnings raise exceptions :-)
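As a sketch of what this means in practice: the "error" warnings filter, which is what -Werror installs, turns an emitted warning into a raised exception. The function `parse_with_warning` below is a hypothetical stand-in for a parser that warns; it is not taken from either patch.

```python
import warnings


def parse_with_warning():
    # Hypothetical stand-in for a parser that warns instead of raising.
    warnings.warn("octal escape value outside of range 0-0o377", UserWarning)
    return "parsed"


with warnings.catch_warnings():
    # Same effect, within this block, as running the interpreter with -Werror.
    warnings.simplefilter("error")
    try:
        parse_with_warning()
        outcome = "no exception"
    except UserWarning:
        outcome = "raised"

print(outcome)
```

Without the filter, the same call would only emit the warning and return normally.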
msg227040 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-09-18 13:17
This is an error, so it should really be an exception. There's no use case for being lenient, IMO.
msg227238 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-09-21 20:50
If this is an error, should the patch be applied to maintained releases?
msg227386 - (view) Author: Roundup Robot (python-dev) Date: 2014-09-23 20:26
New changeset 3b32f495fb38 by Serhiy Storchaka in branch 'default':
Issue #22362: Forbidden ambiguous octal escapes out of range 0-0o377 in
https://hg.python.org/cpython/rev/3b32f495fb38
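A minimal way to observe the resulting behavior, assuming Python 3.5 or later, where an out-of-range octal escape is rejected with re.error when the pattern is compiled. The helper name `compiles` is made up for this example.

```python
import re


def compiles(pattern):
    """Return True if the pattern compiles, False if re rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


in_range = compiles(r'\377')      # largest value that fits in 0-0o377
out_of_range = compiles(r'\542')  # the example from the first message

print(in_range, out_of_range)
```

On interpreters older than 3.5 the second pattern would still compile and match 'b', as shown in the original report.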
msg227387 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-09-23 20:28
Thanks Antoine and Victor for the review.
History
Date                 User              Action  Args
2014-09-23 20:28:00  serhiy.storchaka  set     status: open -> closed; resolution: fixed; messages: + msg227387; stage: patch review -> resolved
2014-09-23 20:26:03  python-dev        set     nosy: + python-dev; messages: + msg227386
2014-09-21 20:50:24  serhiy.storchaka  set     messages: + msg227238
2014-09-18 13:17:06  pitrou            set     messages: + msg227040
2014-09-18 12:44:27  vstinner          set     messages: + msg227039
2014-09-18 10:03:23  serhiy.storchaka  set     assignee: serhiy.storchaka; messages: + msg227036
2014-09-12 16:29:26  serhiy.storchaka  set     messages: + msg226826
2014-09-12 07:36:30  vstinner          set     nosy: + vstinner; messages: + msg226809
2014-09-11 20:34:36  serhiy.storchaka  set     files: + re_octal_escape_overflow_raise.patch; messages: + msg226801
2014-09-11 19:20:11  pitrou            set     messages: + msg226798
2014-09-08 11:07:20  serhiy.storchaka  create