classification
Title: sre "bytecode" verifier
Type: Stage:
Components: Versions: Python 3.0, Python 2.6
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: gvanrossum Nosy List: barry, gregory.p.smith, gvanrossum, jcea, pitrou, terry.reedy
Priority: high Keywords: patch

Created on 2008-08-01 17:53 by gvanrossum, last changed 2008-10-13 20:03 by jcea. This issue is now closed.

Files
File name Uploaded Description Edit
sre-verifier.diff gvanrossum, 2008-08-01 17:53 patch relative to Python 2.6
Messages (7)
msg70574 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-08-01 17:53
Attached is a verifier for the binary code used by the _sre module (this
is often called bytecode, though to distinguish it from Python bytecode
I put it in quotes).

I wrote this for Google App Engine, and am making the patch available as
open source under the Apache 2 license.  Below are the copyright
statement and license, for completeness.

Barry, I'm assigning this to you only so that you can decide whether
this is okay to check in before beta3.  The patch works for 2.6 as well
as for 3.0 (and even for 2.5, but I don't think it's worth changing that
-- or is it?).  If you agree (which I hope you will do), please assign
back to me with instructions and I'll submit it.  The performance impact
is minimal: it only runs when the compiled "bytecode" is passed from the
re compiler (written in Python) to the C code, during the sre
compilation stage.  All tests pass.  We've had this code in use in App
Engine since April without complaints.

# Copyright 2008 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

It's not necessary to include these copyrights and bytecode in the
source file.  Google has signed a contributor's agreement with the PSF
already.
msg70596 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2008-08-01 21:46
Based on my understanding of the above and PyDev discussions, I see the
arguments in favor of quick inclusion as being the following:
1. This will be user invisible, so it is not a new interface feature.
2. This will prevent possible interpreter crashes or other bad behavior
as a result of malformed SREcode.  So it could be considered a bug fix.
3. Google considered this enough of a potential problem to pre-emptively
fix it.  Now that that problem has been publicly exposed, other careful
users will expect it to be fixed and will find Python more attractive
when it has been.

If this is included in the next betas, the announcement of such might
say so and encourage re users to re-run any re-based test code.
msg70606 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-08-01 23:41
> 3. Google considered this enough of a potential problem to pre-emptively
> fix it.  Now that that problem has been publicly exposed, other careful
> users will expect it to be fixed and will find Python more attractive
> when it has been.
> 
> If this is included in the next betas, the announcement of such might
> say so and encourage re users to re-run any re-based test code.

I should add that the protection this offers is against attempts to
cause crashes by passing bad RE "bytecode" into the _sre.compile().

It is not possibly to generate such bad RE "bytecode" by writing an evil
regular expression; you must have access to the _sre module in order to
be able to exploit this vulnerability.  In other words, the
vulnerability is equivalent to having ctypes accessible.

Thus, only people who are worried about malicious use of ctypes should
be worried about this vulnerability.  Google's App Engine is one of
those (rare) places, since it lets anybody run their Python code in a
Google datacenter.  If you offer the ability to run arbitrary Python
code to strangers, you should worry about this.  Otherwise, there is no
reason to worry.
msg70682 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-08-04 04:01
+1  I'd like to see this make it in.
msg70692 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-04 09:35
Shouldn't there be any unit tests? :)
msg70695 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2008-08-04 12:26
Guido, this is fine for 3.0 and 2.6.  As Terry points out, it's not user
visible and it improves reliability.  I'm -0 on backporting it to 2.5,
but don't really feel strongly about that.

Go for it!
msg70729 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-08-05 03:40
Submitted to 2.6 as r65544.

Will propagate to 3.0 as it gets merged -- should be a perfect merge.

Antoine: the re module has tons of unittests; showing that attempts to
break in are thwarted would be pretty boring. ;-)
History
Date User Action Args
2008-10-13 20:03:59jceasetkeywords: patch, patch
nosy: + jcea
2008-08-05 03:40:36gvanrossumsetstatus: open -> closed
keywords: patch, patch
messages: + msg70729
2008-08-04 12:26:37barrysetversions: - Python 2.5
messages: + msg70695
priority: release blocker -> high
assignee: barry -> gvanrossum
keywords: patch, patch
resolution: accepted
2008-08-04 09:35:23pitrousetkeywords: patch, patch
nosy: + pitrou
messages: + msg70692
2008-08-04 04:01:18gregory.p.smithsetkeywords: patch, patch
nosy: + gregory.p.smith
messages: + msg70682
2008-08-01 23:41:13gvanrossumsetkeywords: patch, patch
messages: + msg70606
2008-08-01 21:46:30terry.reedysetkeywords: patch, patch
nosy: + terry.reedy
messages: + msg70596
2008-08-01 17:53:31gvanrossumcreate