classification
Title: Unsigned Integer Overflow in sre_lib.h
Type: crash Stage: resolved
Components: Regular Expressions Versions: Python 3.6, Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: duplicate
Dependencies: Superseder: Pointers point out of array bound in _sre.c
View: 18684
Assigned To: serhiy.storchaka Nosy List: BreamoreBoy, bee13oy, ezio.melotti, mrabarnett, serhiy.storchaka, vstinner
Priority: normal Keywords:

Created on 2015-07-05 02:49 by bee13oy, last changed 2015-07-07 13:34 by serhiy.storchaka. This issue is now closed.

Messages (10)
msg246290 - (view) Author: bee13oy (bee13oy) Date: 2015-07-05 02:49
I found an Unsigned Integer Overflow in sre_lib.h.

Tested on En Windows 7 x86 + Python 3.4.3 / Python 3.5.0b2

Crash:
------
(1a84.16b0): Access violation - code c0000005 (!!! second chance !!!)
eax=00000002 ebx=0038f40c ecx=00000002 edx=0526cbb8 esi=83e0116b edi=c3e011eb
eip=58bcfa53 esp=0038f384 ebp=0038f394 iopl=0         nv up ei ng nz na po cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010283
python35+0x1fa53:
58bcfa53 380e            cmp     byte ptr [esi],cl          ds:002b:83e0116b=??

code: 
------
58bcfa3d 8b4a04          mov     ecx,dword ptr [edx+4]
58bcfa40 0fb6c1          movzx   eax,cl
58bcfa43 3bc1            cmp     eax,ecx
58bcfa45 0f8593000000    jne     python35+0x1fade (58bcfade)
58bcfa4b 3bf7            cmp     esi,edi
58bcfa4d 0f838b000000    jae     python35+0x1fade (58bcfade)
58bcfa53 380e            cmp     byte ptr [esi],cl          ds:002b:83e0116b=??
58bcfa55 0f8583000000    jne     python35+0x1fade (58bcfade)

stack:
------
0:000> kb
ChildEBP RetAddr  Args to Child              
WARNING: Stack unwind information not available. Following frames may be wrong.
0038f394 58bcfedf 40000080 0038f40c 83e0116c python35+0x1fa53
0038f3c0 58bd0f58 00000000 06016508 0526cb60 python35+0x1fedf
0038f400 58bd5039 58e40c58 83e0116b 03e01158 python35+0x20f58
0038f480 58bd76b2 00000000 7fffffff 00000000 python35+0x25039
0038f4a4 58c925cf 0526cb60 0528a4d0 00000000 python35+0x276b2
0038f4c4 58cf3633 06016508 0528a4d0 00000000 python35!PyCFunction_Call+0x2f
0038f4f8 58cf0b05 05840f90 03e0ab90 00000001 python35!PyEval_GetFuncDesc+0x373
0038f570 58cf3791 03e0ab90 00000000 00000001 python35!PyEval_EvalFrameEx+0x22d5
0038f594 58cf3692 00000001 00000001 00000000 python35!PyEval_GetFuncDesc+0x4d1
0038f5c8 58cf0b05 03e08de0 0012e850 00000000 python35!PyEval_GetFuncDesc+0x3d2
0038f640 58cf25bb 0012e850 00000000 065feff0 python35!PyEval_EvalFrameEx+0x22d5
0038f68c 58d29302 03dcfaa8 00000000 00000000 python35!PyEval_EvalFrameEx+0x3d8b
0038f6c8 58d29195 03dcfaa8 03dcfaa8 0038f790 python35!PyRun_FileExFlags+0x1f2
0038f6f4 58d2820a 05994fc8 052525a8 00000101 python35!PyRun_FileExFlags+0x85
0038f738 58bfe9f7 05994fc8 052525a8 00000001 python35!PyRun_SimpleFileExFlags+0x20a
0038f764 58bff32b 0038f790 5987b648 5987cc94 python35!Py_hashtable_copy+0x5e17
0038f808 1c6f11df 00000003 05796f70 05210f50 python35!Py_Main+0x90b

source code:

LOCAL(Py_ssize_t)
SRE(search)(SRE_STATE* state, SRE_CODE* pattern)
{
    SRE_CHAR* ptr = (SRE_CHAR *)state->start;
    SRE_CHAR* end = (SRE_CHAR *)state->end;
    Py_ssize_t status = 0;
    Py_ssize_t prefix_len = 0;
    Py_ssize_t prefix_skip = 0;
    SRE_CODE* prefix = NULL;
    SRE_CODE* charset = NULL;
    SRE_CODE* overlap = NULL;
    int flags = 0;

    if (pattern[0] == SRE_OP_INFO) {
        /* optimization info block */
        /* <INFO> <1=skip> <2=flags> <3=min> <4=max> <5=prefix info>  */

        flags = pattern[2];

        if (pattern[3] > 1) {
            /* adjust end point (but make sure we leave at least one
               character in there, so literal search will work) */
            end -= pattern[3] - 1;
            if (end <= ptr)
                end = ptr;
        }
		...
	}
	
	...
	
	} else
		/* general case */
		while (ptr <= end) {
			TRACE(("|%p|%p|SEARCH\n", pattern, ptr));
			state->start = state->ptr = ptr++;
			status = SRE(match)(state, pattern, 0);
			if (status != 0)
				break;
    }
}

SRE(count)(SRE_STATE* state, SRE_CODE* pattern, Py_ssize_t maxcount)
{
    SRE_CODE chr;
    SRE_CHAR c;
    SRE_CHAR* ptr = (SRE_CHAR *)state->ptr;
    SRE_CHAR* end = (SRE_CHAR *)state->end;
    Py_ssize_t i;

    /* adjust end */
    if (maxcount < end - ptr && maxcount != SRE_MAXREPEAT)
        end = ptr + maxcount;
		
    ...
	
#if SIZEOF_SRE_CHAR < 4
        if ((SRE_CODE) c != chr)
            ; /* literal can't match: doesn't fit in char width */
        else
#endif
        while (ptr < end && *ptr == c) // crash here, ptr points to an unreadable memory.
            ptr++;
        break;
}

poc code:
---cut----
import re

pattern = "([\\2]{1073741952})"
regexp = re.compile(r''+pattern+'')
sgroup = regexp.search(pattern)

---cut---

1.) In SRE(search), pattern[3] is equal to 1073741952 (0x400000080). What's more, the program doesn't limit the max size, which causes the end pointer is pointed to an invalid and large address( bigger than ptr).
2.) Then program run while (ptr <= end) { state->start = state->ptr = ptr++,..} , but state->end pointer is the orignal value.3.) After a while's running, it comes to SRE(count) and adjust the end, end - ptr = 0x7fffffff, which is largger than 0x400000080, ptr has been pointed to an invalid address.
3.) After a while, it runs to function SRE(count) and adjust the end, end - ptr = 0x7fffffff, which is largger than 0x400000080, ptr has been pointed to an invalid address.
msg246294 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-05 05:13
Does the patch for issue18684 fix this issue?
msg246313 - (view) Author: bee13oy (bee13oy) Date: 2015-07-05 13:31
I didn't test that path, I just found this bug in python3.4.3 by fuzzing re module, and tested Python 3.5.0b2 on windows 7 x86, It has the same problem.
msg246314 - (view) Author: bee13oy (bee13oy) Date: 2015-07-05 13:40
I have just tested python 2.7.10 on Windows 7 x86 with the poc code, it will also result in python crash.
msg246325 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-05 17:04
Not having Windows I can't reproduce the crash. Someone should test if the patch for issue18684 fixes this issue and doesn't introduce other regressions.
msg246339 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2015-07-05 21:09
Fixed by the patch on issue18684, see also my comments there.
msg246347 - (view) Author: bee13oy (bee13oy) Date: 2015-07-06 03:23
I tested this path, and It really fixed this issue. But I'm wondering Python 2.7.10 was released at May 23, 2015, and this path was created at March 22,2015. So does it mean, Python 2.7.10/3.5.0b2 was compiled and released without applying this path?
msg246354 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-06 10:53
Yes, this patch was not applied because it had no visible effect on Linux. Now, with your report, there is a case on Windows.
msg246413 - (view) Author: bee13oy (bee13oy) Date: 2015-07-07 13:15
Thank you. I got it.

2015-07-06 18:53 GMT+08:00 Serhiy Storchaka <report@bugs.python.org>:

>
> Serhiy Storchaka added the comment:
>
> Yes, this patch was not applied because it had no visible effect on Linux.
> Now, with your report, there is a case on Windows.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue24566>
> _______________________________________
>
msg246415 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-07 13:34
Thank you for your report. Without your example the patch would postponed indefinitely.
History
Date User Action Args
2015-07-07 13:34:34serhiy.storchakasetmessages: + msg246415
2015-07-07 13:15:46bee13oysetmessages: + msg246413
2015-07-06 10:53:03serhiy.storchakasetmessages: + msg246354
2015-07-06 10:52:50serhiy.storchakasetmessages: - msg246353
2015-07-06 10:52:21serhiy.storchakasetstatus: open -> closed

assignee: serhiy.storchaka
versions: + Python 2.7, Python 3.6
messages: + msg246353
superseder: Pointers point out of array bound in _sre.c
resolution: duplicate
stage: resolved
2015-07-06 03:23:06bee13oysetmessages: + msg246347
2015-07-05 21:09:53BreamoreBoysetnosy: + BreamoreBoy
messages: + msg246339
2015-07-05 17:04:29serhiy.storchakasetmessages: + msg246325
2015-07-05 13:40:26bee13oysetmessages: + msg246314
2015-07-05 13:31:31bee13oysetmessages: + msg246313
2015-07-05 05:13:32serhiy.storchakasetmessages: + msg246294
2015-07-05 02:59:28ezio.melottisetnosy: + vstinner, serhiy.storchaka
type: crash
2015-07-05 02:49:21bee13oycreate