Message246188
#Python logic error when dealing with re and multi-threading
##Bug Description
Using re together with multi-threading triggers the bug.
Bug type: `Logic Error`
Test Environment:
* `Windows 7 SP1 x64 + python 3.4.3`
* `Linux kali 3.14-kali1-amd64 + python 2.7.3`
-----------------------------Normal Case------------------------
- 1. main-thread: join(timeout), wait for the sub-thread to finish -
- 2. sub-thread: while(1), an infinite loop -
----------------------------------------------------------------
Test Code:
#!/usr/bin/python
__author__ = 'bee13oy'
import re
import threading

timeout = 2
source = "(.*(.)?)*bcd\\t\\n\\r\\f\\a\\e\\071\\x3b\\$\\\\\?caxyz"

def run(source):
    # Infinite loop in pure Python code.
    while(1):
        print("test1")

def handle():
    try:
        t = threading.Thread(target=run,args=(source,))
        t.setDaemon(True)
        t.start()
        t.join(timeout)  # wait at most `timeout` seconds for the sub-thread
        print("thread finished...It's an normal case!\n")
    except:
        print("exception ...\n")

handle()
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-----------------------------Bug Case-----------------------------------------------------------------------------
- 1. main-thread: join(timeout), wait for the sub-thread to finish -
- 2. sub-thread: 1) we construct the special pattern "(.*(.)?)*bcd\\t\\n\\r\\f\\a\\e\\071\\x3b\\$\\\\\?caxyz" -
                 2) regexp.search() cannot cope with it and hangs -
                 3) when join(timeout) expires because the sub-thread has run over time, the main thread should -
                    regain control of the program, but it does not. -
------------------------------------------------------------------------------------------------------------------
POC:
#!/usr/bin/python
__author__ = 'bee13oy'
import re
import os
import threading

timeout = 2
source = "(.*(.)?)*bcd\\t\\n\\r\\f\\a\\e\\071\\x3b\\$\\\\\?caxyz"

def run(source):
    # The pattern is searched against its own text; search() hangs here.
    regexp = re.compile(r''+source+'')
    sgroup = regexp.search(source)

def handle():
    try:
        t = threading.Thread(target=run,args=(source,))
        t.setDaemon(True)
        t.start()
        t.join(timeout)  # expected to return after `timeout` seconds, but never does
        print("finished...\n")
    except:
        print("exception ...\n")

handle()
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
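The POC above never finishes, which makes the effect hard to measure. The following is a minimal sketch of a bounded variant, written for Python 3. The pattern `(a+)+$`, the subject string, and the helper name `measure_join` are assumptions chosen for illustration and do not come from the report. The sketch times how long `join(timeout)` really takes while a slow regex search runs in the sub-thread. On a CPython build with a GIL, the join time would be expected to follow the match time rather than the requested timeout; the exact numbers depend on the machine and the interpreter.

```python
import re
import threading
import time


def measure_join(n=22, timeout=0.1):
    """Return (join_seconds, match_seconds) for a slow regex run in a thread.

    The pattern and subject are illustrative assumptions, not the report's POC.
    """
    pattern = re.compile(r"(a+)+$")   # nested quantifier: heavy backtracking
    subject = "a" * n + "b"           # the trailing "b" forces the search to fail
    timings = {}

    def worker():
        start = time.perf_counter()
        pattern.search(subject)
        timings["match"] = time.perf_counter() - start

    t = threading.Thread(target=worker)
    t.daemon = True
    start = time.perf_counter()
    t.start()
    t.join(timeout)                   # asks to wait at most `timeout` seconds
    join_seconds = time.perf_counter() - start
    t.join()                          # let the worker finish so timings is filled
    return join_seconds, timings["match"]


if __name__ == "__main__":
    join_seconds, match_seconds = measure_join()
    print("join(0.1) returned after %.2fs; the match took %.2fs"
          % (join_seconds, match_seconds))
```

Increasing `n` by one roughly doubles the match time, so the value should be raised carefully.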
----------------------------------------------------------------
- Bug Analysis -
----------------------------------------------------------------
With Python multithreading, `join(timeout)` is used to wait until the **thread terminates** or the wait **times out**.
1. In the normal case, the sub-thread runs a while() loop, and the main thread regains control of the program once the timeout expires.
2. In our POC, the main thread cannot continue even after the timeout expires. After analyzing, I found that the main thread is trapped in an infinite loop as well.
The program first enters the sub-thread, which never ends normally.
Meanwhile, join(timeout) waits for the sub-thread to return or for the timeout to expire, so that the main thread can regain control of the program.
The bug is that the sub-thread is stuck in an infinite loop and the main thread is stuck in an infinite loop too, which causes the program to hang.
By analyzing the Python source code, we found that:
- the sub-thread is stuck in an infinite loop (code block 0)
- the main thread is stuck in an infinite loop (code block 1)
-----------------------------code block 0----------------------------------
- the following code is where the sub-thread is trapped in an infinite loop: -
---------------------------------------------------------------------------
```
LOCAL(Py_ssize_t)
SRE(match)(SRE_STATE* state, SRE_CODE* pattern, int match_all)
{
    SRE_CHAR* end = (SRE_CHAR *)state->end;
    Py_ssize_t alloc_pos, ctx_pos = -1;
    Py_ssize_t i, ret = 0;
    Py_ssize_t jump;
    unsigned int sigcount=0;
    SRE(match_context)* ctx;
    SRE(match_context)* nextctx;
    TRACE(("|%p|%p|ENTER\n", pattern, state->ptr));
    DATA_ALLOC(SRE(match_context), ctx);
    ctx->last_ctx_pos = -1;
    ctx->jump = JUMP_NONE;
    ctx->pattern = pattern;
    ctx->match_all = match_all;
    ctx_pos = alloc_pos;
    .....
    /* Loop which never returns in the bug case */
    for (;;) {
        ++sigcount;
        if ((0 == (sigcount & 0xfff)) && PyErr_CheckSignals())
            RETURN_ERROR(SRE_ERROR_INTERRUPTED);
        switch (*ctx->pattern++) {
        case SRE_OP_MARK:
            /* set mark */
            /* <MARK> <gid> */
            TRACE(("|%p|%p|MARK %d\n", ctx->pattern,
                   ctx->ptr, ctx->pattern[0]));
    .....
}
```
-----------------------------code block 1----------------------------------
- the following code is where the main-thread is trapped in an infinite loop: -
---------------------------------------------------------------------------
static void take_gil(PyThreadState *tstate)
{
    int err;
    if (tstate == NULL)
        Py_FatalError("take_gil: NULL tstate");
    err = errno;
    MUTEX_LOCK(gil_mutex);
    if (!_Py_atomic_load_relaxed(&gil_locked))
        goto _ready;
    /* Loop which never returns in the bug case */
    while (_Py_atomic_load_relaxed(&gil_locked)) {
        int timed_out = 0;
        unsigned long saved_switchnum;
        saved_switchnum = gil_switch_number;
        COND_TIMED_WAIT(gil_cond, gil_mutex, INTERVAL, timed_out);
        /* If we timed out and no switch occurred in the meantime, it is time
           to ask the GIL-holding thread to drop it. */
        if (timed_out &&
            _Py_atomic_load_relaxed(&gil_locked) &&
            gil_switch_number == saved_switchnum) {
            SET_GIL_DROP_REQUEST();
        }
    }
    .....
}
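The wait loop in code block 1 can be imitated with a toy model in Python 3. This is only a sketch under assumptions: the class `ToyGil`, its attributes, and the `max_rounds` escape hatch are hypothetical names invented here, the switch-number check is left out, and none of this is CPython's real implementation. It illustrates that a timed wait followed by a drop request only helps when the current holder eventually honors the request.

```python
import threading


class ToyGil:
    """Toy model of the wait loop in take_gil(); not CPython's real code."""

    def __init__(self):
        self.cond = threading.Condition()
        self.locked = False
        self.drop_request = False

    def take(self, interval=0.005, max_rounds=None):
        """Try to take the lock; give up after max_rounds timed waits.

        max_rounds=None waits forever, like the loop in code block 1.
        """
        rounds = 0
        with self.cond:
            while self.locked:
                if max_rounds is not None and rounds >= max_rounds:
                    return False  # escape hatch that the modelled loop lacks
                signalled = self.cond.wait(interval)
                rounds += 1
                if not signalled and self.locked:
                    # Timed out: ask the holder to drop the lock. This has an
                    # effect only if the holder ever looks at the flag.
                    self.drop_request = True
            self.locked = True
            self.drop_request = False
            return True

    def release(self):
        with self.cond:
            self.locked = False
            self.cond.notify()


if __name__ == "__main__":
    gil = ToyGil()
    gil.take()                      # a "sub-thread" takes the lock and keeps it
    print(gil.take(max_rounds=3))   # the waiter gives up after three timed waits
    print(gil.drop_request)         # the request was made but nobody honored it
```

A holder that never calls `release()` and never checks `drop_request` plays the role of the sub-thread in code block 0; without `max_rounds`, `take()` would then wait forever, which corresponds to the hang described in this report.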
Date | User | Action | Args
2015-07-03 15:51:38 | bee13oy | set | recipients: + bee13oy, ezio.melotti, mrabarnett, r.david.murray
2015-07-03 15:51:38 | bee13oy | set | messageid: <1435938698.78.0.15393660912.issue24555@psf.upfronthosting.co.za>
2015-07-03 15:51:38 | bee13oy | link | issue24555 messages
2015-07-03 15:51:37 | bee13oy | create |