There's a reproducible bug in textio.c that causes a double DECREF on codecs. The conditions needed to trigger it are probably rare in real life, so it is not remotely exploitable (a sandbox escape is the worst I can think of on its own, and I'm not aware of any on 3.x):
* You need to create a TextIOWrapper wrapping a file-like object that only partially supports the protocol. For example, supporting readable(), writable(), and seekable() but not tell().
The crash I see most of the time appears to be caused by the freed memory being reused, so that the PyObject ob_type field is no longer a valid pointer.
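For readers without the attachment, the trigger condition described above can be sketched roughly as follows. This is not the attached script, only an illustration under assumptions: the names PartialFile and try_wrap are hypothetical, and the choice of ten attempts is arbitrary. On a build without the bug each attempt should simply raise an ordinary exception; on an affected build the loop may corrupt memory or crash the interpreter instead.

```python
import io


class PartialFile:
    """Hypothetical stand-in for a file-like object that implements
    readable(), writable() and seekable(), but deliberately not tell()."""

    def readable(self):
        return True

    def writable(self):
        return True

    def seekable(self):
        return True


def try_wrap(attempts=10):
    """Instantiate TextIOWrapper repeatedly; return the last exception seen.

    Each failed __init__ goes through the error path in textiowrapper_init.
    Repeating it gives the freed codec_info memory a chance to be reused.
    """
    last_error = None
    for _ in range(attempts):
        try:
            io.TextIOWrapper(PartialFile(), encoding="utf-8")
        except Exception as exc:
            last_error = exc
    return last_error


if __name__ == "__main__":
    # On an affected build this may crash instead of printing anything.
    print(type(try_wrap()).__name__)
```

Only run this against an affected interpreter in a throwaway process, since the failure mode there is memory corruption rather than a Python-level error.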
Affected:
Source 3.5.0a0 (latest default branch yesterday, 524a004e93dd)
Archlinux: 3.3.5 and 3.4.2
Ubuntu: 3.4.0
Unaffected:
Centos: 3.3.2
All 2.7 branch (doesn't contain the faulty commit)
Here's where it's introduced -- https://hg.python.org/cpython/rev/f3ec00d2b75e/#l5.76
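/* Modules/_io/textio.c line 1064 */
    Py_DECREF(codec_info);
    /* does not set codec_info = NULL; the pointer is now dangling */
    ...
    if (...) goto error;
    ...
error:
    /* codec_info is still non-NULL here, so it is decref'd a second time */
    Py_XDECREF(codec_info);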
The attached script is close to minimal -- I think at most you can reduce it by one TextIOWrapper instantiation. A sample stack trace follows; it was taken after the corruption occurs, on a subsequent access to v->ob_type (which is invalid).
#0 0x00000000004c8829 in PyObject_GetAttr (v=<unknown at remote 0x7ffff7eb9688>,
name='_is_text_encoding') at Objects/object.c:872
#1 0x00000000004c871d in _PyObject_GetAttrId (v=<unknown at remote 0x7ffff7eb9688>,
name=0x945d50 <PyId__is_text_encoding.10143>) at Objects/object.c:835
#2 0x00000000005c6674 in _PyCodec_LookupTextEncoding (
encoding=0x7ffff6f40220 "utf-8", alternate_command=0x6c2fcd "codecs.open()")
at Python/codecs.c:541
#3 0x000000000064286e in textiowrapper_init (self=0x7ffff7f9ecb8,
args=(<F at remote 0x7ffff6f40a18>,), kwds={'encoding': 'utf-8'})
at ./Modules/_io/textio.c:965