This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: PyString_FromStringAndSize() to be considered unsafe
Type: security Stage:
Components: Interpreter Core Versions: Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: gregory.p.smith Nosy List: amaury.forgeotdarc, belopolsky, chmod007, gregory.p.smith, gvanrossum, iankko, jnferguson, psss
Priority: high Keywords:

Created on 2008-04-08 15:49 by jnferguson, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
python-2.5.2-zlib-unflush-misallocation.py jnferguson, 2008-04-08 16:27
fromStringAndSize.c iankko, 2009-04-29 15:06 python-CVE-2008-1887-test
Messages (28)
msg65172 - (view) Author: Justin Ferguson (jnferguson) Date: 2008-04-08 15:49
The PyString_FromStringAndSize() function takes a pointer and signed
integer as input parameters however it fails to adequately check the
sanity of the integer argument. Because of the failure to check for
negative values and because it sums the integer with the size of the
PyStringObject structure it becomes possible for the allocator to take
either of the code paths in PyObject_MALLOC()-- both of which will
incorrectly allocate memory.

This may not seem like a big deal, but I'm posting this instead of
filing a bug for every place this screws you guys over.

if (0 > len || len > PYSSIZE_T_MAX/sizeof(PyStringObject)) 
        return NULL;
msg65181 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-04-08 16:17
This is done already:
the second line in PyString_FromStringAndSize() is
    assert(size>=0);
You have to build python in debug mode though...

Oh, I realize this is not a real patch: no error is raised, and why
PYSSIZE_T_MAX/sizeof(PyStringObject), when the allocation is    
PyObject_MALLOC(sizeof(PyStringObject)+size)?
msg65185 - (view) Author: Justin Ferguson (jnferguson) Date: 2008-04-08 16:23
The problem with assert()'s is they require debugging to be enabled,
obviously, who compiles it that way?

You may not even want to worry about the second check, when you pass it
into the allocator it gets converted to an unsigned int which will cause
the allocator to either fail (32-bit) or allocate more memory than
expected-- either cause it handled/benign.

If you'd prefer I can file every place where this actually bites you
guys, I was just trying to be nice.
msg65187 - (view) Author: Justin Ferguson (jnferguson) Date: 2008-04-08 16:27
Adding a poc from 2586 to demonstrate my point, this causes a call to
the allocator requesting zero bytes.
msg65197 - (view) Author: Justin Ferguson (jnferguson) Date: 2008-04-08 17:06
As an addemdum, consider the following code (theres no assert, but it
wouldnt have helped you outside of debug builds anyways):


488 static PyObject *PySSL_SSLread(PySSLObject *self, PyObject *args)
489 {
490         PyObject *buf;
491         int count = 0;
492         int len = 1024;
[...]
496         if (!PyArg_ParseTuple(args, "|i:read", &len))
497                 return NULL;
498 
499         if (!(buf = PyString_FromStringAndSize((char *) 0, len)))
500                 return NULL;
[...]
521                 count = SSL_read(self->ssl, PyString_AsString(buf),
len);
msg65237 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-09 15:04
I agree that PySSL_SSLread should check that its argument is >= 0.  I 
don't think this check belongs to PyString_FromStringAndSize.   It 
should be the responsibility of the caller to verify that the 
precondition size >= 0 is satisfied before calling PyString_FromStringAndSize.  Oftentimes the caller can ascertain size >= 
0 without an explicit check, for example, if size is the size of a 
buffer or length of a valid string object.

On the other hand, an external input such as the len argument to the 
read function should be checked before used.

I would also suggest changing the len type from int to Py_ssize_t.

I agree with Amaury that assert(size>=0) is sufficient in PyString_FromStringAndSize(), but its documentation should emphasize 
that the caller is responsible for assuring that the requested length is  
nonnegative.

I don't think the upper bound check is necessary: sizeof(PyStringObject) 
+ size will not wrap around as long as size >= 0.
msg65241 - (view) Author: Justin Ferguson (jnferguson) Date: 2008-04-09 17:14
While I think its a mistake that will cause security concerns in Python
for quite some time to come, thats fair enough.

I will refile all the places that this messes up individually
msg65243 - (view) Author: Justin Ferguson (jnferguson) Date: 2008-04-09 17:20
Do I need to create proof of concepts for each of these bugs, or can I
reference this ticket?
msg65250 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-09 18:31
On Wed, Apr 9, 2008 at 1:20 PM, Justin Ferguson <report@bugs.python.org> wrote:
..
>  Do I need to create proof of concepts for each of these bugs, or can I
>  reference this ticket?
>

It would be best if you could provide a patch complete with additional
unit tests that fail in (or crash) unpatched python and demonstrate
the bugs.  Since the fixes are likely to be one-line changes, I would
say there is no need to create separate issues.  Just post a patch
here for а review.

From your other post, I understand that you are doing a security audit
of the python codebase.  Is this a manual effort (identifying unsafe
constructs and searching for them) or you use some kind of automation?
msg65255 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-04-09 18:49
agreed, the assert in PyString_FromStringAndSize() should be changed to
a non-debug test.
msg65260 - (view) Author: Justin Ferguson (jnferguson) Date: 2008-04-09 19:08
Okay, so I'm not sure whose point of view takes precedence here?

Also, to answer the question asked, I'm doing this manually.
msg65261 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-09 19:40
On Wed, Apr 9, 2008 at 3:08 PM, Justin Ferguson <report@bugs.python.org> wrote:

>  Okay, so I'm not sure whose point of view takes precedence here?

I don't have a strong view on this, but just a few points to consider:

1. If you change PyString_FromStringAndSize, you should change
PyBytes_FromStringAndSize and PyUnicode_FromStringAndSize for
consistency.

2. Non-debug check should probably set a ValueError exception and
return NULL.  Think what kind of message you can generate inside
*_FromStringAndSize that would be helpful to the end user.  The caller
is likely to be able to provide a much better diagnostic.

3. Maybe instead of setting a ValueError, a SystemError should be
raised.  In this case the calle will still be clearly responsible for
the pre-condition, but programming errors will not present security
concerns.

4. You have clearly identified several bugs that are not covered by
unittests.  If you fix them en-mass by making
PyString_FromStringAndSize more forgiving, we will miss an opportunity
to improve the test coverage. (If you take the SystemError approach,
this does not apply.)
msg65263 - (view) Author: Justin Ferguson (jnferguson) Date: 2008-04-09 20:01
> 1. If you change PyString_FromStringAndSize, you should change
> PyBytes_FromStringAndSize and PyUnicode_FromStringAndSize for
> consistency.

Okay, if we don't patch the API call itself, I will look at the users of
those two API calls as well, I haven't gone through much of the stuff in
Objects/ yet, I focused on the modules obviously and thus didn't know
about those calls.

> 4. You have clearly identified several bugs that are not covered by
> unittests.  If you fix them en-mass by making
> PyString_FromStringAndSize more forgiving, we will miss an opportunity
> to improve the test coverage. (If you take the SystemError approach,
> this does not apply.)

This is a valid and good point-- I don't mind either way
msg65274 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-04-09 22:13
I think I like the SystemError approach for the same reasons.  It still
exposes the caller's bug but no longer does bad things.
msg65276 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-04-09 22:15
Cool. Let's make this a SystemError (in both debug builds and non-debug
build). Greg, go ahead and fix it.
msg65279 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-04-09 23:24
python trunk r62261 adds the checks and SystemError.

patches to cleanup modules that inadvertently allow negative values to
be passed into *_FromStringAndSize would still be appreciated.
msg65281 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-04-09 23:43
release25-maint r62262

I'm closing this one.  Please open additional issues with patches and/or
pointers to bad callers of *_FromStringAndSize that let negative values
through.

[*sigh* i wish we didn't use a signed size type all over the place]
msg65282 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-04-10 00:18
>  [*sigh* i wish we didn't use a signed size type all over the place]

What would you use for error returns then?
msg65304 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-10 17:52
On Wed, Apr 9, 2008 at 8:18 PM, Guido van Rossum <report@bugs.python.org> wrote:

> >  [*sigh* i wish we didn't use a signed size type all over the place]
>
>  What would you use for error returns then?
>

res == (size_t)-1  or res == 0 && PyErr_Occured()
msg65305 - (view) Author: Justin Ferguson (jnferguson) Date: 2008-04-10 18:41
The use of signed integers in Python is (imho) the single largest threat
to the security of the interpreter. I'm probably preaching to the choir
there though. 

I really dislike have to return values and indicate error in the return
value, its really unclean and causes weirdness, for instance if you
follow PyArg_ParseTuple() down and you have an integer (or maybe it was
long?) argument, you can't actually get a value of -1 because one of the
string->int conversion routines uses that to indicate failure. The check
wrapped around it was something along the lines of:

if (-1 == retval && PyErr_Occurred())

In turn down the line somewhere (I didn't follow the code path), this
value got converted to 1, so for instance doing
__import__('zlib').decompressobj().flush(-1) wouldn't trigger that bug,
it would flush 1 byte, not UINT_MAX as I had expected
msg65306 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-04-10 18:47
Greg, any reason why you kept the assert()?  IMO having the ValueError
even in debug mode is better.
msg65311 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-04-10 19:27
The asserts can go.  I left them in figuring a crashing interpreter on a
debug build in someones sandbox or on a buildbox would get more
attention than a SystemError being raised.  I doubt that is a worthy
assumption on my part.

Both a crash and a SystemError are notable events.

shall I get rid of the asserts?

As for why i dislike signed size types... tons of reasons:
 * It wastes half the range of the integer.
 * It leads to security bugs.
 * on return values -1 and < 0 tests may be convenient to type but they 
   could just as easily compare to a known value defined as a constant; 
   all the things alexander belopolsky suggested.
 * sizes being passed -in- to a function never need to be negative
   meaning safe code requires extra checks like these.
 * sign extension of values going between registers of different sizes
   is needlessly expensive on some cpu architectures.  use unsigned
   types whenever possible for the best code.

anyways, I figure the Python C API is already set in stone using the
signed types so its too late to "fix" it without causing headaches
around the world.
msg65312 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-04-10 19:48
Please do get rid of the asserts.

Regarding signed sizes, it's a historic accident (copied from Unix
system calls which tended to use plain ints or longs in the olden days),
but it may not be impossible to fix -- we've fixed other things
throughout the code (e.g. switching from int to Py_ssize_t).  It'll be a
monumental task though that's probably best tackled post-3.0.
msg65314 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-04-10 19:54
asserts nuked.  r62271 and r62272
msg65315 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-10 20:08
On Thu, Apr 10, 2008 at 3:48 PM, Guido van Rossum
<report@bugs.python.org> wrote:

Maybe as the first step we could get rid of the size sign abuse in
long integer objects.  I would suggest reserving a bit in the first
(or last) digit to carry the sign.
msg66983 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-05-17 07:12
the redundant asserts were removed.  reclosing.

further discussion on signedness issues in the code base should take
place on the mailing list, future bugs or peps as appropriate.
msg85970 - (view) Author: Petr Splichal (psss) Date: 2009-04-14 14:06
Justin, is there any reproducer available for this issue?
Thanks!
msg86807 - (view) Author: Jan Lieskovsky (iankko) Date: 2009-04-29 15:06
Hello guys,

  if I didn't overlook something pretty obvious, this should work
with python-2.6, but it crashes.

Could you please have a look?

Thanks, Jan.
--
Jan iankko Lieskovsky
History
Date User Action Args
2022-04-11 14:56:33adminsetgithub: 46839
2009-04-29 15:06:17iankkosetfiles: + fromStringAndSize.c
nosy: + iankko
messages: + msg86807

2009-04-14 14:06:26pssssetnosy: + psss
messages: + msg85970
2008-05-17 07:12:38gregory.p.smithsetstatus: open -> closed
messages: + msg66983
2008-04-10 20:08:43belopolskysetmessages: + msg65315
2008-04-10 19:54:57gregory.p.smithsetmessages: + msg65314
2008-04-10 19:48:05gvanrossumsetstatus: closed -> open
messages: + msg65312
2008-04-10 19:27:57gregory.p.smithsetmessages: + msg65311
2008-04-10 18:47:38gvanrossumsetmessages: + msg65306
2008-04-10 18:41:34jnfergusonsetmessages: + msg65305
2008-04-10 18:08:18chmod007setnosy: + chmod007
2008-04-10 17:52:25belopolskysetmessages: + msg65304
2008-04-10 00:18:04gvanrossumsetmessages: + msg65282
2008-04-09 23:43:09gregory.p.smithsetstatus: open -> closed
resolution: fixed
messages: + msg65281
versions: - Python 2.5
2008-04-09 23:24:59gregory.p.smithsetmessages: + msg65279
2008-04-09 22:45:29gregory.p.smithsetfiles: - unnamed
2008-04-09 22:45:20gregory.p.smithsetassignee: gregory.p.smith
2008-04-09 22:15:30gvanrossumsetnosy: + gvanrossum
messages: + msg65276
2008-04-09 22:13:05gregory.p.smithsetfiles: + unnamed
messages: + msg65274
2008-04-09 20:01:33jnfergusonsetmessages: + msg65263
2008-04-09 19:40:03belopolskysetmessages: + msg65261
2008-04-09 19:08:13jnfergusonsetmessages: + msg65260
2008-04-09 18:49:17gregory.p.smithsetpriority: high
versions: + Python 2.6
2008-04-09 18:49:02gregory.p.smithsetnosy: + gregory.p.smith
messages: + msg65255
2008-04-09 18:31:07belopolskysetmessages: + msg65250
2008-04-09 17:20:50jnfergusonsetmessages: + msg65243
2008-04-09 17:14:03jnfergusonsetmessages: + msg65241
2008-04-09 15:04:21belopolskysetnosy: + belopolsky
messages: + msg65237
2008-04-08 21:11:22belopolskysettitle: PyString_FromStringAndSize() to be considered unsane -> PyString_FromStringAndSize() to be considered unsafe
2008-04-08 17:06:29jnfergusonsetmessages: + msg65197
2008-04-08 16:27:21jnfergusonsetfiles: + python-2.5.2-zlib-unflush-misallocation.py
messages: + msg65187
2008-04-08 16:23:12jnfergusonsetmessages: + msg65185
2008-04-08 16:17:12amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg65181
2008-04-08 15:49:04jnfergusoncreate