This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author Guido
Recipients Guido
Date 2014-10-14.18:35:14
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1413311714.85.0.507321102423.issue22633@psf.upfronthosting.co.za>
In-reply-to
Content
Python/frozenmain.c:27 - https://hg.python.org/cpython/file/424fbf011176/Python/frozenmain.c#l27

Memory is allocated for sizeof(wchar_t*) * argc bytes. If argc is 0 (which is a possibility, see below), then 0 bytes are attempted to allocate.

Note that PyMem_RawMalloc typically calls _PyMem_RawMalloc, which ensures that a nonzero value is passed to malloc: https://hg.python.org/cpython/file/424fbf011176/Objects/obmalloc.c#l60

In the case of argc == 1, we have the guarantee that one byte is allocated.


Then, this: https://hg.python.org/cpython/file/424fbf011176/Python/frozenmain.c#l54 routine fills the argv_copy array with values. However, if argc == 0, this code is never reached.

https://hg.python.org/cpython/file/424fbf011176/Python/frozenmain.c#l71 then sets the program name to argv_copy[0] using Py_SetProgramName().

The issue here is is that because argv_copy[0] may be uninitialized, it may be a nonzero value, because, as far as I know, malloc doesn't give any guarantees as to the initial values of the allocated values (hence the existence of something like calloc).

If a pointer to a zero byte is passed to Py_SetProgramName(), the function doesn't change progname: https://hg.python.org/cpython/file/424fbf011176/Python/pythonrun.c#l884

But since there are no guarantees as to what argv_copy[0] is AND there are no guarantees about the memory region that follows, a rare and unlikely though theoretically possible situation may emerge where each time progname is referenced (for example indirectly by reading to sys.executable), a string is returned that contains bytes after argv_copy[0], resulting in a memory disclosure.

Here's an example of how to run a program with zero arguments (argc == 0):

// https://stackoverflow.com/questions/8113786/executing-a-process-with-argc-0

#include <spawn.h>
#include <stdlib.h>

int main(int argc, char** argv, char** envp)
{
        pid_t pid;
        char* zero_argv[] = {NULL};
        posix_spawn(&pid, "./hello", NULL, NULL, zero_argv, envp);

        int status;
        waitpid(&pid, &status, NULL);
        return 0;
}

I propose the following patch:

--- frozenmain.c	2014-10-14 19:56:27.144705062 +0200
+++ new_frozenmain.c	2014-10-14 19:59:16.800704366 +0200
@@ -24,13 +24,15 @@
     /* We need a second copies, as Python might modify the first one. */
     wchar_t **argv_copy2 = NULL;
 
-    argv_copy = PyMem_RawMalloc(sizeof(wchar_t*) * argc);
+    argv_copy = PyMem_RawMalloc(sizeof(wchar_t*) * (argc ? argc : 1));
     argv_copy2 = PyMem_RawMalloc(sizeof(wchar_t*) * argc);
     if (!argv_copy || !argv_copy2) {
         fprintf(stderr, "out of memory\n");
         goto error;
     }
 
+    argv_copy[0] = '\0';
+
     Py_FrozenFlag = 1; /* Suppress errors from getpath.c */
 
     if ((p = Py_GETENV("PYTHONINSPECT")) && *p != '\0')

By enforcing a minimal allocation of 1 byte in this file, we are guaranteed that malloc doesn't return a non-zero value after it is called with malloc(0) (this is possible, see man malloc) and we don't have to rely on the heap allocator to do this (in case it's not _PyMem_RawMalloc).

Setting argv_copy[0] to zero ensures a buffer overread will never occur.

Tested only for Python 3.4.

Guido
History
Date User Action Args
2014-10-14 18:35:14Guidosetrecipients: + Guido
2014-10-14 18:35:14Guidosetmessageid: <1413311714.85.0.507321102423.issue22633@psf.upfronthosting.co.za>
2014-10-14 18:35:14Guidolinkissue22633 messages
2014-10-14 18:35:14Guidocreate