Issue24917
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2015-08-22 23:39 by JohnLeitch, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
time_strftime_Buffer_Over-read.patch | JohnLeitch, 2015-08-22 23:39 | patch | ||
time_strftime_Buffer_Over-read.py | JohnLeitch, 2015-08-22 23:40 | repro | ||
time_strftime_Buffer_Over-read_v2.patch | JohnLeitch, 2015-09-05 01:29 | patch | ||
time_strftime_Buffer_Over-read_v3.patch | JohnLeitch, 2015-09-05 02:16 | |||
24917_4.patch | steve.dower, 2015-09-06 19:50 |
Messages (67) | |||
---|---|---|---|
msg248998 - (view) | Author: John Leitch (JohnLeitch) * | Date: 2015-08-22 23:39 | |
Python 3.5 suffers from a vulnerability caused by the behavior of the time_strftime() function. When called, the function loops over the format string provided, using strchr to search for each instance of '%'. After finding a '%', it continues to search two characters ahead, assuming that each instance is the beginning of a well formed format string token. However, if a string ends with '%', this logic will result in a call to strchr that reads off the end of the format string buffer: /* check that the format string contains only valid directives */ for(outbuf = strchr(fmt, '%'); <<<< Assuming fmt ends with a '%', this will return a pointer to the end of the string. outbuf != NULL; outbuf = strchr(outbuf+2, '%')) <<<< Once outbuf is pointing to the end of the string, outbuf+2 skips { past the null terimnator, leading to a buffer over-read. if (outbuf[1]=='#') ++outbuf; /* not documented by python, */ if ((outbuf[1] == 'y') && buf.tm_year < 0) { PyErr_SetString(PyExc_ValueError, "format %y requires year >= 1900 on Windows"); Py_DECREF(format); return NULL; } } In some applications, it may be possible to exploit this behavior to disclose the contents of adjacent memory. The buffer over-read can be observed by running the following script: from time import * strftime("AA%"*0x10000) Which, depending on the arrangement of memory, may produce an exception such as this: 0:000> g (20b8.18d4): Access violation - code c0000005 (first chance) First chance exceptions are reported before any exception handling. This exception may be expected and handled. eax=ffffffff ebx=52c1a6a0 ecx=00000000 edx=08ef3000 esi=08ec2fe8 edi=08ec2ff8 eip=52d254f3 esp=004cf9d4 ebp=004cfa58 iopl=0 nv up ei pl nz na pe nc cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206 python35!strchr+0x33: 52d254f3 f30f6f0a movdqu xmm1,xmmword ptr [edx] ds:002b:08ef3000=???????????????????????????????? 0:000> db edx-0x10 08ef2ff0 41 25 41 41 25 41 41 25-00 d0 d0 d0 d0 d0 d0 d0 A%AA%AA%........ 08ef3000 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 08ef3010 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 08ef3020 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 08ef3030 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 08ef3040 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 08ef3050 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 08ef3060 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 0:000> k5 ChildEBP RetAddr 004cf9d0 52c1a7f6 python35!strchr+0x33 [f:\dd\vctools\crt\vcruntime\src\string\i386\strchr_sse.inc @ 75] 004cfa58 52c832d3 python35!time_strftime+0x156 [c:\build\cpython\modules\timemodule.c @ 615] 004cfa74 52ce442f python35!PyCFunction_Call+0x113 [c:\build\cpython\objects\methodobject.c @ 109] 004cfaa8 52ce18ec python35!call_function+0x2ff [c:\build\cpython\python\ceval.c @ 4651] 004cfb20 52ce339f python35!PyEval_EvalFrameEx+0x232c [c:\build\cpython\python\ceval.c @ 3184] To fix this issue, it is recommended that time_strftime() be updated to check outputbuf[1] for null in the body of the format string directive validation loop. A proposed patch is attached. Credit: John Leitch (johnleitch@outlook.com), Bryce Darling (darlingbryce@gmail.com) |
|||
msg249800 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2015-09-04 18:15 | |
Is there a CERT report associated with this vulnerability? |
|||
msg249801 - (view) | Author: John Leitch (JohnLeitch) * | Date: 2015-09-04 18:18 | |
Currently, no. Would you like us to report this and future vulnerabilities to CERT? |
|||
msg249805 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2015-09-04 18:32 | |
This is happening in windows-only code, so not being a windows user I cannot reproduce it. It does not look like something new. This code was last touched in #10653. Can someone confirm that only 3.5 is affected? |
|||
msg249806 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2015-09-04 18:34 | |
> Would you like us to report this and future vulnerabilities to CERT? If it affects released versions, I think this is the right thing to do. Note that I don't know what PSF policy is on this is or whether they even have one. |
|||
msg249857 - (view) | Author: Mark Lawrence (BreamoreBoy) * | Date: 2015-09-04 23:08 | |
I have tried the reproducer on Windows 10 with 2.6, 2.7, 3.3, 3.4, 3.5 and 3.6. In every case I got this. C:\py -3.4 -c"from time import *;strftime('AA%'*0x10000)" Traceback (most recent call last): File "<string>", line 1, in <module> ValueError: Invalid format string |
|||
msg249860 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2015-09-04 23:12 | |
> In every case I got [ValueError] You shouldn't have. I am no Windows expert, but I suspect something is wrong with your use of the command line. Please try it at the Python prompt or put the code in a script. |
|||
msg249861 - (view) | Author: John Leitch (JohnLeitch) * | Date: 2015-09-04 23:23 | |
> I have tried the reproducer on Windows 10 with 2.6, 2.7, 3.3, 3.4, 3.5 and 3.6. In every case I got this. What you are observing is due to the arrangement and contents of process memory. With a simple repro (such as the one provided), there's a good chance the null terminator of the format string will be followed by more null bytes, and thus the code will appear to work as intended. In more complex scripts where memory is ultimately reused, it's more likely that the null terminator will be followed by garbage, non-null bytes. To make the issue reproduce more reliably, use GFlags to enable heap tail checking, heap free checking, and page heap. https://msdn.microsoft.com/en-us/library/windows/hardware/ff549557(v=vs.85).aspx Then, when you repro the issue, you'll see the crash because the uninitialized memory will contain the fill pattern 0xd0 rather than 0x00, like this: 0:000> db edx-0x10 08ef2ff0 41 25 41 41 25 41 41 25-00 d0 d0 d0 d0 d0 d0 d0 A%AA%AA%........ 08ef3000 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 08ef3010 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 08ef3020 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 08ef3030 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 08ef3040 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 08ef3050 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? 08ef3060 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ???????????????? To be clear, heap verification is not a requirement--the bug can indeed be reproduced without it. However, it will make life easier by introducing more determinism. |
|||
msg249862 - (view) | Author: Mark Lawrence (BreamoreBoy) * | Date: 2015-09-04 23:26 | |
Python prompt or script makes no difference. C:\Users\Mark\Desktop>py -2.6 time_strftime_Buffer_Over-read.py Traceback (most recent call last): File "time_strftime_Buffer_Over-read.py", line 2, in <module> strftime("AA%"*0x10000) ValueError: Invalid format string Python 3.5.0rc2 (v3.5.0rc2:cc15d736d860, Aug 25 2015, 05:08:37) [MSC v.1900 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> from time import * >>> strftime("AA%"*0x10000) Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: Invalid format string |
|||
msg249863 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2015-09-04 23:27 | |
Sorry for a bad guess. John's advise makes much more sense than mine. |
|||
msg249864 - (view) | Author: Mark Lawrence (BreamoreBoy) * | Date: 2015-09-04 23:34 | |
Okay, so presumably this applies to all supported versions and not just 3.5? Can you also remove the reproducer and supply specific instructions on how to build the code so that the problem can be reproduced. |
|||
msg249867 - (view) | Author: John Leitch (JohnLeitch) * | Date: 2015-09-04 23:47 | |
It very well may apply to versions apart from 3.5. Our test environment is quite complex and unfriendly to working with multiple versions of Python. Plus, we're strapped for time, so we tend to file under the version we're currently targeting and defer to those better equipped and more familiar with the codebase. The repro file is fine and there's no need to rebuild anything; the heap verification I'm suggesting is applied to binaries, not source. You can find information here: https://msdn.microsoft.com/en-us/library/windows/hardware/ff549557(v=vs.85).aspx Install the debugging tools, run GFlags, select the python.exe/pythonw.exe image file, and apply the settings I mentioned in my previous post. |
|||
msg249868 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2015-09-04 23:58 | |
John, There is no doubt that this is a bona fide bug. I was just hoping that someone with a Windows development machine would help figuring out the affected versions. From the hg history, it looks like the faulty code is present in all versions starting from 3.2: https://hg.python.org/cpython/annotate/977c5753ca32/Modules/timemodule.c#l515 I will set the issue versions accordingly. |
|||
msg249870 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2015-09-05 00:03 | |
I will keep the "security" classification, but will not increase the priority. Accepting format strings from untrusted sources is a vulnerability in and by itself and in most cases those strings are literals. |
|||
msg249872 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2015-09-05 00:10 | |
FWIW, the patch looks good to me, but it needs to be reviewed by a Windows developer. |
|||
msg249874 - (view) | Author: John Leitch (JohnLeitch) * | Date: 2015-09-05 00:41 | |
When I get a bit of slackspace (probably tomorrow afternoon/evening) I can test on the spectrum of versions to confirm the issue is in >= 3.2. I'll also look into improving our automation so all future reports can have the appropriate versions flagged. Regarding untrusted format strings, I believe you are mistaken. In native applications, untrusted format strings are problematic because an attacker can use injected tokens to read/write arbitrary memory, which can be leveraged to attain code execution. However, in the context of Python, a format string with too many tokens should be handled safely, resulting in a Python exception rather than exploitable memory corruption. This is the behavior observed in format string handling throughout Python (and indeed most managed/scripting languages). Yes, in most Python programs format strings will be constants, and using dynamically constructed format strings may be considered a bad practice. But, should a developer choose to pass a dynamically constructed string (for example, functionality that allows untrusted users to specify custom time formatting), it's not unreasonable for them to expect memory safety to be maintained. Of course, if there's a risk I'm overlooking I'd like to better understand it, and the relevant Python documentation should be updated. |
|||
msg249876 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2015-09-05 00:57 | |
As far as I know, the practical consequence of "security" classification for the issue is how many affected older versions will be patched. I am keeping that and the 3.2 - 3.5 versions range. The priority may affect whether this will make it into 3.5.0. I'll defer this decision to the release managers. If you want an authoritative answer on the impact of this issue - report it to CERT or FIRST and get them compute the CVSS score. |
|||
msg249877 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-05 01:06 | |
Yes, I'd like this fix in 3.5.0. One bit of feedback on the patch: outbuf is a char * (or wchar_t *), therefore outbuf[1] is a char (or wchar_t). You shouldn't compare it to NULL. I'm not sure we still support any compilers that define NULL as (void *)0, but we might. Anyway, please instead compare to '\0'. |
|||
msg249878 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2015-09-05 01:12 | |
> if there's a risk I'm overlooking I'd like to better understand it, > and the relevant Python documentation should be updated. I don't think there is any special risk that you are overlooking other than a documented fact that Python's strftime is a thin layer on top of system strftime and these are notoriously buggy on many systems. A python application that accepts custom formats from users should limit those formats to a set that is known to work on the targeted platforms. Relying on strftime to properly return an error code and not do anything nasty is probably not a good idea. This said, I express no opinion on the severity of this bug. |
|||
msg249881 - (view) | Author: John Leitch (JohnLeitch) * | Date: 2015-09-05 01:29 | |
Attached is a revised patch. |
|||
msg249883 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2015-09-05 01:49 | |
Hmm, on Mac OSX "%" and "A%" are valid format strings: >>> time.strftime("%") '%' >>> time.strftime("A%") 'A%' Mark's experiments show that on Windows they are not. What about the other platforms affected by the patch? I am concerned about the bottom part of the patch. A nitpick: an existing error message is "Invalid format string". I would either reuse it or make it "Incomplete format string". |
|||
msg249884 - (view) | Author: John Leitch (JohnLeitch) * | Date: 2015-09-05 02:16 | |
I plucked the error message from the % operator: >>> '%' % 'foo' Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: incomplete format >>> '%z' % 'foo' Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: unsupported format character 'z' (0x7a) at index 1 That said, it should be more consistent. I went with the latter suggestion since it's more informative. An updated patch is attached. As for platform compatibility, I'd certainly feel better if someone could test the Sun/AIX changes. Unfortunately, I'm unable to do so. It's interesting that OSX accepts "%". As per the spec, "%%" is a "%" literal, and the behavior of invalid tokens such as "%" is undefined. |
|||
msg249899 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2015-09-05 09:55 | |
On 05.09.2015 03:49, Alexander Belopolsky wrote: > > Alexander Belopolsky added the comment: > > Hmm, on Mac OSX "%" and "A%" are valid format strings: > >>>> time.strftime("%") > '%' >>>> time.strftime("A%") > 'A%' > > Mark's experiments show that on Windows they are not. What about the other platforms affected by the patch? I am concerned about the bottom part of the patch. Trailing '%' are valid on Linux, just as using unsupported format codes (those as passed through as is). On Windows, the C lib strftime() segfaults with a trailing '%', just as it does with unsupported format codes. I have this code in mxDateTime to check for this: static int _mxDateTime_CheckWindowsStrftime(char *fmt, struct tm *tm) { char *p; /* Range checks */ Py_Assert(tm->tm_sec < 60, PyExc_ValueError, ".strftime() cannot format leap seconds on Windows"); Py_Assert(tm->tm_mday <= 31, PyExc_ValueError, ".strftime() cannot format days > 31 on Windows"); /* Scan format string for invalid codes */ for (p = fmt; *p != '\0'; p++) { register char code; if (*p != '%') continue; code = *++p; /* Check for supported format codes; see https://msdn.microsoft.com/en-us/library/fe06s4ak.aspx */ switch (code) { case 'a': case 'A': case 'b': case 'B': case 'c': case 'd': case 'H': case 'I': case 'i': case 'm': case 'M': case 'p': case 'S': case 'U': case 'w': case 'W': case 'x': case 'X': case 'y': case 'Y': case 'z': case 'Z': case '%': continue; case '\0': Py_Error(PyExc_ValueError, "format string may not end with a '%' character " "on Windows"); default: Py_ErrorWithArg(PyExc_ValueError, "format code '%c' not supported on Windows", code); } } return 0; onError: return -1; } |
|||
msg249904 - (view) | Author: Eryk Sun (eryksun) * | Date: 2015-09-05 11:41 | |
> On Windows, the C lib strftime() segfaults with a trailing '%', > just as it does with unsupported format codes. No, it doesn't segfault; it invokes the invalid parameter handler (IPH). This could always be configured for the whole process, but with VC 14 it can also be configured per thread. I think it was Steve Dower who updated time_strftime to take advantage of this feature in 3.5, using the new macros _Py_BEGIN_SUPPRESS_IPH and _Py_END_SUPPRESS_IPH. This allowed removing the following check that used to be in the code prior to 3.5: if (outbuf[1]=='\0' || !strchr("aAbBcdHIjmMpSUwWxXyYzZ%", outbuf[1])) { PyErr_SetString(PyExc_ValueError, "Invalid format string"); Py_DECREF(format); return NULL; } With this change the time module no longer has to make assumptions about what the CRT considers to be a valid format string in order to avoid invoking the IPH. However, this also accidentally removed checking whether outbuf[1]=='\0'. Now, instead of calling the default IPH that terminates the process, Python's _silent_invalid_parameter_handler gets called, which of course does nothing by design: >>> import time >>> time.strftime('A%') Breakpoint 0 hit python35!_silent_invalid_parameter_handler: 00000000`5eadae50 c20000 ret 0 0:000> k8 Child-SP RetAddr Call Site 00000000`002af018 000007fe`e9d8d2ab python35!_silent_invalid_parameter_handler 00000000`002af020 000007fe`e9d8d2c9 ucrtbase!invalid_parameter+0x103 00000000`002af060 000007fe`e9dafedc ucrtbase!invalid_parameter_noinfo+0x9 00000000`002af0a0 000007fe`e9dac359 ucrtbase!Wcsftime_l+0x168 00000000`002af140 000007fe`e9dac3f5 ucrtbase!Strftime_l+0x155 00000000`002af1d0 00000000`5e9fc785 ucrtbase!strftime+0x15 00000000`002af210 00000000`5ea7d5c2 python35!time_strftime+0x1f5 00000000`002af2b0 00000000`5eaf8fd0 python35!PyCFunction_Call+0x122 |
|||
msg249905 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-05 11:51 | |
Rather than debating about how various platforms handle malformed format strings for strftime(), and whether or not they crash, we should simply prevent the native strftime() function from seeing them in the first place. I'd like the "v3" patch in 3.5.0rc3. Can someone create a pull request? |
|||
msg249908 - (view) | Author: Eryk Sun (eryksun) * | Date: 2015-09-05 12:27 | |
> Rather than debating about how various platforms handle malformed > format strings for strftime(), and whether or not they crash, we > should simply prevent the native strftime() function from seeing > them in the first place. Of course the check needs to be restored. I wasn't suggesting to go back to the default IPH to have it terminate the process. That won't help, anyway. The potential over-read is in the for loop, before calling strftime. I just thought the context for the accidental removal was relevant. Care needs to be taken when removing checks that were put in place to avoid the default IPH, because those checks may serve a dual purpose in the surrounding code. The patch also fixes the AIX/Sun code, which needs to be applied to 3.4 as well. I don't think this issue applies to 3.2 and 3.3. |
|||
msg249919 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-05 19:07 | |
I'll do the PR. |
|||
msg249920 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-05 19:18 | |
> Rather than debating about how various platforms handle malformed > format strings for strftime(), and whether or not they crash, we > should simply prevent the native strftime() function from seeing > them in the first place. The crash is nothing to do with the native strftime() function - Python was crashing because it *tried* to prevent malformed strings and we (I) got it wrong. If we were simply passing the string through unchecked this would not have been an issue (or it would have been an issue against the CRT). PR at https://bitbucket.org/larry/cpython350/pull-requests/18/issue-24917-time_strftime-buffer-over-read/diff |
|||
msg249950 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-06 00:13 | |
Pull request accepted. Please forward-merge. Thanks! |
|||
msg249956 - (view) | Author: Roundup Robot (python-dev) | Date: 2015-09-06 04:02 | |
New changeset 09b62202d9b7 by Steve Dower in branch '3.5': Issue #24917: time_strftime() Buffer Over-read. Patch by John Leitch. https://hg.python.org/cpython/rev/09b62202d9b7 New changeset 8b81b7ad2d0a by Steve Dower in branch '3.5': Issue #24917: Moves NEWS entry under Library. https://hg.python.org/cpython/rev/8b81b7ad2d0a New changeset a29b49d57769 by Steve Dower in branch '3.4': Issue #24917: time_strftime() Buffer Over-read. Patch by John Leitch. https://hg.python.org/cpython/rev/a29b49d57769 New changeset 6fc744ac3953 by Steve Dower in branch 'default': Issue #24917: time_strftime() Buffer Over-read. Patch by John Leitch. https://hg.python.org/cpython/rev/6fc744ac3953 |
|||
msg249958 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-06 04:08 | |
I applied the AIX fix to 3.4 (introduced in #19634). Python 3.2 and 3.3 aren't affected. |
|||
msg249960 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-06 04:49 | |
The tests from this patch fail on Linux. ----- First: There is no trailing % test on Linux, and glibc's strftime() happily ignores a trailing %, so no ValueError is raised. Python should do either one or the other of the following: 1) Python should enforce no trailing % in the strftime format string, or 2) the test suite shouldn't assume that a trailing % in the strftime value string raises a ValueError. I can live with either of these, not sure what the right decision is. ----- Second: The test from the patch assumes that strftime('%#') will raise a ValueError. Again, strftime in Linux glibc happily accepts "%#" as a format string and thus no ValueError is raised. Python is agnostic about native format units in the strftime() format string. Therefore I strongly assert that Python must not assume that "%#" is an illegal format string. Therefore the tests must not assume that "%#" raises ValueError. Given that the code used to crash, I do want the code path exercised in the test suite. So I propose that the test attempt time.strftime('%#') and accept either success or ValueError. ----- Given that I've accepted this patch into 3.5.0, and it's now blocking my release, it is implicitly a "release blocker". I need to resolve this tonight before I can tag 3.5.0rc3. I'm going to dinner, and maybe we can have a quick discussion and come to a decision in the next hour or two. p.s. The checkin also flunked PEP 7. *sigh* |
|||
msg249961 - (view) | Author: John Leitch (JohnLeitch) * | Date: 2015-09-06 05:01 | |
Is there a way to see what style guidelines have been violated? The only thing I can think of is the curly braces in the Windows check, but I was following the conventions of the surrounding code. |
|||
msg249962 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-06 05:10 | |
The starting curly brace goes on the same line as the statement starting the block. Keywords followed by a left parenthesis get a space between the keyword and the parenthesis. It's a small matter, I'm really much more interested in reconciling the behavior of strftime and its tests. |
|||
msg249963 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-06 05:24 | |
Sorry, maybe you inherited those violations. I was in a hurry and not in a charitable frame of mind. |
|||
msg249964 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-06 05:26 | |
I'll fix the PEP 7 stuff (mostly inherited). MSVC can also handle '%#' and '%' as format strings (producing '' in both cases). If that matches libc behaviour, I see no reason to raise a ValueError here apart from consistency with previous Python releases. If consistency with system C libraries is preferred, then I'll change the condition to simply break out of the loop and make the test call the function expecting success. I assume there's an AIX buildbot we can watch to see the impact on that part? |
|||
msg249965 - (view) | Author: John Leitch (JohnLeitch) * | Date: 2015-09-06 05:30 | |
Yikes--your comment prompted me to look at the check-in, and it seems my patch wasn't properly applied. The curly braces got tweaked, which is minor as you stated, but more importantly the AIX code should not decref format. That could introduce problems bigger than what this patch was attempting to fix. And, not to dwell, but where do you see a keyword immediately followed by a left parens? I want to make sure everything is properly polished in the future, and the only thing I see is the untouched "for". Regarding your initial concerns: 1) I think we should enforce no trailing % so as to not pass format strings that may cause undefined behavior. 2) How about expecting ValueError on Windows/AIX, and pass on all other platforms? |
|||
msg249966 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-06 05:34 | |
Whoops, must have done a bad copy-paste to get that DECREF in there (I couldn't apply the patch directly because it didn't come from an HG repo, so I had to do it by hand). MSVC seems somewhat inconsistent about its response: >>> strftime('aaa%') '' >>> strftime('aaaa%') Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: Invalid format string How closely do we want to simply expose the underlying C library's behaviour? |
|||
msg249967 - (view) | Author: John Leitch (JohnLeitch) * | Date: 2015-09-06 05:49 | |
If it's so wildly inconsistent, it's my opinion that Python should perform its own validation to achieve better cross-platform support. The alternative is playing a never ending game of whack-a-mole, or just accepting that format strings may cause exceptions in some platforms but not others. |
|||
msg249969 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-06 06:04 | |
Having now read over this whole issue, I don't actually see where the security vulnerability is (on Windows at least). This is a user-mode read, so it can only access memory in the same process, and it doesn't display it anywhere. The worst that can happen is that it hits an unreadable page and crashes, which falls under "undefined behaviour due to invalid input". I think we should just revert the patch completely. |
|||
msg249970 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-06 06:05 | |
I have convinced myself that disallowing trailing % marks on all platforms is the right thing to do. Whether or not the underlying platform tolerates it, it is never *valid* input to strftime. I have a patch incubating at home. I'll put it up for review when I get back, maybe an hour. |
|||
msg249971 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-06 06:08 | |
I'll get my commits out of the way to make the buildbots green again. |
|||
msg249972 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-06 06:10 | |
Oh, maybe it was all like that before. Sorry, I was in a hurry and not in a charitable frame of mind. |
|||
msg249973 - (view) | Author: Roundup Robot (python-dev) | Date: 2015-09-06 06:14 | |
New changeset 73227e857d6b by Steve Dower in branch '3.5': Issue #24917: Backed out changeset 09b62202d9b7 https://hg.python.org/cpython/rev/73227e857d6b New changeset 743924064dc8 by Steve Dower in branch 'default': Issue #24917: Backed out changeset 09b62202d9b7 https://hg.python.org/cpython/rev/743924064dc8 |
|||
msg249974 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-06 06:18 | |
Yeah, I came in late and in too much of a rush too :( Backed out the broken changes, and now I'm off to bed. Not convinced there's anything that needs fixing here, and IIRC we argued through the different platform behaviours on the issue when I removed the null check that was preventing the overread. If we do anything now, it should just be `if (outbuf[1] == '\0') break` to avoid the overread. strftime() is probably too far gone to make it consistent across platforms now. |
|||
msg249975 - (view) | Author: John Leitch (JohnLeitch) * | Date: 2015-09-06 06:34 | |
Yes, this is a user-mode read, but I disagree with the assertion that it's not possible to use this to disclose memory. While it isn't as critical as something that outright dumps memory, there is logic that throws exceptions based on values encountered while reading outside the bounds of the buffer. This could be used as a channel to infer what is or isn't in adjacent memory. That it's user-mode doesn't matter--if an application exposes the format string as attack surface, suddenly process memory can be probed. So, it's not heartbleed, but it does have security implications. If you'd like, I can take a shot at building a PoC. Further, it's best to err on the side of caution with bugs like these; just because it doesn't seem like major issue now doesn't mean someone won't come along in the future and prove otherwise. |
|||
msg249976 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-06 06:54 | |
I'm going to back this out of 3.5.0rc3. I'm still willing to discuss accepting it into 3.5.0 final in some form, but for now I don't want to hold up the release. Steve: it should never be possible to crash the Python interpreter with well-formed Python code. Whether or not it's possible to exploit it, I do think this crash is something the interpreter should prevent. |
|||
msg249977 - (view) | Author: Eryk Sun (eryksun) * | Date: 2015-09-06 06:57 | |
> MSVC seems somewhat inconsistent about its response: > >>> strftime('aaa%') > '' That's not due to MSVC. It's setting errno to EINVAL. The problem is that time_strftime is testing (buflen > 0 || i >= 256 * fmtlen). The initial value of the outbuf size i is 1024, so when (fmtlen <= 4), the value of (256 * fmtlen) is less than or equal to i, and it assumes "the format yields an empty result" without considering the value of errno. So instead of raising an exception for EINVAL, it calls PyUnicode_DecodeLocaleAndSize to return an empty string: >>> strftime('aaa%') Breakpoint 1 hit ucrtbase!strftime: 000007fe`e2bac3e0 4883ec38 sub rsp,38h 0:000> gu python35!time_strftime+0x1f5: 00000000`5de9c785 488bcb mov rcx,rbx 0:000> be 0; g Breakpoint 0 hit ucrtbase!_errno: 000007fe`e2b341b0 48895c2408 mov qword ptr [rsp+8],rbx ss:00000000`0028f320=ffffffffffffffff errno is 22: 0:000> pt; dd @rax l1 00000000`002ddb50 00000016 0:000> bd 0; g Breakpoint 2 hit python35!PyUnicode_DecodeLocaleAndSize: 00000000`5df55070 4053 push rbx 0:000> k 2 Child-SP RetAddr Call Site 00000000`0028f318 00000000`5de9c81a python35!PyUnicode_DecodeLocaleAndSize 00000000`0028f320 00000000`5df1d5c2 python35!time_strftime+0x28a 0:000> g '' |
|||
msg249994 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-06 13:54 | |
Larry: agreed, no crashing. We'll get the trivial fix in that doesn't raise any error, and should probably stop suppressing the exception in the situation eryksun describes. That will correctly expose the platform's strftime semantics. Unless you want to write more thorough validation of incoming strings, I'll work something up later today or tomorrow. |
|||
msg250014 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-06 19:50 | |
New patch attached that just breaks out of the scanning loop and lets the system CRT handle invalid format strings. Fixes the condition that was suppressing some errors on Windows. Also fixed the PEP 7 issues around the changed code (I believe). No new test because the crash cannot be reliably reproduced and the expected results vary dramatically between platforms. The two if/break lines in the AIX branch should also be applied to 3.4 to prevent crashes. The rest of the fix (including the PEP 7 fixes) should not. |
|||
msg250020 - (view) | Author: John Leitch (JohnLeitch) * | Date: 2015-09-06 20:59 | |
First, let me begin by saying I believe this patch will fix the buffer over-read, which is a good step forward. However, after giving the matter more thought, and at the risk of wearing out my welcome, I am of the belief that relying on the CRT to handle malformed format strings is the wrong approach. As per the C spec, strftime's behavior when handling invalid format strings is undefined: "If a conversion specifier is not one of the above, the behavior is undefined" Quite often, "undefined" translates to "exploitable". And at the very least, by not performing thorough enough validation, Python is misusing strftime(), which may lead to crashes or undermine memory safety. Of course, this is all speculation--I haven't the time or resource to learn other platforms to see what's possible. But, even if I could, the task would be Sisyphean because there's simply no way to know what the future holds when dealing with implementation that could change at any point. I realize we must be pragmatic with matters such as this, and a dramatic change could be breaking for some Python apps. Even so, I feel it's worth vocalizing these concerns. As a principal, I think that "safe", well-formed Python should never be able to perform operations that lead to undefined behavior in the underlying runtime. Alright, rant done. If at any point in time locking down Python's strftime with more aggressive validation is considered viable, I am more than willing to take a shot at submitting a patch. |
|||
msg250022 - (view) | Author: Alexander Belopolsky (belopolsky) * | Date: 2015-09-06 21:19 | |
Historically, time and os modules have been considered low level, thin wrappers over system libc functions. Users of these modules are the proverbial "consenting adults" who should understand their power and the associated risks. The place to provide a better behaved strftime function is the datetime module, but I would leave time.stftime pretty much the way it is and would only consider making *fewer* checks on Windows if _Py_BEGIN_SUPPRESS_IPH / _Py_END_SUPPRESS_IPH macros can indeed help to avoid crashes on illegal format strings. |
|||
msg250023 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-06 21:55 | |
FWIW, those macros prevent a certain class of "undefined behavior," which in this case is to terminate the process rather than return an error code. They don't do anything to prevent crashes due to exploitation or malicious code. The place for more robust formatting is datetime.__format__(), which from a quick test seems to be more strict about what is in the format string. |
|||
msg250024 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-06 22:04 | |
Having slept on it, I agree with Steve. We should make the minimal change necessary in order to not crash. However, it still needs a regression test. The test can use JohnLeitch's proposed test as a good starting point, but it must accept either success or ValueError as passing the test. I still think there's merit in detecting a trailing unquoted % at the end of the format string, but that would only be appropriate for 3.6 at this point. |
|||
msg250030 - (view) | Author: Eryk Sun (eryksun) * | Date: 2015-09-06 22:45 | |
With MSVC, if errno is cleared before calling strftime, then when buflen == 0 you know whether it's an invalid format string (EINVAL) or maxsize is too small (ERANGE). There's no need to guess. Oddly, only EINVAL is documented, even though setting ERANGE has been in the implementation for years. VC 10 (strftime.c): /* error - return an empty string */ *(strstart)='\0'; /* now return our error/insufficient buffer indication */ if ( !failed && left <= 0 ) { /* do not report this as an error to allow the caller to resize */ errno=ERANGE; } else { _VALIDATE_RETURN( FALSE, EINVAL, 0); } VC 14 (time/wcsftime.cpp): // Error: return an empty string: *string = L'\0'; // Now return our error/insufficient buffer indication: if (!failed && remaining <= 0) { // Do not report this as an error to allow the caller to resize: errno = ERANGE; } else { _VALIDATE_RETURN(false, EINVAL, 0); } |
|||
msg250031 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-06 23:14 | |
How's this for the test (I added an explanatory comment, since it may look like it isn't testing anything to someone who isn't familiar with the issue): def test_strftime_format_check(self): # Test that strftime does not crash on invalid format strings # that may trigger a buffer overread. When not triggered, # strftime may succeed or raise ValueError depending on # the platform. for x in [ '', 'A', '%A', '%AA' ]: for y in range(0x0, 0x10): for z in [ '%', 'A%', 'AA%', '%A%', 'A%A%', '%#' ]: try: time.strftime(x * y + z) except ValueError: pass |
|||
msg250035 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-07 02:24 | |
New PR: https://bitbucket.org/larry/cpython350/pull-requests/19/issue-24917-time_strftime-buffer-over-read |
|||
msg250039 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-07 02:55 | |
Steve, did you confirm that the test triggers the array bounds bug when the patch *isn't* applied? I want to confirm both that a) the test exercises the bug, and b) the fix fixes the bug. I assume you ran the test suite with the patch applied, so that covers b). |
|||
msg250040 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-07 03:06 | |
I wasn't able to repro the crash at all, even with the debugging flags that make this sort of issue more prominent. It relies on a very precise layout of multiple objects in memory, or possibly a specific sequence of allocations/deallocations, as well as a format string ending in an unescaped '%' or (on Windows) '%#'. It's still obviously an issue though - we should have the check for '\0' there by any reasonably analysis of the code, or else should not be adding 2 to the pointer to start the next step of the search. |
|||
msg250041 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-07 03:26 | |
Okay, I think *I* reproduced it. 1) I pulled your cpython350 fork down locally. 2) I updated to your checkin that fixed the bug. (c31dad22c80d) 3) I reverted the change to Modules/timemodule.c to put the bug back: % hg cat -r 97393 Modules/timemodule.c > Modules/timemodule.c 4) I changed line 611 (or so) from "#if defined(MS_WINDOWS) && !defined(HAVE_WCSFTIME)" to "#if 1" so I'd get the code that had the bug. 5) I ran "./configure --with-valgrind && make" to make it. 6) I ran "valgrind ./python -m test test_time" and ***Valgrind complained about an array overrun***. 7) I restored the bugfix to Modules/timemodule.c, then reinstated the change from 4) above. 8) I ran make and valgrind again and didn't get the complaint about the array overrun. For grins I also tried enabling the other stanza of code that has the bug (the AIX / sun / have_wcsftime) and observed the same thing. Is that convincing? |
|||
msg250042 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-07 03:36 | |
Good enough for me. |
|||
msg250062 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-07 05:12 | |
Backout pull request merged, please forward-merge, thanks! |
|||
msg250064 - (view) | Author: Larry Hastings (larry) * | Date: 2015-09-07 05:13 | |
(I meant, just normal pull request. I did your two pull requests right in a row and got my wires crossed.) |
|||
msg250071 - (view) | Author: Roundup Robot (python-dev) | Date: 2015-09-07 05:38 | |
New changeset c31dad22c80d by Steve Dower in branch '3.5': Issue #24917: time_strftime() buffer over-read. https://hg.python.org/cpython/rev/c31dad22c80d New changeset f185917498ca by Steve Dower in branch '3.4': Issue #24917: time_strftime() buffer over-read. https://hg.python.org/cpython/rev/f185917498ca |
|||
msg250202 - (view) | Author: STINNER Victor (vstinner) * | Date: 2015-09-08 13:49 | |
The change c31dad22c80d introduced a regression: issue #25029. |
|||
msg250203 - (view) | Author: STINNER Victor (vstinner) * | Date: 2015-09-08 13:52 | |
Oh, tracemalloc sees the memory peak of 8 GB too: $ ./python -X tracemalloc -i -m test -v test_strptime ... SystemExit: True >>> import tracemalloc; tracemalloc.get_traced_memory()[1] / 1024.**2 8201.658247947693 Memory peak: 8201.7 MB (8.2 GB!). |
|||
msg250213 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-09-08 15:15 | |
I'll fix it on #25029. This thread is already too long for my liking. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:20 | admin | set | github: 69105 |
2015-09-08 15:15:28 | steve.dower | set | status: open -> closed superseder: MemoryError in test_strptime resolution: fixed messages: + msg250213 |
2015-09-08 13:52:21 | vstinner | set | messages: + msg250203 |
2015-09-08 13:49:20 | vstinner | set | status: closed -> open resolution: fixed -> (no value) messages: + msg250202 |
2015-09-07 05:38:04 | python-dev | set | messages: + msg250071 |
2015-09-07 05:34:07 | larry | set | status: open -> closed resolution: fixed stage: commit review -> resolved |
2015-09-07 05:13:10 | larry | set | messages: + msg250064 |
2015-09-07 05:12:00 | larry | set | messages: + msg250062 |
2015-09-07 03:36:33 | steve.dower | set | messages: + msg250042 |
2015-09-07 03:26:12 | larry | set | messages: + msg250041 |
2015-09-07 03:06:03 | steve.dower | set | messages: + msg250040 |
2015-09-07 02:55:38 | larry | set | messages: + msg250039 |
2015-09-07 02:24:33 | steve.dower | set | messages: + msg250035 |
2015-09-06 23:14:22 | steve.dower | set | messages: + msg250031 |
2015-09-06 22:45:38 | eryksun | set | messages: + msg250030 |
2015-09-06 22:04:04 | larry | set | messages: + msg250024 |
2015-09-06 21:55:24 | steve.dower | set | messages: + msg250023 |
2015-09-06 21:19:11 | belopolsky | set | messages: + msg250022 |
2015-09-06 20:59:26 | JohnLeitch | set | messages: + msg250020 |
2015-09-06 19:50:41 | steve.dower | set | files:
+ 24917_4.patch messages: + msg250014 versions: - Python 3.2, Python 3.3 |
2015-09-06 13:54:41 | steve.dower | set | messages: + msg249994 |
2015-09-06 07:52:32 | larry | set | priority: release blocker -> deferred blocker |
2015-09-06 06:57:19 | eryksun | set | messages: + msg249977 |
2015-09-06 06:54:28 | larry | set | messages: + msg249976 |
2015-09-06 06:34:54 | JohnLeitch | set | messages: + msg249975 |
2015-09-06 06:18:38 | steve.dower | set | messages: + msg249974 |
2015-09-06 06:14:45 | python-dev | set | messages: + msg249973 |
2015-09-06 06:10:27 | larry | set | messages: + msg249972 |
2015-09-06 06:08:36 | steve.dower | set | messages: + msg249971 |
2015-09-06 06:05:40 | larry | set | messages: + msg249970 |
2015-09-06 06:04:11 | steve.dower | set | messages: + msg249969 |
2015-09-06 05:49:35 | JohnLeitch | set | messages: + msg249967 |
2015-09-06 05:34:36 | steve.dower | set | messages: + msg249966 |
2015-09-06 05:30:57 | JohnLeitch | set | messages: + msg249965 |
2015-09-06 05:26:00 | steve.dower | set | status: closed -> open priority: normal -> release blocker nosy: + georg.brandl messages: + msg249964 resolution: fixed -> (no value) |
2015-09-06 05:24:48 | larry | set | messages: + msg249963 |
2015-09-06 05:10:29 | larry | set | messages: + msg249962 |
2015-09-06 05:01:34 | JohnLeitch | set | messages: + msg249961 |
2015-09-06 04:49:19 | larry | set | messages: + msg249960 |
2015-09-06 04:08:38 | steve.dower | set | status: open -> closed messages: + msg249958 assignee: steve.dower resolution: fixed stage: patch review -> commit review |
2015-09-06 04:02:09 | python-dev | set | nosy:
+ python-dev messages: + msg249956 |
2015-09-06 00:13:43 | larry | set | messages: + msg249950 |
2015-09-05 19:18:07 | steve.dower | set | messages: + msg249920 |
2015-09-05 19:07:43 | steve.dower | set | messages: + msg249919 |
2015-09-05 12:27:49 | eryksun | set | messages: + msg249908 |
2015-09-05 11:51:21 | larry | set | messages: + msg249905 |
2015-09-05 11:41:17 | eryksun | set | nosy:
+ eryksun messages: + msg249904 |
2015-09-05 09:55:32 | lemburg | set | messages: + msg249899 |
2015-09-05 02:16:17 | JohnLeitch | set | files:
+ time_strftime_Buffer_Over-read_v3.patch messages: + msg249884 |
2015-09-05 01:49:29 | belopolsky | set | messages: + msg249883 |
2015-09-05 01:29:11 | JohnLeitch | set | files:
+ time_strftime_Buffer_Over-read_v2.patch messages: + msg249881 |
2015-09-05 01:12:56 | belopolsky | set | messages: + msg249878 |
2015-09-05 01:06:03 | larry | set | messages: + msg249877 |
2015-09-05 00:57:47 | belopolsky | set | nosy:
+ larry messages: + msg249876 |
2015-09-05 00:41:34 | JohnLeitch | set | messages: + msg249874 |
2015-09-05 00:10:11 | belopolsky | set | messages:
+ msg249872 components: + Extension Modules |
2015-09-05 00:03:56 | belopolsky | set | messages: + msg249870 |
2015-09-04 23:58:16 | belopolsky | set | messages:
+ msg249868 versions: + Python 3.2, Python 3.3, Python 3.4 |
2015-09-04 23:47:22 | JohnLeitch | set | messages: + msg249867 |
2015-09-04 23:34:11 | BreamoreBoy | set | nosy:
+ steve.dower, paul.moore, tim.golden, zach.ware messages: + msg249864 components: + Windows |
2015-09-04 23:27:19 | belopolsky | set | messages: + msg249863 |
2015-09-04 23:26:20 | BreamoreBoy | set | messages: + msg249862 |
2015-09-04 23:23:29 | JohnLeitch | set | messages: + msg249861 |
2015-09-04 23:12:49 | belopolsky | set | messages: + msg249860 |
2015-09-04 23:08:04 | BreamoreBoy | set | nosy:
+ BreamoreBoy messages: + msg249857 |
2015-09-04 18:34:06 | belopolsky | set | messages: + msg249806 |
2015-09-04 18:32:02 | belopolsky | set | nosy:
+ vstinner messages: + msg249805 |
2015-09-04 18:18:29 | JohnLeitch | set | messages: + msg249801 |
2015-09-04 18:18:13 | belopolsky | set | stage: patch review |
2015-09-04 18:15:44 | belopolsky | set | messages: + msg249800 |
2015-09-04 17:43:26 | JohnLeitch | set | nosy:
+ lemburg, belopolsky |
2015-08-23 22:52:36 | brycedarling | set | nosy:
+ brycedarling |
2015-08-22 23:40:44 | JohnLeitch | set | files: + time_strftime_Buffer_Over-read.py |
2015-08-22 23:39:19 | JohnLeitch | create |