msg163473 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-06-22 20:02 |
In unicodeobject.c and stringlib aligned addresses and sizes are used for optimization. pointer->integer and implicit integer->integer conversions may overflow or underflow on platforms with sizeof(size_t) != sizeof(void *) or sizeof(size_t) != sizeof(int). The proposed patch fixes such unsafe things in unicodeobject.c, stringlib and some other files.
There are still a few unsafe places in libffi, but in this library Py_uintptr_t nor uintptr_t are not available.
|
msg164799 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2012-07-07 10:01 |
If we're worrying about undefined behaviour, it looks like recent optimizations have *introduced* new undefined behaviour in the form of strict aliasing violations. E.g., from ascii_decode:
unsigned long value = *(const unsigned long *) _p;
(here _p has type const char *). This should really be fixed; compilers are known to make optimizations based on the assumption that this sort of undefined behaviour doesn't occur.
|
msg164800 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-07-07 10:08 |
> This should really be fixed; compilers are known to make optimizations
> based on the assumption that this sort of undefined behaviour doesn't
> occur.
Doesn't the compiler have all the necessary information here? It's not like a subroutine is called.
(also, I'm not sure what optimization it could actually make)
|
msg164801 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2012-07-07 10:11 |
> (also, I'm not sure what optimization it could actually make)
Me neither, but it doesn't seem safe to assume that no compiler will take advantage of this. I don't want to start guessing what compilers might or might not do; it would be much better simply to stick to valid C where possible.
|
msg164804 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2012-07-07 10:14 |
N.B. This could probably be fixed without affecting performance by using the usual union trick. (IIUC, that trick was technically still undefined behaviour for a while, but was eventually made legal by C99 + TC3.) As far as I know there aren't any instances of compilers causing problems with that construct.
|
msg164805 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-07-07 10:16 |
> N.B. This could probably be fixed without affecting performance by
> using the usual union trick.
How would it work? We would have to add various unions to the
PyUnicode_Object definition?
|
msg164809 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2012-07-07 10:24 |
> How would it work? We would have to add various unions to the
> PyUnicode_Object definition?
No, you'd just need a temporary union defined in unicodeobject.c that would look something like:
typedef union { unsigned long v; char s[SIZEOF_LONG]; } U;
(with better choices of names). Python/dtoa.c does a similar thing to read / write the pieces of a C double using integers safely.
|
msg164810 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2012-07-07 10:26 |
I'll see if I can come up with a patch, and open a new issue for it (since I've successfully derailed this issue from its original topic)
|
msg164818 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-07-07 10:49 |
> If we're worrying about undefined behaviour, it looks like recent optimizations have *introduced* new undefined behaviour in the form of strict aliasing violations. E.g., from ascii_decode:
>
> unsigned long value = *(const unsigned long *) _p;
>
> (here _p has type const char *).
I don't see what the undefined behavior. Can you specify exactly the
item of the Standard, in which it is mentioned?
> This should really be fixed; compilers are known to make optimizations based on the assumption that this sort of undefined behaviour doesn't occur.
I don't know how else you can rewrite it, without destroying completely
the effect of optimization.
In any case, I don't think that the original patch introduces some new
undefined behavior.
|
msg164821 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2012-07-07 11:09 |
> I don't see what the undefined behavior. Can you specify exactly the
> item of the Standard, in which it is mentioned?
It's C99 section 6.5, paragraph 7: "An object shall have its stored value accessed only by an lvalue expression that has one of the following types ...". It's the dereferencing of the pointer that's the problem: that's accessing a stored value of type 'char' by an lvalue expression that has type 'unsigned long'.
> In any case, I don't think that the original patch introduces some new
> undefined behavior.
Ah; were the strict aliasing problems already there before the patch? I didn't check.
|
msg167480 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-08-05 10:39 |
> Ah; were the strict aliasing problems already there before the patch? I didn't check.
Please open separate issue for this.
|
msg170568 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-09-16 18:47 |
Please, Mark or some other C expert, can you do a code review for the patch?
|
msg170620 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-09-17 19:09 |
Perhaps the three new macros should be made available in a .h file?
(in which case they should be private, i.e. start with _).
|
msg170623 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-09-17 20:09 |
> Perhaps the three new macros should be made available in a .h file?
> (in which case they should be private, i.e. start with _).
Perhaps. There are similar macros in other files (Include/objimpl.h,
Objects/obmalloc.c, Python/pyarena.c, Modules/expat/xmlparse.c).
|
msg170684 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2012-09-18 22:27 |
> Perhaps the three new macros should be made available in a .h file?
Good idea. Maybe pymacros.h? These macros need to be documented (with a comment in the .h file)
|
msg170809 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-09-20 12:35 |
Well, here is a new patch. The five new macros moved to pymacros.h and used in more files.
|
msg170811 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2012-09-20 12:52 |
Apologies; I got distracted from the main point of this issue with the strict aliasing stuff, and then it fell off the to-do list. Unassigning; Antoine or Victor, do you want to take this?
|
msg170817 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-09-20 15:14 |
Mark, please open a new discussion, so we don't lose it and that was the place for discussion.
|
msg170828 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-09-20 18:51 |
Thanks for the patch! These macros will be useful.
|
msg170830 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-09-20 18:57 |
New changeset 99112b851b25 by Antoine Pitrou in branch 'default':
Issue #15144: Fix possible integer overflow when handling pointers as integer values, by using Py_uintptr_t instead of size_t.
http://hg.python.org/cpython/rev/99112b851b25
|
msg170831 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-09-20 18:58 |
Committed in 3.3(.1).
|
msg170842 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2012-09-20 20:17 |
> Mark, please open a new discussion.
Done: issue 15992.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:31 | admin | set | github: 59349 |
2012-09-28 11:45:51 | jcea | set | nosy:
+ jcea
|
2012-09-20 20:17:26 | mark.dickinson | set | messages:
+ msg170842 |
2012-09-20 18:58:50 | pitrou | set | status: open -> closed versions:
+ Python 3.3, - Python 3.4 messages:
+ msg170831
resolution: fixed stage: patch review -> resolved |
2012-09-20 18:57:45 | python-dev | set | nosy:
+ python-dev messages:
+ msg170830
|
2012-09-20 18:51:51 | pitrou | set | messages:
+ msg170828 |
2012-09-20 15:14:30 | serhiy.storchaka | set | messages:
+ msg170817 |
2012-09-20 12:52:12 | mark.dickinson | set | assignee: mark.dickinson -> messages:
+ msg170811 |
2012-09-20 12:35:14 | serhiy.storchaka | set | files:
+ align_operations2.patch
messages:
+ msg170809 |
2012-09-18 22:27:41 | vstinner | set | nosy:
+ vstinner messages:
+ msg170684
|
2012-09-17 20:09:53 | serhiy.storchaka | set | messages:
+ msg170623 |
2012-09-17 19:09:26 | pitrou | set | type: security -> behavior messages:
+ msg170620 versions:
+ Python 3.4, - Python 3.3 |
2012-09-16 18:47:02 | serhiy.storchaka | set | messages:
+ msg170568 |
2012-08-05 10:40:15 | serhiy.storchaka | set | keywords:
+ needs review priority: normal -> low |
2012-08-05 10:39:16 | serhiy.storchaka | set | messages:
+ msg167480 stage: patch review |
2012-07-07 11:09:28 | mark.dickinson | set | messages:
+ msg164821 |
2012-07-07 10:49:58 | serhiy.storchaka | set | messages:
+ msg164818 |
2012-07-07 10:26:13 | mark.dickinson | set | messages:
+ msg164810 |
2012-07-07 10:24:48 | mark.dickinson | set | messages:
+ msg164809 |
2012-07-07 10:16:10 | pitrou | set | messages:
+ msg164805 |
2012-07-07 10:14:25 | mark.dickinson | set | messages:
+ msg164804 |
2012-07-07 10:11:33 | mark.dickinson | set | messages:
+ msg164801 |
2012-07-07 10:08:07 | pitrou | set | nosy:
+ pitrou messages:
+ msg164800
|
2012-07-07 10:01:18 | mark.dickinson | set | messages:
+ msg164799 |
2012-06-23 12:05:25 | mark.dickinson | set | assignee: mark.dickinson |
2012-06-22 20:03:22 | mark.dickinson | set | nosy:
+ mark.dickinson
|
2012-06-22 20:02:48 | serhiy.storchaka | create | |