classification
Title: Fix C aliasing issue in Python/dtoa.c to use strict aliasing on Clang 4.0
Type: performance Stage: resolved
Components: Build Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, dim, eric.smith, haypo, mark.dickinson, serhiy.storchaka
Priority: normal Keywords:

Created on 2017-04-21 09:28 by haypo, last changed 2017-05-02 14:11 by haypo. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 1340 haypo, 2017-04-28 12:46
Messages (25)
msg292018 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 09:28
My change 28205b203a4742c40080b4a2b4b2dcd800716edc added -fno-strict-aliasing on clang to fix the compilation of Python/dtoa.c on clang 4.0. But it's only a temporary workaround until dtoa.c is fixed to respect C99 strict aliasing.

Strict aliasing allows the compiler to enable more optimization, and so should make Python a little bit faster. It would only fix a regression, before my change Python was already build with strict aliasing 

More info about the issue:

* bpo-30104
* https://bugs.llvm.org//show_bug.cgi?id=31928
msg292020 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 09:41
I hope that you understand aliasing issues, because I don't undertand them well :-D So I started to create my collection of links on the topic:
https://haypo-notes.readthedocs.io/misc.html#c-aliasing

The most interesting link is:
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
msg292031 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-21 12:26
dtoa.c *does* (arguably) respect C99+TC3 strict aliasing. The union trick has long been used as a safe and supported way to do this kind of thing, though it appears there's still disagreement about exactly what the standard intends.

I'd strongly prefer not to modify dtoa.c here unless we really have to.
msg292034 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-21 12:31
In case anyone wants to have a go at interpreting the standard, the most immediately relevant part is the footnote to C99+TC3 6.5.2.3p3:

"""
If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.
"""

Trap representations could potentially cause undefined behaviour, but are highly unlikely to occur in practice for us for this particular case, so I don't think they can possibly serve as the excuse for clang's behaviour.

The "reinterpreted as ..." language here strongly suggests that the union trick should be supported.

OTOH, this is just a footnote, so not normative ...
msg292037 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 12:42
> I'd strongly prefer not to modify dtoa.c here unless we really have to.

See the clang bug report:
https://bugs.llvm.org//show_bug.cgi?id=31928
msg292038 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-21 12:46
And the basic rule that bans aliasing in the first place is C99 6.5p7:
"""
An object shall have its stored value accessed only by an lvalue expression that has one of the following types ...
"""
But I'd argue that the "an aggregate or union type" subclause of 6.5p7 covers this case.
msg292039 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-21 12:48
> See the clang bug report:

Yes, I've read it. And I think the Clang folks are wrong in their interpretation of the standard. And even if they're not, they're going to break a lot of code with this change: the union trick has been widely accepted as a valid way to do things.
msg292040 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 12:59
> Yes, I've read it. And I think the Clang folks are wrong in their interpretation of the standard. And even if they're not, they're going to break a lot of code with this change: the union trick has been widely accepted as a valid way to do things.

If Clang decides or not to handle union will decide if Python has or has no to change dtoa.c :-) (I would prefer to keep -fstrict-aliasing).
msg292047 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-21 14:21
> (I would prefer to keep -fstrict-aliasing).

Updating dtoa.c would be a large and error-prone task. It may be simpler to adjust the buildsystem so that we can specify -fno-strict-aliasing just for dtoa.c.
msg292048 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-21 14:23
> Updating dtoa.c would be a large and error-prone task.

It would also take us even further away from the upstream sources, making it harder to integrate bugfixes from upstream.
msg292054 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2017-04-21 15:15
I agree we shouldn't do anything heroic to "fix" dtoa.c. I'd wait and see if Gay (or other maintainers) will chose an approach if Clang keeps this behavior.

At most, I think Mark's idea to use -fno-strict-aliasing only on dtoa.c and nowhere else would be the better approach.
msg292203 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-24 07:51
@haypo: for your strict-aliasing notes collection, I highly recommend the recent paper "Detecting Strict Aliasing Violations" by P. Cuoq et. al.

http://trust-in-soft.com/wp-content/uploads/2017/01/vmcai.pdf
msg292204 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-24 08:17
Could we use Clang specific pragma in dtoa.c rather than a compiler option?
msg292211 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-24 09:03
> Could we use Clang specific pragma in dtoa.c rather than a compiler option?

If we decide to go for the -fno-strict-aliasing only for dtoa.c, I suggest to use it also for GCC. GCC might decide to also optimize dtoa.c further in the future. I don't think that the flag has a major impact on performance if it's restricted to dtoa.c, and it would simplify the build system to only have "per compiler" flags. (Ex: Does ICC also "miscompile" dtoa?)

FreeBSD uses the following syntax to only add the flag on a specific C file. Does it work with GNU and BSD make? (is it a "portable" syntax?)

   CFLAGS.gdtoa_${src}+=-fno-strict-aliasing

See https://svnweb.freebsd.org/changeset/base/313706 (linked from http://bugs.python.org/issue30104#msg292001).
msg292223 - (view) Author: Dimitry Andric (dim) * Date: 2017-04-24 11:25
Note that gcc has documented accessing union members in this way as an "implementation defined" feature:
https://gcc.gnu.org/onlinedocs/gcc/Structures-unions-enumerations-and-bit-fields-implementation.html#Structures-unions-enumerations-and-bit-fields-implementation

They specifically mention an example that is very similar to the dtoa pattern as being allowed under *their* implementation, here:
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Type-punning

However, whether that means that it is generally allowed by all standards is unfortunately still rather vague.  I have seen more than one complaint that the standards committees should make this very explicit, instead of weaseling it into post-release footnotes.

That said, I'd like to reply to some other posters here.

Eric V. Smith writes:
> At most, I think Mark's idea to use -fno-strict-aliasing only on dtoa.c and nowhere else would be the better approach.

Indeed, this is exactly the solution I chose for FreeBSD.  Just the one file that needs it is compiled with -fno-strict-aliasing.  The caveat is that this might not work when link-time optimization is in effect.

Serhiy Storchaka writes:
> Could we use Clang specific pragma in dtoa.c rather than a compiler option?

Unfortunately, as far as I know, clang still does not support function-level optimization pragmas.  Maybe it was implemented recently, but then you would still have to have a workaround for older versions.

STINNER Victor writes:
> If we decide to go for the -fno-strict-aliasing only for dtoa.c, I suggest to use it also for GCC. GCC might decide to also optimize dtoa.c further in the future.

Theoretically they could, but as I pointed out above, they explicitly document this as a feature of their union implementation.  I estimate the probability of them dropping this in the future to be near zero.

> I don't think that the flag has a major impact on performance if it's restricted to dtoa.c, and it would simplify the build system to only have "per compiler" flags. (Ex: Does ICC also "miscompile" dtoa?)

Disabling strict aliasing for just that file, or even just the one function it applies to (by splitting it off to a separate file, for instance) should not result in different assembly output, unless the compiler is very old and/or dumb.

> FreeBSD uses the following syntax to only add the flag on a specific C file. Does it work with GNU and BSD make? (is it a "portable" syntax?)
> 
>    CFLAGS.gdtoa_${src}+=-fno-strict-aliasing

No, this is specifically a feature of BSD's bsd.sys.mk infrastructure.  It's most likely not compatible with GNU make.
msg292224 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2017-04-24 11:34
> Unfortunately, as far as I know, clang still does not support
> function-level optimization pragmas.  Maybe it was implemented
> recently, but then you would still have to have a workaround>
> for older versions.

I realize the answer is probably "no", but I'll ask anyway. Do you know if they support whole-module optimization pragmas? That would still be better than mucking with makefiles to get a per-module flag.
msg292242 - (view) Author: Dimitry Andric (dim) * Date: 2017-04-24 22:45
There is a "#pragma clang optimize", but it only has the options "on" or "off", for specific source locations.  I guess that would defeat the purpose a little bit. :(

As an experiment, and to show what would be needed (at minimum), I have attempted to make Python/dtoa.c completely aliasing-safe here:

https://github.com/DimitryAndric/cpython/commit/29c3f6f5cd771fce5630f127b9e7054593e3160c

This allowed to remove the -fno-strict-aliasing flag again, and it succeeded all tests, even with clang 5.0.0.

It basically replaces the direct union member accesses with getters and setters, which do the right thing for clang.  Note that even though those getters and setters use memcpy(), this is actually completely optimized away in the resulting assembly.  (Old compilers might not fare that well, though.)

In any case, while these are mostly mechanical changes, it is still a lot of code churn, and it should really be reviewed by the original maintainer of dtoa, David M. Gay.  I have no idea whether he is still active, though...
msg292243 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-24 22:55
> As an experiment, and to show what would be needed (at minimum), I have attempted to make Python/dtoa.c completely aliasing-safe here:
> https://github.com/DimitryAndric/cpython/commit/29c3f6f5cd771fce5630f127b9e7054593e3160c

Would it be technically possible to completely get ride of the union? For example, it would allow to replace:

   set_dval(rv, get_dval(rv) + sulp(rv, bc));
with:
   rv += sulp(rv, bc);

Is it possible to replace Big0 and Big1 with Big double?

etc.
msg292280 - (view) Author: Dimitry Andric (dim) * Date: 2017-04-25 21:03
STINNER Victor writes:
> Would it be technically possible to completely get ride of the union? 

Probably, it's just more work, and it has to be done pretty carefully to avoid regressions.  It seems Python already does some exercising of these dtoa functions in its test suite, but ideally you would want to check against upstream's full tests, if those exist.

There are probably many edge cases for a set of tricky functions like this...
msg292281 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-25 21:21
It seems to me that the purpose of using unions was avoiding aliasing issues. But something went wrong.
msg292412 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-27 09:32
> It seems Python already does some exercising of these dtoa functions in its test suite, but ideally you would want to check against upstream's full tests, if those exist.

They barely do: Python's tests for dtoa.c are much more comprehensive than the upstream tests. Or at least they were at the time when I was adapting dtoa.c for use in Python and communicating with Gay about issues, and I doubt that that situation has changed.
msg292415 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-27 10:08
Mark Dickinson: "It would also take us even further away from the upstream sources, making it harder to integrate bugfixes from upstream."

We have two main options:

* Use -fno-strict-aliasing on clang (solution currently used), maybe restrict the option to dtoa.c
* Avoid union to avoid any risk of aliasing issue: option experimented by dim

According to Mark, rewriting the code without union is not only more risky but would also be a major shift from upstream. I disagree with it's so risky, there is a risk yes, but we can take our timeto review it and test it on many platforms with the Python extensive test suite. For example, the aliasing issue on clang 4 was catched quicky on our FreeBSD CURRENT buildbot.

But the current blocker point is upstream: Mark doesn't want diverge from upstream, so Mark: can you (or someone else?) please contact "Gay (or other maintainers)" to take a decision with him/them?
msg292418 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-27 10:43
Victor: I don't think that's necessary. We simply need to add -fno-strict-aliasing for this file.
msg292525 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-28 11:05
I would like to fix FreeBSD CURRENT buildbots of Python 2.7, 3.5 and 3.6, so here my attempt to restrict the -fno-strict-aliasing option to the dtoa.c file: https://github.com/python/cpython/pull/1340

I chose to add the flag for any C compiler. If you think that a C compiler may not support that option, we can start by only adding the option on clang, and maybe add it on other C compilers if someone complains.
msg292761 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-05-02 14:11
dtoa.c is now compiled with -fno-string-aliasing (for any clang version, not only clang 4.0) on Python 3.5, 3.6 and 3.7.

It was decided to not touch dtoa.c to not diverge from upstream.

Thanks Dimitry Andric and Mark Dickinson for your reviews and support in this cryptic issue!

Note 1: We consider that Clang 4.0 is wrong, whereas GCC respects the C99 standard for aliasing on unions. But I don't think that it matters much who is wrong or not :-)

Note 2: Python 2.7 already used -fno-strict-aliasing on all .c files because of its design of PyObject structures, fixed in Python 3.
History
Date User Action Args
2017-05-02 14:11:13hayposetstatus: open -> closed
resolution: fixed
messages: + msg292761

stage: resolved
2017-04-28 12:46:56hayposetpull_requests: + pull_request1454
2017-04-28 11:05:07hayposetmessages: + msg292525
2017-04-27 10:43:33mark.dickinsonsetmessages: + msg292418
2017-04-27 10:08:29hayposetmessages: + msg292415
2017-04-27 09:32:24mark.dickinsonsetmessages: + msg292412
2017-04-25 21:21:02serhiy.storchakasetmessages: + msg292281
2017-04-25 21:03:32dimsetmessages: + msg292280
2017-04-24 22:55:44hayposetmessages: + msg292243
2017-04-24 22:45:32dimsetmessages: + msg292242
2017-04-24 11:34:08eric.smithsetmessages: + msg292224
2017-04-24 11:25:36dimsetnosy: + dim
messages: + msg292223
2017-04-24 09:03:54hayposetmessages: + msg292211
2017-04-24 08:17:33serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg292204
2017-04-24 07:51:36mark.dickinsonsetmessages: + msg292203
2017-04-21 15:15:24eric.smithsetmessages: + msg292054
2017-04-21 14:23:00mark.dickinsonsetmessages: + msg292048
2017-04-21 14:21:42mark.dickinsonsetmessages: + msg292047
2017-04-21 12:59:46hayposetmessages: + msg292040
2017-04-21 12:48:10mark.dickinsonsetmessages: + msg292039
2017-04-21 12:46:54mark.dickinsonsetmessages: + msg292038
2017-04-21 12:42:07hayposetmessages: + msg292037
2017-04-21 12:31:52mark.dickinsonsetnosy: + eric.smith
2017-04-21 12:31:32mark.dickinsonsetmessages: + msg292034
2017-04-21 12:26:13mark.dickinsonsetmessages: + msg292031
2017-04-21 09:41:44hayposetmessages: + msg292020
2017-04-21 09:28:55haypocreate