classification
Title: Strict aliasing violations in Objects/unicodeobject.c
Type: behavior
Stage:
Components: Interpreter Core
Versions: Python 3.4, Python 3.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: christian.heimes, jcea, mark.dickinson, pconnell, serhiy.storchaka
Priority: normal
Keywords:

Created on 2012-09-20 20:16 by mark.dickinson, last changed 2014-08-03 20:41 by BreamoreBoy.

Messages (1)
msg170841 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2012-09-20 20:16
[Broken out of the discussion in issue 15144]

Some of the newly optimized code in Objects/unicodeobject.c contains strict aliasing violations; under the C standard, this is undefined behaviour (C99 6.5p7).

An example occurs in ascii_decode:

    unsigned long value = *(const unsigned long *) _p;

Here the dereference reads character data through an lvalue of type unsigned long, which violates the strict aliasing rule.

I think these portions of Objects/unicodeobject.c should be rewritten to avoid the undefined behaviour.
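
One possible shape for such a rewrite, as a minimal sketch rather than a proposed patch: copy the bytes into a local unsigned long with memcpy instead of dereferencing a cast pointer. The helper names load_ulong and word_has_non_ascii are hypothetical and do not exist in CPython. Compilers commonly reduce a fixed-size memcpy like this to a single load, but that is an assumption to check on each target.

```c
#include <string.h>

/* Hypothetical helper (not part of CPython): load one unsigned long from a
   byte buffer. memcpy copies the object representation, so no object is
   accessed through an lvalue of an incompatible type, and the source
   pointer does not need to be aligned. */
static unsigned long
load_ulong(const char *p)
{
    unsigned long value;
    memcpy(&value, p, sizeof(value));
    return value;
}

/* Hypothetical helper: return 1 if any of the sizeof(unsigned long) bytes
   starting at p has its high bit set (i.e. is not ASCII), else 0. */
static int
word_has_non_ascii(const char *p)
{
    /* 0x80 repeated in every byte, for any width of unsigned long. */
    const unsigned long mask = (unsigned long)-1 / 0xFF * 0x80;
    return (load_ulong(p) & mask) != 0;
}
```

The caller must still guarantee that at least sizeof(unsigned long) bytes are readable at p.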

This is not a purely theoretical problem: compilers are known to make optimizations based on the assumption that strict aliasing is not violated. Early versions of David Gay's dtoa.c, for example, produced incorrect results because of strict aliasing violations; see [1].
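
To illustrate the kind of pattern at risk: code that inspects the bits of a double through a cast pointer such as ((uint32_t *)&x)[1] gives the compiler licence to reorder or drop the access. A minimal sketch of one alternative follows, using a union, which C99 (as amended by TC3) and C11 allow for reinterpreting a value. The helper name double_high_word is hypothetical, and the sketch assumes a 64-bit IEEE 754 double with the same byte order as 64-bit integers.

```c
#include <stdint.h>

/* Hypothetical helper: return the most significant 32 bits of a double's
   representation (sign, exponent, top of the mantissa).
   Assumes a 64-bit IEEE 754 double sharing byte order with uint64_t. */
static uint32_t
double_high_word(double d)
{
    union { double d; uint64_t u; } pun;
    pun.d = d;                        /* write through one member ...      */
    return (uint32_t)(pun.u >> 32);   /* ... read the bits through another */
}
```

Shifting the 64-bit value avoids indexing into an array of two 32-bit words, whose order would depend on endianness. Note that union punning is a C idiom; it is not valid in C++.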

[2] is a Stack Overflow question that explains strict aliasing.

[1] http://patrakov.blogspot.co.uk/2009/03/dont-use-old-dtoac.html
[2] http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule
History
Date User Action Args
2014-08-03 20:41:40  BreamoreBoy  set  versions: + Python 3.4, Python 3.5, - Python 3.3
2013-04-19 18:46:20  pconnell  set  nosy: + pconnell
2012-09-28 12:49:25  christian.heimes  set  nosy: + christian.heimes
2012-09-28 11:49:44  jcea  set  nosy: + jcea
2012-09-20 20:16:59  mark.dickinson  create