pyhash's siphash24 assumes alignment of the data pointer #72242

Closed
doko42 opened this issue Sep 9, 2016 · 41 comments
Assignees
Labels
3.7 (EOL): end of life; 3.8: only security fixes; interpreter-core: (Objects, Python, Grammar, and Parser dirs); type-crash: A hard crash of the interpreter, possibly with a core dump

Comments

@doko42
Member

doko42 commented Sep 9, 2016

BPO 28055
Nosy @doko42, @pitrou, @vstinner, @tiran, @benjaminp, @ned-deily, @skrah, @serhiy-storchaka, @ztane, @miss-islington, @DerDakon
PRs
  • bpo-28055: fix unaligned accesses in siphash24() #6123
  • [3.7] bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123) #6777
  • [3.6] bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123) #6778
Files
  • pyhash.diff
  • pyhash2.diff
  • hash-bytes-alignment.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/doko42'
    closed_at = <Date 2019-04-12.22:36:32.781>
    created_at = <Date 2016-09-09.23:29:02.147>
    labels = ['interpreter-core', '3.7', '3.8', 'type-crash']
    title = "pyhash's siphash24 assumes alignment of the data pointer"
    updated_at = <Date 2019-04-12.22:36:32.780>
    user = 'https://github.com/doko42'

    bugs.python.org fields:

    activity = <Date 2019-04-12.22:36:32.780>
    actor = 'vstinner'
    assignee = 'doko'
    closed = True
    closed_date = <Date 2019-04-12.22:36:32.781>
    closer = 'vstinner'
    components = ['Interpreter Core']
    creation = <Date 2016-09-09.23:29:02.147>
    creator = 'doko'
    dependencies = []
    files = ['44629', '44630', '44648']
    hgrepos = []
    issue_num = 28055
    keywords = ['patch']
    message_count = 41.0
    messages = ['275493', '275500', '275509', '275634', '275761', '276255', '276257', '276258', '276259', '276261', '276263', '276345', '276346', '276347', '276352', '276355', '276374', '276394', '276396', '276397', '276399', '276404', '276406', '276407', '276408', '276409', '276411', '276412', '276495', '286391', '286412', '314472', '315998', '316459', '316461', '316463', '316469', '316470', '322097', '322135', '340125']
    nosy_count = 13.0
    nosy_names = ['doko', 'pitrou', 'vstinner', 'christian.heimes', 'benjamin.peterson', 'ned.deily', 'skrah', 'serhiy.storchaka', 'ztane', 'Jeffrey.Walton', 'gco', 'miss-islington', 'Dakon']
    pr_nums = ['6123', '6777', '6778']
    priority = 'high'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue28055'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8']

    @doko42
    Member Author

    doko42 commented Sep 9, 2016

    pyhash's siphash24 assumes alignment of the data pointer: it casts a void pointer (src) to a uint64_t pointer, which increases the required alignment from 1 to 4 bytes. That's invalid code. siphash24 can't assume that the pointer to the data to hash is 4-byte aligned.

    Seen as a bus error when trying to run an ARM32 binary on an AArch64 kernel.

    ./python -c 'import datetime; print(hash(datetime.datetime(2015, 1, 1)))'

    the datetime type is defined as

    #define _PyTZINFO_HEAD \
        PyObject_HEAD \
        Py_hash_t hashcode; \
        char hastzinfo; /* boolean flag */
    
    typedef struct
    {
        _PyTZINFO_HEAD
        unsigned char data[_PyDateTime_DATE_DATASIZE];
    } PyDateTime_Date;

    and data is used to calculate the hash of the object. Because it is not 4-byte aligned, you get the bus error. Inserting three fill bytes to make the data member 4-byte aligned solves the issue; however, that is an ABI change which makes the new datetime ABI incompatible, and we don't know about the alignment of objects outside the standard library.
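
To see why `data` lands on an odd offset, a stand-in struct can be checked with offsetof. This is only a sketch: `demo_date`, its field types, and `demo_data_misalignment` are assumptions that imitate the layout quoted above, not the real CPython headers.

```c
#include <stddef.h>

/* Simplified stand-in for PyDateTime_Date. The field types are
   assumptions chosen to imitate the quoted layout. */
typedef struct {
    ptrdiff_t refcnt;        /* stands in for ob_refcnt (PyObject_HEAD) */
    void *type;              /* stands in for ob_type (PyObject_HEAD) */
    ptrdiff_t hashcode;      /* stands in for Py_hash_t hashcode */
    char hastzinfo;          /* boolean flag, one byte */
    unsigned char data[4];   /* stands in for the date payload */
} demo_date;

/* Offset of `data` inside the struct, modulo a candidate alignment.
   A non-zero result means `data` is not aligned to `alignment`
   relative to the start of the object. */
size_t demo_data_misalignment(size_t alignment)
{
    return offsetof(demo_date, data) % alignment;
}
```

Because the one-byte `hastzinfo` flag directly precedes `data`, `demo_data_misalignment(4)` is non-zero on common 32-bit and 64-bit ABIs, which is the situation the report describes.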

    The solution is to use a memcpy instead of the cast to uint64_t, for now limited to the little endian ARM targets, but I don't see why the memcpy cannot always be used on little endian targets instead of the cast.
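
A sketch of the memcpy idea in isolation. The helper name `load_le64` is hypothetical and this is not the attached patch: it copies the eight bytes into a local array and then assembles a little-endian value with shifts, so it depends on neither the alignment of `src` nor the host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 64-bit little-endian value from any
   address, aligned or not. */
uint64_t load_le64(const void *src)
{
    unsigned char b[8];
    uint64_t v = 0;
    int i;

    /* memcpy has no alignment requirement on src. */
    memcpy(b, src, sizeof(b));

    /* Assemble from the most significant byte (b[7]) down to b[0]. */
    for (i = 7; i >= 0; i--) {
        v = (v << 8) | b[i];
    }
    return v;
}
```

The shift loop is a swapped-in alternative to a byte-order macro; on a little-endian target the memcpy alone would already give the same value.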

    @doko42 doko42 self-assigned this Sep 9, 2016
    @doko42 doko42 added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Sep 9, 2016
    @tiran
    Member

    tiran commented Sep 9, 2016

    Good catch! I had trouble with the data structures in the TZ module before.

    I'm fine with memcpy() on just ARM platforms as a temporary workaround. Let's discuss the issue another time. Right now I'm busy with ssl improvements for 3.6.0b1.

    @benjaminp
    Contributor

    I believe the unaligned memory access configure check is supposed to prevent siphash from being used, so we might look into why that's not working.

    IMO, though, we should just require alignment for the argument to _Py_HashBytes. It's private after all.

    @doko42
    Member Author

    doko42 commented Sep 10, 2016

    I don't like that configure check, because it depends on the kernel being used at runtime. For many architectures you can configure in the kernel whether unaligned accesses are allowed. Sure, this is not an issue for Linux distro builds, but it might be unexpected for third-party builds.

    @ztane
    Mannequin

    ztane mannequin commented Sep 11, 2016

    There is no need to ifdef anything; the memcpy is the only correct way to do it. As memcpy is also a reserved identifier in C, the compiler can and will optimize this into a 64-bit access on those platforms where that is safe (x86 for example). E.g. GCC compiles

        uint64_t func(char *buf) {
            uint64_t rv;
            memcpy(&rv, buf+3, sizeof(rv));
            return rv;
        }

    into

        movq    3(%rdi), %rax
        ret

    on the Linux 64-bit ABI.

    @doko42
    Member Author

    doko42 commented Sep 13, 2016

    Updated patch that always uses memcpy for the little endian case.

    @tiran
    Member

    tiran commented Sep 13, 2016

    I'm a bit worried that the patch might slow down the general case of SipHash24. When I was working on SipHash24 I made sure that the general case for PyBytesObject and PyUnicodeObject is fast and always aligned. Do all compilers optimize that case? For MSVC we still have a specialized Py_MEMCPY() variant in pyport.h.

    I can see three more ways to fix the issue:

    1. Have two loops, one for the aligned case w/o memcpy() and one for the unaligned case with memcpy()
    2. Add a special variant of _le64toh() for PY_LITTLE_ENDIAN on ARM and use the current variant on X86_64.
    3. Make it illegal to call _Py_HashBytes() with a non-aligned pointer and require the caller to provide an aligned buffer. It's easy for datetime but requires an extra buffer for memoryview. Memoryview already uses a buffer for all but single-strided C contiguous views. We can easily add another case for non-aligned buffers.
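
The two-loop idea from option 1 could look roughly like the sketch below: one address check picks between a loop with direct 64-bit loads and a loop that goes through memcpy(). `xor_words` is a hypothetical stand-in for a hash inner loop, not code from pyhash.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hash inner loop: XOR together all
   complete 64-bit words of a buffer, in host byte order. */
uint64_t xor_words(const void *src, size_t len)
{
    const unsigned char *p = src;
    size_t nwords = len / sizeof(uint64_t);
    uint64_t acc = 0;
    size_t i;

    if ((uintptr_t)src % sizeof(uint64_t) == 0) {
        /* Aligned case: direct 64-bit loads. Assumes the caller's
           buffer may legally be accessed as uint64_t. */
        const uint64_t *w = src;
        for (i = 0; i < nwords; i++) {
            acc ^= w[i];
        }
    } else {
        /* Unaligned case: copy each word into a local first. */
        for (i = 0; i < nwords; i++) {
            uint64_t t;
            memcpy(&t, p + i * sizeof(t), sizeof(t));
            acc ^= t;
        }
    }
    return acc;
}
```

Both branches return the same value for the same bytes; the only difference is how each word is fetched, so the cost of this variant is the duplicated loop body plus one branch per call.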

    @tiran tiran added the type-crash A hard crash of the interpreter, possibly with a core dump label Sep 13, 2016
    @skrah
    Mannequin

    skrah mannequin commented Sep 13, 2016

    FWIW, MSVC optimizes memcpy:

    http://bugs.python.org/issue15993

    The pgo issue has been fixed according to Steve Dower.

    @doko42
    Member Author

    doko42 commented Sep 13, 2016

    a variant of the patch that keeps the parameter types of _le64toh.

    @doko42
    Member Author

    doko42 commented Sep 13, 2016

    I can check whether the memcpy is optimized away. As an alternative, we could use __builtin_memcpy. That is available for clang as well (would have to check icc).

    @tiran
    Member

    tiran commented Sep 13, 2016

    I created bpo-28126 for MSVC.

    @doko42
    Member Author

    doko42 commented Sep 13, 2016

    I believe the unaligned memory access configure check is supposed to
    prevent siphash from being used, so we might look into why that's not
    working.

    IMO, though, we should just require alignment for the argument to
    _Py_HashBytes. It's private after all.

    If I understand it correctly, the hash value differs depending on the kernel configuration when the python binary is built, leading to different pickle objects which cannot be shared, making them incompatible. I think the safest thing would be to remove the hash make the selection of the hash method unconditional, and to make this hash function work in all cases.

    @doko42
    Member Author

    doko42 commented Sep 13, 2016

    ... would be to remove the autoconf check and make the selection of the hash method unconditional ...

    @tiran
    Member

    tiran commented Sep 13, 2016

    The main reason for two different hash algorithms was missing support for 64bit integer types. Python 3.4 was targeting platforms that had no 64bit integer support at all (IIRC SPARC). Nowadays Python requires 64bit ints to compile.

    I'm all in favor of removing FNV and using SipHash24 on all platforms. Let's deprecate it now and remove it in 3.7.

    @doko42
    Member Author

    doko42 commented Sep 13, 2016

    if the only concern is 32bit sparc, then please let's drop this in 3.6.

    Looking at bpo-28027, the new way of obsoleting things seems to be decreeing them (sorry about the sarcasm). If I interpret your concerns correctly, you care about platforms which you are not supposed to care about. sparc32 doesn't have any use cases now, while ARM32 still has, and will have for some time.

    @serhiy-storchaka
    Member

    IMO, though, we should just require alignment for the argument to _Py_HashBytes. It's private after all.

    And what to do with memoryview? Memoryview data can be unaligned.

    If I understand it correctly, the hash value differs depending on the kernel configuration when the python binary is built, leading to different pickle objects which cannot be shared, making them incompatible.

    Hash values shouldn't be leaked in pickle.

    @benjaminp
    Contributor

    Here's a patch that requires 8-byte alignment. It almost completely works, except that on ABIs with 32-bit pointers, unicode objects can have their data pointers aligned to only 4 bytes. Perhaps we can get away with requiring only 4-byte alignment on 32-bit platforms because they generally implement the 64-bit load as two 32-bit loads anyway.
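
The idea that a 64-bit load can be served by two 32-bit loads can be written out by hand. A sketch with a hypothetical helper, `load_u64_from_words`: it assumes a little-endian layout (word 0 holds the low half) and says nothing about what a given compiler actually emits for a uint64_t dereference.

```c
#include <stdint.h>

/* Hypothetical helper: build a 64-bit value from two consecutive
   32-bit words. Only 4-byte alignment of `w` is needed, because each
   individual access is a 32-bit load. Word 0 is treated as the low
   half, which matches the memory layout on little-endian targets. */
uint64_t load_u64_from_words(const uint32_t *w)
{
    uint64_t lo = w[0];
    uint64_t hi = w[1];
    return lo | (hi << 32);
}
```

With this shape, a data pointer that is only 4-byte aligned is sufficient, which is the relaxation suggested for 32-bit platforms.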

    @skrah
    Mannequin

    skrah mannequin commented Sep 14, 2016

    For memoryview this is not possible: It is explicitly unaligned and the feature is used in e.g. NumPy.

    @tiran
    Member

    tiran commented Sep 14, 2016

    It's totally possible. Benjamin's patch implements it like I have suggested it.

    @skrah
    Mannequin

    skrah mannequin commented Sep 14, 2016

    Ah, yes. But compilers optimize memcpy and this is a guaranteed slowdown for the unaligned memoryview case.

    @tiran
    Member

    tiran commented Sep 14, 2016

    How often does NumPy create a C-style, single-dimensional, contiguous memoryview? I would assume that it deals with matrices, Fortran data and/or other strided, multi-dimensional data in almost all cases.

    @skrah
    Mannequin

    skrah mannequin commented Sep 14, 2016

    Numpy itself internally doesn't. Consumers of numpy arrays use
    memoryviews. Numpy is often used as a library these days, even
    for simple things like storing a 2-d table, which can easily be
    several TB.

    It is also easy to generate unaligned data by just taking a slice
    of a bytes memoryview.
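
At the C level, a slice is just the base pointer plus an offset, so its alignment follows from simple address arithmetic. A minimal sketch; `slice_start_is_aligned` is a hypothetical helper, not a CPython function.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the address base + offset is a multiple of `alignment`,
   otherwise 0. `alignment` must be non-zero. */
int slice_start_is_aligned(const void *base, size_t offset, size_t alignment)
{
    uintptr_t addr = (uintptr_t)((const unsigned char *)base + offset);
    return addr % alignment == 0;
}
```

Of any eight consecutive byte offsets into a buffer, exactly one starts on an 8-byte boundary, so most slices of a bytes view will not satisfy an 8-byte alignment requirement.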

    @skrah
    Mannequin

    skrah mannequin commented Sep 14, 2016

    s/unaligned/not 8-byte-aligned/

    @tiran
    Member

    tiran commented Sep 14, 2016

    memoryview() has to create a copy for NumPy memoryviews already.

    @skrah
    Mannequin

    skrah mannequin commented Sep 14, 2016

    I don't understand this. Could you explain?

    @tiran
    Member

    tiran commented Sep 14, 2016

    memory_hash has to convert buffers unless the buffer is a single-dimensional, C-style and contiguous buffer. A NumPy matrix has more than one dimension, so it must be converted.

    https://hg.python.org/cpython/file/tip/Objects/memoryobject.c#l2854

            if (!MV_C_CONTIGUOUS(self->flags)) {
                mem = PyMem_Malloc(view->len);
                if (mem == NULL) {
                    PyErr_NoMemory();
                    return -1;
                }
                if (buffer_to_contiguous(mem, view, 'C') < 0) {
                    PyMem_Free(mem);
                    return -1;
                }
            }

    @skrah
    Mannequin

    skrah mannequin commented Sep 14, 2016

    I see. No, most NumPy arrays are C-contiguous. Multi-dimensional arrays
    are contiguous, too.

    Non C-contiguous arrays arise mostly during slicing or if they're
    Fortran-order to begin with.

    But NumPy aside, it's weird to have a slice of a huge regular bytes view
    (this particular slice is still C-contiguous) that is suddenly copied
    because the alignment requirements changed.

    I really prefer a simple rule for memoryview: If the data is C-contiguous,
    you get the fast path.

    @serhiy-storchaka
    Member

    I support Stefan. Just requiring 8-byte alignment is the easiest solution, but it doesn't work with memoryview without expensive memory allocation and copying.

    Look at the FNV code. It supports non-4-byte aligned data, and does it in a safe and efficient way.

    @vstinner
    Member

    Christian Heimes added the comment:

    The main reason for two different hash algorithms was missing support for
    64bit integer types. Python 3.4 was targeting platforms that had no 64bit
    integer support at all (IIRC SPARC). Nowadays Python requires 64bit ints to
    compile.

    I'm all in favor of removing FNV and using SipHash24 on all platforms. Let's
    deprecate it now and remove it in 3.7.

    Python 3.5.0 doesn't compile if the compiler doesn't support 64-bit signed
    integers: see pytime.h ;-)

    Are you aware of platforms still using FNV?

    I'm also in favor of deprecating it. Maybe use #warning in C to log a
    deprecation warning.

    @ned-deily ned-deily added the 3.7 (EOL) end of life label Sep 15, 2016
    @gco
    Mannequin

    gco mannequin commented Jan 28, 2017

    32-bit and 64-bit SPARC ABIs have 64-bit integer data types.

    SPARC, like many RISC architectures, also has natural alignment requirements. Attempting to dereference a pointer to a 4-byte-sized object requires 4-byte alignment, for example. 2-byte-sized objects require 2-byte alignment. 8-byte-sized objects require 8-byte alignment.

    siphash24 is encountering this bug on modern SPARC (32-bit ABI currently, haven't tried compiling as 64-bit yet). The code simply is not portable.

    Benjamin's patch gets the failing self-test (test_plistlib) to pass as well as the simple test case in msg275493 above.

    @pitrou
    Member

    pitrou commented Jan 28, 2017

    I agree with Stefan and Serhiy. Unaligned memoryviews shouldn't trigger a copy when hashing.

    @DerDakon
    Mannequin

    DerDakon mannequin commented Mar 26, 2018

    So, what is the problem with this? Either the compiler knows that unaligned accesses are no problem and optimizes them away anyway, or it is kept because it would crash otherwise. I can confirm that no sparc version >= 3.5 (have not tried older) survives the test suite on Gentoo Sparc (64 bit kernel, 32 bit userspace) without memcpy().

    @ned-deily
    Member

    What's the status of this? It looks like Serhiy has reviewed and approved Dakon's PR 6123. Is everyone OK with merging it? Anything more needed?

    @ned-deily ned-deily added the 3.8 only security fixes label May 1, 2018
    @serhiy-storchaka
    Member

    New changeset 1e2ec8a by Serhiy Storchaka (Rolf Eike Beer) in branch 'master':
    bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123)
    1e2ec8a

    @miss-islington
    Contributor

    New changeset 8ed545f by Miss Islington (bot) in branch '3.7':
    bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123)
    8ed545f

    @miss-islington
    Contributor

    New changeset 0d17e60 by Miss Islington (bot) in branch '3.6':
    bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123)
    0d17e60

    @skrah
    Mannequin

    skrah mannequin commented May 13, 2018

    MSVC optimizes memcpy() to an assignment, sometimes too well (pgo): https://bugs.python.org/issue15993

    But that was fixed long ago, so I also think that the memcpy() approach is best.

    @pitrou
    Member

    pitrou commented May 13, 2018

    Indeed the memcpy() approach is the common idiom in such situations, and sounds like the right thing.

    @JeffreyWalton
    Mannequin

    JeffreyWalton mannequin commented Jul 21, 2018

    I know this is a bit late but I wanted to share...

    OpenCSW has a build farm with Solaris machines and Sparc hardware. The farm provides x86 and Sparc machines with Solaris 9 through 11.

    I believe OpenCSW operates in the same spirit as GCC compile farm. They welcome open source developers and upstream maintainers to help ensure packages build and run on Solaris machines.

    You can read about it at
    https://www.opencsw.org/extend-it/signup/to-upstream-maintainers/ .

    If Python is performing memory access patterns as discussed in the report then it would probably benefit the project to test on a Sparc machine with Solaris 11.

    @vstinner
    Member

    I would say that Python no longer officially supports sparc and solaris
    because of the lack of volunteers.

    @vstinner
    Member

    I see that a fix has been pushed. I'm not sure why this issue is still open, so I close it.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022