Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simple bug in mmap size check #57357

Closed
jazzer mannequin opened this issue Oct 11, 2011 · 8 comments
Closed

simple bug in mmap size check #57357

jazzer mannequin opened this issue Oct 11, 2011 · 8 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@jazzer
Copy link
Mannequin

jazzer mannequin commented Oct 11, 2011

BPO 13148
Nosy @pitrou
Files
  • mmap-greater.patch: patch against python 2.7.2
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2011-10-11.07:45:57.587>
    created_at = <Date 2011-10-11.00:45:47.622>
    labels = ['type-bug', 'library']
    title = 'simple bug in mmap size check'
    updated_at = <Date 2011-10-11.09:10:54.492>
    user = 'https://bugs.python.org/jazzer'

    bugs.python.org fields:

    activity = <Date 2011-10-11.09:10:54.492>
    actor = 'jazzer'
    assignee = 'none'
    closed = True
    closed_date = <Date 2011-10-11.07:45:57.587>
    closer = 'neologix'
    components = ['Library (Lib)']
    creation = <Date 2011-10-11.00:45:47.622>
    creator = 'jazzer'
    dependencies = []
    files = ['23372']
    hgrepos = []
    issue_num = 13148
    keywords = ['patch']
    message_count = 8.0
    messages = ['145319', '145321', '145323', '145324', '145325', '145332', '145334', '145336']
    nosy_count = 3.0
    nosy_names = ['pitrou', 'neologix', 'jazzer']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue13148'
    versions = ['Python 2.7']

    @jazzer
    Copy link
    Mannequin Author

    jazzer mannequin commented Oct 11, 2011

    The condition contradicts the exception text:
                if (offset >= st.st_size) {
                    PyErr_SetString(PyExc_ValueError,
                                    "mmap offset is greater than file size");
                    return NULL;
                }
    The condition should be changed to (offset > st.st_size), similar to the later condition which is correct:
            } else if (offset + (size_t)map_size > st.st_size) {
                PyErr_SetString(PyExc_ValueError,
                                "mmap length is greater than file size");
                return NULL;
            }

    The patch is attached.

    @jazzer jazzer mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Oct 11, 2011
    @pitrou
    Copy link
    Member

    pitrou commented Oct 11, 2011

    Well, you have to understand what this code does: it tries to prevent non-meaningful offsets. If the offset is equal to the file size, mapping from that offset would not map anything in the file (and the system call may actually fail).

    @pitrou pitrou closed this as completed Oct 11, 2011
    @pitrou pitrou added the invalid label Oct 11, 2011
    @jazzer
    Copy link
    Mannequin Author

    jazzer mannequin commented Oct 11, 2011

    First of all, it doesn't fail (at least on Linux), I tested it before posting.

    And the latest, it's not like I'm just stalking around and reading Python sources for fun. It's a real and pretty valid case, I hit it while upgrading our production code to python 2.7.
    I'm using NumPy (linear algebra module) that uses Python's core mmap module under the hood.
    In NumPy, it's pretty valid to have arrays of size 0.
    I have a file with a fixed-size header that holds size of the array and some other meta-data. I mmap this file as a NumPy array using the offset equal to header size. Of course, if there is no data in the array then the file will be just header, and the offset will be equal to the size of the file - here is where this bug hits us as I can't load this file with Python 2.7.2 anymore (while I was able to with Python 2.5).
    This patch fixes this and everything works as expected, giving an array with zero dimensions ('shape' in terms of NumPy):
    >>> x.shape
    (0,)
    >>> x.size
    0

    Please kindly consider applying the patch.

    @jazzer jazzer mannequin reopened this Oct 11, 2011
    @jazzer jazzer mannequin removed the invalid label Oct 11, 2011
    @pitrou
    Copy link
    Member

    pitrou commented Oct 11, 2011

    I don't think it makes sense to accept mmap'ing empty contents when at offset "n" but not at offset "n + 1". Either we remove the check entirely and let people deal with the consequences, or we keep the check as-is.

    @jazzer
    Copy link
    Mannequin Author

    jazzer mannequin commented Oct 11, 2011

    Well, "n+1" is clearly outside the file, wile "n" is within and therefore valid.

    Also, if your position is to forbid zero-size mmapping completely, then the checks inside "if (map_size == 0) {" don't make any sense, especially as they may or may fail.
    From the existing code, zero-size mmapping is OK as long as offset is OK, so the question is whether we consider offset pointing to the end of the file OK. To me, it's fine and valid, and there are valid cases like NumPy's zero-size arrays, hence the proposed patch.

    Removing the check completely is a viable option too, it was already requested for special files:
    http://bugs.python.org/issue12556

    I believe users should have an ability to enjoy whatever their OS provides, and "deal with the consequences" as you said.

    @jazzer
    Copy link
    Mannequin Author

    jazzer mannequin commented Oct 11, 2011

    tried on newer Linux - crashes with my patch.
    So I'll be thinking about a workaround, probably a fix for NumPy to avoid using mmap in such cases but still provide uniform interface to avoid making special conditional handling in all my scripts.
    Therefore, I'm no longer pushing for this change, I will need another workaround anyway.

    @neologix
    Copy link
    Mannequin

    neologix mannequin commented Oct 11, 2011

    The condition contradicts the exception text:

    Why?
    The offset is zero-based, so 0 <= offset < size is a valid check.

    First of all, it doesn't fail (at least on Linux), I tested it before
    posting.

    Hmmm.
    You got lucky, since the offset must be a multiple of the page size.

    tried on newer Linux - crashes with my patch.

    That's exactly why we perform such checks. Here's what POSIX says:
    """
    [EINVAL]
    The addr argument (if MAP_FIXED was specified) or off is not a multiple of the page size as returned by sysconf(), or are considered invalid by the implementation.
    [ENXIO]
    Addresses in the range [off, off + len) are invalid for the object specified by fildes.
    """

    Since we usually want to avoid implementation-defined behavior (and crashes), I think it's better to stick with the current checks (note that issue bpo-12556 concerned a really corner case: /proc entries supporting mmap but reporting a zero-length).

    Therefore, I'm no longer pushing for this change, I will need another
    workaround anyway.

    Alright, closing then.

    @neologix neologix mannequin closed this as completed Oct 11, 2011
    @jazzer
    Copy link
    Mannequin Author

    jazzer mannequin commented Oct 11, 2011

    You got lucky, since the offset must be a multiple of the page size.
    That's why our header is exactly the page size :)

    Here's what POSIX says
    Then it's just another discrepancy between POSIX and Linux, as I received ENOMEM instead of EINVAL (RHEL6 on 2.6.32).

    Regarding the contradiction, it's probably still worth changing the exception message to "mmap offset is greater than _or equal to_ file size", to match the condition. Just 'greater than' means the '>' check, not the '>=' check from the code, mathematically.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant