Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[security] CVE-2017-1000158: Unsafe arithmetic in PyString_DecodeEscape #74842

Closed
jaybosamiya mannequin opened this issue Jun 13, 2017 · 17 comments
Closed

[security] CVE-2017-1000158: Unsafe arithmetic in PyString_DecodeEscape #74842

jaybosamiya mannequin opened this issue Jun 13, 2017 · 17 comments
Assignees
Labels
easy interpreter-core (Objects, Python, Grammar, and Parser dirs) type-security A security issue

Comments

@jaybosamiya
Copy link
Mannequin

jaybosamiya mannequin commented Jun 13, 2017

BPO 30657
Nosy @vstinner, @larryhastings, @ned-deily, @serhiy-storchaka, @jaybosamiya, @kirotawa
PRs
  • [2.7] bpo-30657: Check & prevent integer overflow in PyString_DecodeEscape #2174
  • [3.5] bpo-30657: Check & prevent integer overflow in PyString_DecodeEscape (GH-2174) #4664
  • [3.4] bpo-30657: Check & prevent integer overflow in PyString_DecodeEscape (GH-2174) #4758
  • Files
  • poc-gen.py: Generates poc.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2017-12-08.21:36:52.023>
    created_at = <Date 2017-06-13.15:35:29.340>
    labels = ['type-security', 'interpreter-core', 'easy']
    title = '[security] CVE-2017-1000158: Unsafe arithmetic in PyString_DecodeEscape'
    updated_at = <Date 2019-05-10.18:20:34.287>
    user = 'https://github.com/jaybosamiya'

    bugs.python.org fields:

    activity = <Date 2019-05-10.18:20:34.287>
    actor = 'ned.deily'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2017-12-08.21:36:52.023>
    closer = 'larry'
    components = ['Interpreter Core']
    creation = <Date 2017-06-13.15:35:29.340>
    creator = 'jaybosamiya'
    dependencies = []
    files = ['46950']
    hgrepos = []
    issue_num = 30657
    keywords = ['patch', 'easy (C)']
    message_count = 17.0
    messages = ['295930', '295947', '295957', '296277', '306823', '306826', '306875', '306890', '307138', '307238', '307240', '307243', '307245', '307246', '307868', '307869', '307870']
    nosy_count = 6.0
    nosy_names = ['vstinner', 'larry', 'ned.deily', 'serhiy.storchaka', 'jaybosamiya', 'leosilva']
    pr_nums = ['2174', '4664', '4758']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue30657'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @jaybosamiya
    Copy link
    Mannequin Author

    jaybosamiya mannequin commented Jun 13, 2017

    In Python 2.7, there is a possible integer overflow in
    PyString_DecodeEscape function of the file stringobject.c, which can
    be abused to gain a heap overflow, possibly leading to arbitrary code
    execution.

    The relevant parts of the code are highlighted below:

        PyObject *PyString_DecodeEscape(const char *s,
                                        Py_ssize_t len,
                                        const char *errors,
                                        Py_ssize_t unicode,
                                        const char *recode_encoding)
        {
            int c;
            char *p, *buf;
            const char *end;
            PyObject *v;
    (1)     Py_ssize_t newlen = recode_encoding ? 4*len:len;
    (2)     v = PyString_FromStringAndSize((char *)NULL, newlen);
            if (v == NULL)
                return NULL;
    (3)     p = buf = PyString_AsString(v);
            end = s + len;
            while (s < end) {
                if (*s != '\\') {
                  non_esc:
        #ifdef Py_USING_UNICODE
        [...]
        #else
                    *p++ = *s++;
        #endif
                    continue;
        [...]
                }
            }
    (4)     if (p-buf < newlen)
                _PyString_Resize(&v, p - buf); /* v is cleared on error */
            return v;
          failed:
            Py_DECREF(v);
            return NULL;
        }

    (1) If recode_encoding is true (i.e., non-null), we have an integer
    overflow here which can set newlen to be some very small value
    (2) This allows a small string to be created into v
    (3) Now p (and buf) use that small string
    (4) The small string is copied into with a larger string, thereby
    giving a heap buffer overflow

    In the highly unlikely but definitely possible situation that we pass
    it a very large string (in the order of ~1GB on a 32-bit Python
    install), one can reliably get heap corruption. It is possible to
    access this function (and condition in line(1)) through function
    parsestr from ast.c, when the file encoding of an input .py file is
    something apart from utf-8 and iso-8859-1. This can be trivially done
    using the following at the start of the file:
    # -- coding: us-ascii --

    The attached file (poc-gen.py) produces a poc.py file which satisfies
    these constraints and shows the vulnerability.

    Note: To see the vulnerability in action, it is necessary to have an
    ASAN build of Python, compiled for 32 bit on a 64 bit machine.
    Additionally, the poc.py file generated can take an extremely long
    time to load (over a few hours), and finally crash. Instead, if one
    wishes to see the proof of vulnerability quicker, then it might be
    better to change the constant 4 in line (1) to 65536 (just for
    simplicity sake), and change the multiplication_constant in poc-gen.py
    file to be the same (i.e. 65536).

    Proposed fix: Confirm that the multiplication will not overflow,
    before actually performing the multiplication and depending on the
    result.

    @jaybosamiya jaybosamiya mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-security A security issue labels Jun 13, 2017
    @serhiy-storchaka
    Copy link
    Member

    Thank you for your report Jay. Even if it very unlikely that this can occurred unintentionally or be used for attack, this still is a bug and should be fixed. Do you want to provide a patch?

    @jaybosamiya
    Copy link
    Mannequin Author

    jaybosamiya mannequin commented Jun 13, 2017

    I've made a patch that should fix the vulnerability. Please do let me know if changes are required. Thanks a lot :)

    PS: For anyone who looks at this later on, in my original message describing the issue, the line *p++ = *s++; should be marked as (4) instead to understand this issue better.

    @serhiy-storchaka serhiy-storchaka self-assigned this Jun 16, 2017
    @serhiy-storchaka
    Copy link
    Member

    New changeset c3c9db8 by Serhiy Storchaka (Jay Bosamiya) in branch '2.7':
    [2.7] bpo-30657: Check & prevent integer overflow in PyString_DecodeEscape (bpo-2174)
    c3c9db8

    @kirotawa
    Copy link
    Mannequin

    kirotawa mannequin commented Nov 23, 2017

    I couldn't reproduce using the poc, but seems python3.5 is also vulnerable to this bug. The code from py3.5 are quite similar to 2.7.
    In py3.5: Objects/bytesobject.c PyBytes_DecodeEscape

    @serhiy-storchaka
    Copy link
    Member

    Right, but it is not easy to exploit this bug. You need to parse Python sources longer than 512 MiB in 32-bit Python.

    Python 3.5 currently takes only fixes for security bugs. I left on to Larry to decide whether it is worth to port the fix to 3.5.

    @larryhastings
    Copy link
    Contributor

    I would welcome a backport of this for 3.5 and even 3.4 (if it's vulnerable, which it probably is).

    @kirotawa
    Copy link
    Mannequin

    kirotawa mannequin commented Nov 24, 2017

    Python 3.4 also has the similar code as 3.5, but applying the same patch tests for it results in test errors:

    +======================================================================                                                                                                                                  
    +ERROR: test_anydbm_creation (test.test_dbm.TestCase-dbm.ndbm)                                                                                                                                           
    +----------------------------------------------------------------------                                                                                                                                 
    +Traceback (most recent call last):                                                                                                                                                                    
    +  File "/<<PKGBUILDDIR>>/Lib/test/test_dbm.py", line 74, in test_anydbm_creation^M                                                                                                                         
    +    self.read_helper(f)^M                                                                                                                                                                                  
    +  File "/<<PKGBUILDDIR>>/Lib/test/test_dbm.py", line 115, in read_helper                                                                                                                                
    +    self.assertEqual(self._dict[key], f[key.encode("ascii")])                                                                                                                                           
    +KeyError: b'0'                                                                                                                                                                                         
    +                                                                                                                                                                                                        
    +======================================================================                                                                                                                                  
    +ERROR: test_anydbm_modification (test.test_dbm.TestCase-dbm.ndbm)                                                                                                                                        
    +----------------------------------------------------------------------                                                                                                                                 
    +Traceback (most recent call last):                                                                                                                                                                      
    +  File "/<<PKGBUILDDIR>>/Lib/test/test_dbm.py", line 89, in test_anydbm_modification                                                                                                                   
    +    self.read_helper(f)                                                                                                                                                                                
    +  File "/<<PKGBUILDDIR>>/Lib/test/test_dbm.py", line 115, in read_helper                                                                                                                                
    +    self.assertEqual(self._dict[key], f[key.encode("ascii")])                                                                                                                                           
    +KeyError: b'0'                                                                                                                                                                                          
    +                                                                                                                                                                                                       
    +======================================================================                                                                                                                                  
    +ERROR: test_anydbm_read (test.test_dbm.TestCase-dbm.ndbm)                                                                                                                                               
    +----------------------------------------------------------------------                                                                                                                                  
    +Traceback (most recent call last):                                                                                                                                                                      
    +  File "/<<PKGBUILDDIR>>/Lib/test/test_dbm.py", line 95, in test_anydbm_read                                                                                                                             
    +    self.read_helper(f)                                                                                                                                                                                 
    +  File "/<<PKGBUILDDIR>>/Lib/test/test_dbm.py", line 115, in read_helper                                                                                                                                
    +    self.assertEqual(self._dict[key], f[key.encode("ascii")])                                                                                                                                            
    +KeyError: b'0'

    @kirotawa
    Copy link
    Mannequin

    kirotawa mannequin commented Nov 28, 2017

    I re-did the build here for python3.4 and couldn't reach the same test fail. So I'm assuming it was a false alarm.
    Said that, I believe the same patch that applies to py2.7 also applies to 3.4 and 3.5. I've build them using the patch and did some regression tests and it was ok.

    @vstinner
    Copy link
    Member

    Leo kirotawa silva: "I re-did the build here for python3.4 and couldn't reach the same test fail. So I'm assuming it was a false alarm."

    Python 3.4 and 3.5 seem to be also vulnerable:
    ---

    PyObject *PyBytes_DecodeEscape(const char *s,
                                    Py_ssize_t len,
                                    const char *errors,
                                    Py_ssize_t unicode,
                                    const char *recode_encoding)
    {
        ...
        Py_ssize_t newlen = recode_encoding ? 4*len:len;
        v = PyBytes_FromStringAndSize((char *)NULL, newlen);

    I don't think that Python 3.6 and 3.7 are vulnerable, the code was rewritten with the _PyBytesWriter API. The code got a new _PyBytes_DecodeEscapeRecode() helper function which calls _PyBytesWriter_WriteBytes(), and this function detects properly integer overflows.

    @vstinner vstinner changed the title Unsafe arithmetic in PyString_DecodeEscape CVE-2017-1000158: Unsafe arithmetic in PyString_DecodeEscape Nov 29, 2017
    @vstinner vstinner reopened this Nov 29, 2017
    @vstinner vstinner changed the title CVE-2017-1000158: Unsafe arithmetic in PyString_DecodeEscape [security] CVE-2017-1000158: Unsafe arithmetic in PyString_DecodeEscape Nov 29, 2017
    @vstinner
    Copy link
    Member

    I added this vulnerability to the python-security website:
    http://python-security.readthedocs.io/vuln/cve-2017-1000158_pystring_decodeescape_integer_overflow.html

    @serhiy-storchaka
    Copy link
    Member

    I don't think it is worth to add this vulnerability to the python-security website. You need to compile a 1 GiB Python file on 32-bit system for reproducing it. It is very unlikely that this can happen by accident, and it is hard to used it in security attack. If you can make the attacked program compiling a 1 GiB Python file, you perhaps have easier ways to make a harm.

    @vstinner
    Copy link
    Member

    Serhiy: "I don't think it is worth to add this vulnerability to the python-security website. You need to compile a 1 GiB Python file on 32-bit system for reproducing it. It is very unlikely that this can happen by accident, and it is hard to used it in security attack. If you can make the attacked program compiling a 1 GiB Python file, you perhaps have easier ways to make a harm."

    I'm trying to keep track of all CVEs. People are scared by CVE numbers :-( But it seems like any bug can get a CVE number, without any real evalution of the severity of the bug.

    I completed the description on python-security with your paragraph.

    FYI I wrote python-security to make sure that vulnerabilities are fixed in supported Python branches. Here it seems like we forgot to fix Python 3.4 and 3.5.

    @larryhastings
    Copy link
    Contributor

    Resetting to "needs patch", because we still need PRs for 3.4 and 3.5 (please!).

    @larryhastings
    Copy link
    Contributor

    New changeset fd8614c by larryhastings (Miro Hrončok) in branch '3.5':
    bpo-30657: Fix CVE-2017-1000158 (bpo-4664)
    fd8614c

    @larryhastings
    Copy link
    Contributor

    New changeset 6c004b4 by larryhastings (Miro Hrončok) in branch '3.4':
    bpo-30657: Fix CVE-2017-1000158 (bpo-4758)
    6c004b4

    @larryhastings
    Copy link
    Contributor

    Merged into 3.4 and 3.5. Thanks for the patches!

    Since I see 2.7 has already had the fix committed, and these are the only three versions affected, I'm marking as closed / resolved / fixed.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    easy interpreter-core (Objects, Python, Grammar, and Parser dirs) type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants