classification
Title: a SystemError and an assertion failure in warnings.warn_explicit()
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.7, Python 3.6, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Oren Milman, brett.cannon, eric.snow, ncoghlan, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2017-08-26 18:48 by Oren Milman, last changed 2017-09-30 14:07 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 3219 merged Oren Milman, 2017-08-27 18:33
PR 3775 merged serhiy.storchaka, 2017-09-26 23:30
PR 3803 merged Oren Milman, 2017-09-28 14:30
PR 3805 closed Oren Milman, 2017-09-28 15:43
PR 3823 merged Oren Milman, 2017-09-29 15:21
PR 3829 merged python-dev, 2017-09-29 18:16
Messages (18)
msg300892 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-08-26 18:48
1.
the following causes an assertion failure in Python/_warnings.c in
show_warning():

import warnings

class BadLoader:
    def get_source(self, fullname):
        class BadSource:
            def splitlines(self):
                return [42]
        return BadSource()

del warnings._showwarnmsg
warnings.warn_explicit(message='foo', category=ArithmeticError, filename='bar',
                       lineno=1, module_globals={'__loader__': BadLoader(),
                                                 '__name__': 'foobar'})

in short, the assertion failure would happen in warnings.warn_explicit() in case
module_globals['__loader__'].get_source(module_globals['__name__']).splitlines()[lineno-1]
is not a str.


2.
the following raises a SystemError:

import warnings

class BadLoader:
    def get_source(self, fullname):
        class BadSource:
            def splitlines(self):
                return 42
        return BadSource()

warnings.warn_explicit(message='foo', category=UserWarning, filename='bar',
                       lineno=42, module_globals={'__loader__': BadLoader(),
                                                  '__name__': 'foobar'})

in short, warnings.warn_explicit() raises the SystemError in case
module_globals['__loader__'].get_source(module_globals['__name__']).splitlines()
is not a list.



ISTM that adding a check in warnings_warn_explicit() (in Python/_warnings.c),
to check whether
module_globals['__loader__'].get_source(module_globals['__name__'])
is a str (after existing code found out that it isn't None) would prevent both
the assertion failure and the SystemError.

What do you think? Is it OK to permit get_source() to return only None or a str?
msg300900 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-08-26 19:35
on a second thought, BadSource could be a subclass of str, so maybe we
should just check whether
module_globals['__loader__'].get_source(module_globals['__name__']).splitlines()[lineno-1]
is a str,
and whether
module_globals['__loader__'].get_source(module_globals['__name__']).splitlines()
is a list.
msg300940 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-08-28 06:42
An alternative solution is using str.splitlines(source) instead of source.splitlines(). It raises a TypeError if the argument is not a text string and always returns a list of strings.

According to the documentation get_source() must return a text string or None.

https://docs.python.org/3/library/importlib.html?highlight=get_source#importlib.abc.InspectLoader.get_source
msg300950 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-08-28 10:54
ISTM that your solution is better than mine, Serhiy, so I updated the PR.
msg302876 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-24 18:27
New changeset 91fb0afe181986b48abfc6092dcca912b39de51d by Serhiy Storchaka (Oren Milman) in branch 'master':
bpo-31285: Fix an assertion failure and a SystemError in warnings.warn_explicit. (#3219)
https://github.com/python/cpython/commit/91fb0afe181986b48abfc6092dcca912b39de51d
msg303092 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-27 04:38
New changeset 90fe25a051bd8cf64d52be533c9b60cad9bdd7fb by Serhiy Storchaka in branch '3.6':
[3.6] bpo-31285: Fix an assertion failure and a SystemError in warnings.warn_explicit. (GH-3219) (#3775)
https://github.com/python/cpython/commit/90fe25a051bd8cf64d52be533c9b60cad9bdd7fb
msg303258 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-28 15:59
PyUnicode_Splitlines() cannot be used here in 2.7, because the source is likely a 8-bit string.

Actually I'm not sure this issue is worth to be fixed in 2.7. The fix would be different from 3.x and more complex.
msg303260 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-09-28 16:19
In 2.7, PyUnicode_Splitlines() first does:
string = PyUnicode_FromObject(string);

So i thought that PyUnicode_Splitlines() would be fine with receiving a string.

But now i realize that even in case i was right there, PyUnicode_Splitlines()
returns a unicode, and not a string, so there should be problems later.
I wonder how the tests still passed..
msg303262 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-28 16:44
> In 2.7, PyUnicode_Splitlines() first does:
> string = PyUnicode_FromObject(string);

And it raises an exception if the string contains non-ASCII characters.

It is better to avoid str<->unicode convertion as long as possible. And when 
do it for the output a warning, use "replace" or "backslashreplace" error 
handlers or "latin1" decoder to avoid a failure.
msg303298 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-09-29 05:41
Another thought - the existing code assumes that splitlines() returned a string.
So maybe we could just check that get_source() returned a string, and then call
the method str.splitlines() on it?
msg303300 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-09-29 05:58
oh, of course, checking that get_source() returned a string before passing it to
str.splitlines() is not needed.
msg303303 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-29 06:33
What if get_source() returned a unicode string? Usually it returns 8-bit string, but in many cases unicode is accepted if str is expected, so you need to check this option too.
msg303306 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-09-29 07:48
But in case get_source() returned a unicode, is it likely that the splitlines() method
of this unicode would return a 8-bit string? Currently show_warning() doesn't handle this
scenario, as it assumes splitlines() returned an 8-bit string. Or do you think that
show_warning() should also accept unicode?
msg303308 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-29 07:59
I didn't check. I supposed that the other code can support unicode by 
implicitly converting it to str. But now I see that show_warning() uses 
PyString_AS_STRING() and therefore supports only str. I agree, using 
str.splitlines() would be correct solution.
msg303349 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-29 18:16
New changeset 8b4ff53c440dfcde40fbeb02c5e666c85190528f by Serhiy Storchaka (Oren Milman) in branch 'master':
bpo-31285: Remove splitlines identifier from Python/_warnings.c (#3803)
https://github.com/python/cpython/commit/8b4ff53c440dfcde40fbeb02c5e666c85190528f
msg303350 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-29 18:18
I just missed PR 3803.
msg303353 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-29 19:26
New changeset 66c2b9f13ef2197a5212fd58372173124df76467 by Serhiy Storchaka (Miss Islington (bot)) in branch '3.6':
[3.6] bpo-31285: Remove splitlines identifier from Python/_warnings.c (GH-3803) (#3829)
https://github.com/python/cpython/commit/66c2b9f13ef2197a5212fd58372173124df76467
msg303410 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-30 14:06
New changeset 40d736bcf40c10aa5fc7d6cc305a6641afd3cd0e by Serhiy Storchaka (Oren Milman) in branch '2.7':
[2.7] bpo-31285: Don't raise a SystemError in warnings.warn_explicit() in case __loader__.get_source() has a bad splitlines() method. (GH-3219) (#3823)
https://github.com/python/cpython/commit/40d736bcf40c10aa5fc7d6cc305a6641afd3cd0e
History
Date User Action Args
2017-09-30 14:07:34serhiy.storchakasetstatus: open -> closed
stage: patch review -> resolved
resolution: fixed
versions: + Python 2.7
2017-09-30 14:06:58serhiy.storchakasetmessages: + msg303410
2017-09-29 19:26:47serhiy.storchakasetmessages: + msg303353
2017-09-29 18:18:17serhiy.storchakasetmessages: + msg303350
2017-09-29 18:16:16python-devsetpull_requests: + pull_request3812
2017-09-29 18:16:04serhiy.storchakasetmessages: + msg303349
2017-09-29 15:21:45Oren Milmansetpull_requests: + pull_request3808
2017-09-29 07:59:55serhiy.storchakasetmessages: + msg303308
2017-09-29 07:48:45Oren Milmansetmessages: + msg303306
2017-09-29 06:33:07serhiy.storchakasetmessages: + msg303303
2017-09-29 05:58:29Oren Milmansetmessages: + msg303300
2017-09-29 05:41:06Oren Milmansetmessages: + msg303298
2017-09-28 16:44:41serhiy.storchakasetmessages: + msg303262
2017-09-28 16:19:35Oren Milmansetmessages: + msg303260
2017-09-28 15:59:09serhiy.storchakasetmessages: + msg303258
versions: + Python 3.6
2017-09-28 15:43:59Oren Milmansetpull_requests: + pull_request3791
2017-09-28 14:30:31Oren Milmansetpull_requests: + pull_request3789
2017-09-27 04:38:06serhiy.storchakasetmessages: + msg303092
2017-09-26 23:30:15serhiy.storchakasetkeywords: + patch
stage: patch review
pull_requests: + pull_request3762
2017-09-24 18:27:13serhiy.storchakasetmessages: + msg302876
2017-08-28 10:54:33Oren Milmansetmessages: + msg300950
2017-08-28 06:42:58serhiy.storchakasetnosy: + eric.snow, serhiy.storchaka, brett.cannon, ncoghlan
messages: + msg300940
2017-08-27 18:33:38Oren Milmansetpull_requests: + pull_request3259
2017-08-26 19:35:53Oren Milmansetmessages: + msg300900
2017-08-26 18:48:22Oren Milmancreate