classification
Title: Avoid possible errors in comparing strings
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 3.7, Python 3.6, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: facundobatista, mark.dickinson, python-dev, rhettinger, serhiy.storchaka, skrah, vstinner, xiang.zhang
Priority: normal Keywords: patch

Created on 2017-01-07 07:10 by serhiy.storchaka, last changed 2017-01-09 09:42 by skrah. This issue is now closed.

Files
File name Uploaded Description Edit
unicode_compare.patch serhiy.storchaka, 2017-01-07 07:10 review
unicode_compare_2.patch serhiy.storchaka, 2017-01-07 08:27 review
Messages (11)
msg284895 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-07 07:10
PyUnicode_Compare() and PyUnicode_RichCompare() can raise an exception if one of arguments is not ready unicode object. The result is not always checked for error. Proposed patch gets rid of possible bugs. PyUnicode_Compare() and PyUnicode_RichCompare() in Modules/_pickle.c are replaced with _PyUnicode_EqualToASCIIString() and _PyUnicode_EqualToASCIIId() which never fail. Additional check is added in Modules/_decimal/_decimal.c to ensure that the string which is came from a user code is ready.

All other occurrences of PyUnicode_Compare() seems are called only with ready unicode objects.
msg284899 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-07 08:27
An alternative patch checks the result of PyUnicode_Compare() (as Xiang suggested) instead of checking the value before calling PyUnicode_Compare(). Stephan, what way do you prefer?
msg284904 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-01-07 09:21
Paste my point here:

I prefer checking the result PyUnicode_Compare to see it's a success or failure.
getrandom doesn't use any lower level unicode operations so it doesn't get a
responsibility to ready the unicode.
msg284915 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-01-07 14:09
Quite honestly I prefer to do nothing. What is the worst that can happen?
A SystemError?

Not-ready unicode strings are an application bug.
msg284916 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-01-07 14:14
Also, if anyone changes the rounding-mode constants it is really their problem.
msg284917 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-07 14:38
In worst case ignoring an error can cause a mystical error later during executing an unrelated code or a crash in debug build. In both cases it is hard to find the location of the bug.
msg284918 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-01-07 14:42
To expand a little, you use the terminology "possible bugs". All I can
see is a double exception if PyUnicode_Compare() returns -1.

I think I did it on purpose because this function is speed sensitive and
no user will change a valid rounding mode *constant* to a string that
is not ready.

So getting a double exception after deliberately sabotaging the module
seems pretty benign to me. :)


Consider the decimal part rejected.
msg284923 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-01-07 16:04
I'm generally a little concerned about the way "bugs" are presented here recently:

In #28701 you write:

'Correctness. Since no caller checks the error of PyUnicode_CompareWithASCIIString(), it is incorrectly interpreted as "less then".'

This is just not true. When testing for equality "-1" is "not equal"
and an exception would follow anyway.

I'm also not happy with broad coccinelle patches that I see months later.

I'm not sure if you realize that other people's reputation is at
stake here. You label something as a bug (good for you), everyone
who reads your reports and commits think that a bug has been found,
which is not the case.
msg285026 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2017-01-09 08:10
New changeset 337461574c90 by Serhiy Storchaka in branch '3.5':
Issue #29190: Fixed possible errors in comparing strings in the pickle module.
https://hg.python.org/cpython/rev/337461574c90

New changeset 9fcff936f61f by Serhiy Storchaka in branch '3.6':
Issue #29190: Fixed possible errors in comparing strings in the pickle module.
https://hg.python.org/cpython/rev/9fcff936f61f

New changeset f477c715076c by Serhiy Storchaka in branch 'default':
Issue #29190: Fixed possible errors in comparing strings in the pickle module.
https://hg.python.org/cpython/rev/f477c715076c
msg285027 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-09 08:21
In the particular case of getround() in _decimal.c, seems the worst case is raising TypeError instead of MemoryError in pretty rare circumstances. This is not critically bad, there are a lot of other places where the initial exception is silently replaced by less specific exception. But the code *looks* fragile.
msg285029 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-01-09 09:42
On Mon, Jan 09, 2017 at 08:21:17AM +0000, Serhiy Storchaka wrote:
> In the particular case of getround() in _decimal.c, seems the worst case is raising TypeError instead of MemoryError in pretty rare circumstances. This is not critically bad, there are a lot of other places where the initial exception is silently replaced by less specific exception. But the code *looks* fragile.

No, it does not.  It is obvious to a human that -1 <==> "not equal".

If Coccinelle does not understand that, well ...
History
Date User Action Args
2017-01-09 09:42:45skrahsetmessages: + msg285029
2017-01-09 08:21:17serhiy.storchakasetstatus: open -> closed
resolution: fixed
messages: + msg285027

stage: patch review -> resolved
2017-01-09 08:10:41python-devsetnosy: + python-dev
messages: + msg285026
2017-01-07 16:04:40skrahsetmessages: + msg284923
2017-01-07 14:42:29skrahsetmessages: + msg284918
2017-01-07 14:38:30serhiy.storchakasetnosy: + vstinner
messages: + msg284917
2017-01-07 14:14:20skrahsetmessages: + msg284916
2017-01-07 14:09:48skrahsetmessages: + msg284915
2017-01-07 09:21:58xiang.zhangsetnosy: + xiang.zhang
messages: + msg284904
2017-01-07 08:27:49serhiy.storchakasetfiles: + unicode_compare_2.patch
nosy: + rhettinger, facundobatista, mark.dickinson, skrah
messages: + msg284899

2017-01-07 07:10:06serhiy.storchakacreate