This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: modsupport: 'countformat' does not handle strings without bracket levels correctly
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: martin.panter, myronww, r.david.murray
Priority: normal Keywords:

Created on 2015-12-07 21:55 by myronww, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (7)
msg256075 - (view) Author: Myron Walker (myronww) * Date: 2015-12-07 21:55
'countformat' does not appear to be handling the case where a format string is passed with no parenthesis or brackets correctly. Here is an
example of a usage that might cause issues from typedobject.c:

static PyObject*
slot_sq_slice(PyObject *self, Py_ssize_t i, Py_ssize_t j)
{
	static PyObject *getslice_str;

	if (PyErr_WarnPy3k("in 3.x, __getslice__ has been removed; "
			    "use __getitem__", 1) < 0)
		return NULL;
	return call_method(self, "__getslice__", &getslice_str,
		"nn", i, j);      <<<<<<< Maybe Bad Format Str <<<<<<<
}

The format string "nn" does not have any level markers so when it gets
processed by 'countformat' the count will be incremented 2 times by the
'default' case but when the end of the string is reached and the NULL character is processed bay "case '\0':".  The function will ignore the count variable and just return -1 anyway.  The error created is unmatched paren in format but 'level' is never checked to see if a paren
was even hit to begin with.

It might be that the case should be changed to look at level before assuming a error condition if strings are supposed to be processed as the one in the example above.  case '\0' should probably be doing something like:

         case '\0':
            if (level > 0) {   // Check If Level Was Incremented
                /* Premature end */
                PyErr_SetString(PyExc_SystemError,
                                "unmatched paren in format");
                return -1;
            }
            break;
msg256078 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-12-07 22:00
Can you post a reproducer?
msg256085 - (view) Author: Myron Walker (myronww) * Date: 2015-12-07 23:05
There is not reproducer for this currently.

Its a case of implied or shared usage between a function and the code that uses it.  This function is only used in the Python core, so unless
it is used incorrectly, by a python extension or modification, it may never hit the error condition.

The case where this will fail is if for some reason it is called with
a 'endchar' other than '\0'.  The function assumes end char will be '\0', ')', ']', and ']'.
msg256089 - (view) Author: Myron Walker (myronww) * Date: 2015-12-07 23:26
There are format string like "O&O&O&O&O&" being passed in to Py_BuildValue which eventually make their way to 'countformat'.  It looks like it would just break after the first & and not count anything else.  So Im not sure it gives an accurate count.
msg256090 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-12-07 23:33
I am not seeing any problem. The function is declared static, and is only called with nonzero endchar from do_mkvalue() that I can see. It should not be directly callable from elsewhere.
msg256093 - (view) Author: Myron Walker (myronww) * Date: 2015-12-08 00:03
Yes, there are some other cases that look odd to me as if the code is not up to date though.  I was looking at the documentation and it mentions format character combinations like 'es' which contain two characters sequences.  When I look at this function, it does not handle two character sequences so the count in some cases may not be correct.  Im not familiar with the code enough to know how critical the count might be.
msg256099 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-12-08 00:19
The breaking for “&” is to keep the count right, because the preceding “O” would already have been counted as one. The “es” code is only documented for parsing values, not for building values. I suggest to close this, unless you can find a specific bug.
History
Date User Action Args
2022-04-11 14:58:24adminsetgithub: 70003
2021-04-27 22:04:50iritkatrielsetstatus: open -> closed
resolution: not a bug
stage: resolved
2015-12-08 00:19:18martin.pantersetmessages: + msg256099
2015-12-08 00:03:12myronwwsetmessages: + msg256093
2015-12-07 23:33:16martin.pantersetnosy: + martin.panter
messages: + msg256090
2015-12-07 23:26:28myronwwsetmessages: + msg256089
2015-12-07 23:05:47myronwwsetmessages: + msg256085
2015-12-07 22:00:17r.david.murraysetnosy: + r.david.murray
messages: + msg256078
2015-12-07 21:55:15myronwwcreate