classification
Title: Add 'bool' format character to PyArg_ParseTuple*
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: larry Nosy List: benjamin.peterson, eric.smith, georg.brandl, larry, loewis, mark.dickinson, python-dev, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2012-05-02 09:29 by larry, last changed 2012-05-07 09:45 by python-dev. This issue is now closed.

Files
File name Uploaded Description Edit
larry.parse.tuple.p.and.P.1.diff larry, 2012-05-03 08:16 Patch adding 'p' and 'P' support to PyArg_ParseTuple etc. review
larry.parse.tuple.p.and.P.2.diff larry, 2012-05-04 12:16 review
larry.parse.tuple.p.and.P.3.diff larry, 2012-05-05 16:20 review
larry.parse.tuple.p.4.diff larry, 2012-05-05 18:16 review
Messages (34)
msg159781 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-02 09:29
Currently functions that parse their arguments with the PyArg_ParseTuple family which want to take a boolean-ish parameter face two choices:
  * take an "int", then test whether or not the int is 0, or
  * take an "object", then call PyObject_IsTrue themselves.

The former is foolish; though it works with ints and bools, it doesn't work with any other type (float, str, list, etc) which strictly speaking are valid for boolean fields.  And this is common enough that the latter should not be necessary.

I propose to add support for a new format character to the PyArg_ParseTuple family: 'b', which specifies 'boolean'.  Its
implementation would be much the same as that of 'd' except:
  * the output type would be "int" instead of "double",
  * it would check for a -1 instead of calling PyErr_Occured, and
  * it would call PyObject_IsTrue instead of PyFloat_AsDouble.


If we cared, I could also add 'B', which would only accept
either Py_True or Py_False.  But I see no reason why we'd ever want
to strictly enforce the type... do you?  (I can see MvL's argument now:
"We've lived this long without it.  YAGNI.")


If there's interest I'll knock out a patch.  I expect it to be less than ten lines not counting documentation and test.  (Less than twenty if
folks actually want 'B'.)
msg159783 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-02 10:39
Yes, I too have encountered this in the process of working on issue 14626. For this purpose it is appropriate to use a special converter (with 'O&'). I suppose it must be a strict сonverter; there is no point in specifying followlinks=[1, 2, 3]. If it is somewhere need a nonstrict ones -- use 'O' and then check the how to in this particular case.

Would be good to have a special letter for this format, but note that 'b' and 'B' are already used. It is better first to use the сonverter, and expand the format only when the enough number of uses.
msg159785 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-02 11:23
> For this purpose it is appropriate to use a special converter
> (with 'O&').

The converter works--but, then, a similar converter would also work
for double, and float, and long long and many others.  If long long
is special enough to merit explicit support in PyArg_ParseTuple,
then bool definitely is, as I strongly suspect bool is used much more
frequently.


> I suppose it must be a strict сonverter; there is no point
> in specifying followlinks=[1, 2, 3].

I don't see a need for that either, but--where would you draw the line?  What is the principle that says "ints are okay, floats are... okay, lists are not"?

Python has a long-established concept of the "truthiness" of an expression.  I *strongly* suggest we stick with that unless we have a *very* good reason why not.


> note that 'b' and 'B' are already used.

I'm a moron!  I must have looked right at it and it didn't register.

't' (truth) and 'f' (false) are also taken.  As is 'c' (conditional).
The only thing I can come up with that's remotely related and
isn't already taken is 'p' for predicate.


> It is better first to use the сonverter, and expand the format
> only when the enough number of uses.

I suspect there are already loads of places in the standard library
that should be using this rather than abusing 'i'.
msg159839 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-03 08:16
My first patch.  Adds 'p' and 'P', along with documentation and unit tests.
msg159844 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-03 10:10
Patch looks good to me.

I however prefer to use 'P'.
msg159916 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2012-05-04 10:22
The patch looks good to me.

Are there any places in the current code base that would use "P"? "p" seems the more useful case.

Are you planning on changing existing code to use P or p, or just use it going forward?
msg159918 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-05-04 10:38
I also think that 'P' looks too strict to be really useful.  I can't think of too many cases where I'd really want to insist on a boolean argument (and reject values of 0 or 1).

Maybe implement just 'p' for now and consider 'P' later?
msg159923 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-04 11:13
I looked through the Python sources and couldn't find any instances of a  function or method with an argument that only allowed you to pass in either True or False.

Serily already said he would use 'P' over 'p', although I too am unconvinced that's a good idea.  Serily: why would you unhesitatingly prefer 'P' to 'p'?


Certainly I see loads of uses for 'p'.  For example, when converting code from Python to C that already relied on Python's standard definition of truthiness.

I did find some spots that took an object and converted to bool with PyObject_IsTrue, like _json.Encoder(allow_nan) and pickle._Pickler(fix_imports).  These too would be well-served by 'p'.

I also found some funny in-between cases.  For example, stat_float_times and the three-argument form of os.symlink both claim to take a boolean but actually take 'i' (integer).  This is relying on bool.__int__().  We certainly couldn't use 'P' here.  We could consider switching these to 'p', though in all likelyhood we'll just leave 'em alone.
msg159926 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-05-04 11:43
I think there should be a test case also where PyObject_IsTrue gives an exception (which I think can happen if __bool__ raises an exception).
msg159927 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-04 11:45
> I think there should be a test case also where PyObject_IsTrue gives an
> exception (which I think can happen if __bool__ raises an exception).

I'd be happy to add such a test, but I don't know of any types like that.  Can anyone suggest one?
msg159928 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-05-04 11:48
>> I think there should be a test case also where PyObject_IsTrue gives an
>> exception (which I think can happen if __bool__ raises an exception).
>
> I'd be happy to add such a test, but I don't know of any types like  
> that.  Can anyone suggest one?

class NotTrue:
     def __bool__(self):
         raise NotImplementedError
msg159929 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-04 12:16
Added test forcing a failure for 'p' (and 'P').  This made me have to handle errors a little differently, so it was definitely good to test it.  Thanks for the suggestion, Martin!

Also changed wording in documentation ever-so-slightly.
msg159930 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2012-05-04 12:28
Now that I think about this some more, I think I'd structure the 'p' tests as:

for expr in [False, None, True, 1, 0]:  # add the rest
   self.assertEqual(bool(expr), getargs_p(expr))

Since the salient point is that 'p' returns the same value as bool(), right?

And for the one that raises an exception, you'll have to check that bool and getargs_p both raise the same type of exception.
msg159931 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-04 12:48
> Serily: why would you unhesitatingly prefer 'P' to 'p'?

My name is Serhiy. :)

'P' has the advantage that you can safely backward-compatibly remove the
restriction by replacing 'P' on 'p'. :)

> I also found some funny in-between cases.

This is historical legacy, some still use 1/0 instead True/False. In the
end, bool subclasses int. But we have no middlecase for latin 'P'.
msg159932 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-04 12:56
> Since the salient point is that 'p' returns the same value as bool(), right?

Yes, and bool_new() is a candidate number one for using new feature.
msg159933 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2012-05-04 13:00
If bool_new() is going to use 'p', then my suggestion shouldn't be the only test of 'p'.
msg159934 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-05-04 13:43
In this line in the patch (Python/getargs.c):

+        if (val == -1 || PyErr_Occurred()) {

Isn't that call to PyErr_Occurred() redundant?
msg160000 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-05 16:20
Attached is rev 3 of my patch, incorporating mainly backing out of dumb ideas.  Thanks for the feedback, Serhiy and Mark!


> My name is Serhiy. :)

My genuine apologies!  In my defense it was rather late.


> 'P' has the advantage that you can safely backward-compatibly
> remove the restriction by replacing 'P' on 'p'. :)

That's not a reason to use 'P'.  Why you should use 'P' in the first place?


> In this line in the patch (Python/getargs.c):
> +        if (val == -1 || PyErr_Occurred()) {
> Isn't that call to PyErr_Occurred() redundant?

Certainly one of the two expressions is!

My thinking was: if the call fails, then the val == -1 will be an early-exit and we can save the call to PyErr_Occurred.  This was always a dumb idea, as the 99.999999% case is that PyObject_IsTrue succeeds, in which case we would have called PyErr_Occured anyway.  Some savings!

I can't find any documentation on permitted return values from nb_bool.  However, PyObject_IsTrue itself says about its own return value:

    /* if it is negative, it should be either -1 or -2 */

So I definitely shouldn't check specifically for -1.

Having meditated on it, I think either I should either just call PyErr_Occured, check for explicit failure (val < 0), or explicit success (val >= 0).  I've opted for the last of those.


I considered briefly trying to make 'P' handle subclasses of bool.  But then I hit the problem of: okay, what now?  Call nb_bool?  I note that bool itself doesn't define nb_bool.  Anyway, what lunatic would subclass bool?


I'm really on the fence about 'P'.  Serhiy is definitely pro-, everyone else seems to think it shouldn't be used.  However nobody has argued against its inclusion.  At the moment I'm -0 on it myself, but since the code is written...

Do we have an anti-champion?
msg160001 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2012-05-05 16:24
Since no one has produced a good example needing "P", I say drop it. At any rate, it can be almost trivially imitated with "O!".
msg160003 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2012-05-05 16:37
Indeed, "because the code is written" is not a good argument if even you yourself are -0.
msg160005 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-05-05 16:45
> Having meditated on it, I think either I should either just call
> PyErr_Occured, check for explicit failure (val < 0), or explicit success
> (val >= 0).  I've opted for the last of those.

Yes, I think that works; it avoids a relatively expensive PyErr_Occurred() call in the non-failure case.  The new code looks fine to me!
msg160006 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-05-05 16:50
> I considered briefly trying to make 'P' handle subclasses of bool.

Not an issue:  bool can't be subclassed. :-)

Python 3.2.3 (default, Apr 13 2012, 00:15:25) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class MyBool(bool):
...     pass
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type 'bool' is not an acceptable base type
msg160007 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-05 17:12
> That's not a reason to use 'P'.  Why you should use 'P' in the first place?

I just guided by the principle "Explicit is better than implicit".
Statements 'if' and 'while' I consider explicit and expect cast to bool.
Implicit cast to bool in the transfer of parameter causes discomfort (no
one (no one Pythonist) expects the implicit cast to str). Unfortunately,
for historical reasons, there is a lot of code, where the parameters are
casted to bool or int is used instead of bool. Therefore, we cannot
simply enter the restrictions.

Well, I was certainly wrong. Don't let me mislead you.

> Anyway, what lunatic would subclass bool?

Class bool cannot be subclassed further.
msg160008 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-05 17:19
Serhiy, I'm having a little trouble with your English.  (But I'm sure your English is far better than my... uh, anything besides English.)  So let me ask a question with a clear yes/no answer:

Do you still think 'P' is better than 'p'?
msg160013 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-05 17:55
> Do you still think 'P' is better than 'p'?

No.

I hope I haven't made a lot of mistakes in the previous sentence.
msg160016 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-05 18:16
> I hope I haven't made a lot of mistakes in the previous sentence.

It depends, do you consider three "a lot"?  ;-)


Attached is a new patch removing 'P'.  (The regrtest is still running but I don't expect any failures.)  I'm guessing I won't get any further feedback.  So unless I hear otherwise I'll check it in tomorrow.
msg160019 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-05-05 19:00
Latest patch looks good to me.
msg160044 - (view) Author: Roundup Robot (python-dev) Date: 2012-05-05 23:55
New changeset bc6d28e726d8 by Larry Hastings in branch 'default':
Issue #14705: Add 'p' format character to PyArg_ParseTuple* for bool support.
http://hg.python.org/cpython/rev/bc6d28e726d8
msg160045 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-05 23:56
Eh, it was ready, why wait?  Thanks everybody for your feedback!
msg160046 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012-05-06 00:15
I would have expected a bool parse code to insist on a boolean, so that:

    int x;
    PyArg_ParseTupleAndKeywords(args, kwds, "p:func", &x);

would behave the same as:

    PyObject  *o;
    int x;
    PyArg_ParseTupleAndKeywords(args, kwds, "O!:func", &PyBool_Type, &o);
    x = o == Py_True;
msg160047 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2012-05-06 00:21
> I would have expected a bool parse code to insist on a boolean,

I originally had a second form ('P') that insisted on a boolean as you suggest.  But nobody could come up with a use case.  So I removed it in the final patch.  Please see this issue for the discussion.

If you have a use case for it I'd be happy to revive it and check it in.
msg160049 - (view) Author: Roundup Robot (python-dev) Date: 2012-05-06 00:40
New changeset 709850f1ec67 by Larry Hastings in branch 'default':
Update Misc/NEWS for issues #14127 and #14705.  (And, technically, #10148.)
http://hg.python.org/cpython/rev/709850f1ec67
msg160056 - (view) Author: Roundup Robot (python-dev) Date: 2012-05-06 04:58
New changeset 05274ab06182 by Larry Hastings in branch 'default':
Update Misc/NEWS for issues #14127 and #14705.  (And, technically, #10148.)
http://hg.python.org/cpython/rev/05274ab06182
msg160127 - (view) Author: Roundup Robot (python-dev) Date: 2012-05-07 09:45
New changeset e4617650f006 by Larry Hastings in branch 'default':
Issue #14705: Added support for the new 'p' format unit to skipitem().
http://hg.python.org/cpython/rev/e4617650f006
History
Date User Action Args
2012-05-07 09:45:44python-devsetmessages: + msg160127
2012-05-06 06:03:22larrysetstatus: open -> closed
2012-05-06 04:58:45python-devsetmessages: + msg160056
2012-05-06 01:09:58larrysetresolution: fixed
stage: patch review -> resolved
2012-05-06 00:40:16python-devsetmessages: + msg160049
2012-05-06 00:21:18larrysetmessages: + msg160047
2012-05-06 00:15:34rhettingersetnosy: + rhettinger
messages: + msg160046
2012-05-05 23:56:29larrysetmessages: + msg160045
2012-05-05 23:55:39python-devsetnosy: + python-dev
messages: + msg160044
2012-05-05 19:00:07mark.dickinsonsetmessages: + msg160019
2012-05-05 18:16:59larrysetfiles: + larry.parse.tuple.p.4.diff

messages: + msg160016
2012-05-05 17:55:22serhiy.storchakasetmessages: + msg160013
2012-05-05 17:19:38larrysetmessages: + msg160008
2012-05-05 17:12:48serhiy.storchakasetmessages: + msg160007
2012-05-05 16:50:50mark.dickinsonsetmessages: + msg160006
2012-05-05 16:45:37mark.dickinsonsetmessages: + msg160005
2012-05-05 16:37:28georg.brandlsetnosy: + georg.brandl
messages: + msg160003
2012-05-05 16:24:58benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg160001
2012-05-05 16:20:39larrysetfiles: + larry.parse.tuple.p.and.P.3.diff

messages: + msg160000
2012-05-04 13:43:18mark.dickinsonsetmessages: + msg159934
2012-05-04 13:00:43eric.smithsetmessages: + msg159933
2012-05-04 12:56:45serhiy.storchakasetmessages: + msg159932
2012-05-04 12:48:56serhiy.storchakasetmessages: + msg159931
2012-05-04 12:28:31eric.smithsetmessages: + msg159930
2012-05-04 12:16:54larrysetfiles: + larry.parse.tuple.p.and.P.2.diff

messages: + msg159929
2012-05-04 11:48:22loewissetmessages: + msg159928
2012-05-04 11:45:53larrysetmessages: + msg159927
2012-05-04 11:43:59loewissetnosy: + loewis
messages: + msg159926
2012-05-04 11:13:58larrysetmessages: + msg159923
2012-05-04 10:38:22mark.dickinsonsetnosy: + mark.dickinson
messages: + msg159918
2012-05-04 10:22:04eric.smithsetnosy: + eric.smith
messages: + msg159916
2012-05-03 10:10:50serhiy.storchakasetmessages: + msg159844
2012-05-03 08:16:19larrysetfiles: + larry.parse.tuple.p.and.P.1.diff
keywords: + patch
messages: + msg159839

stage: patch review
2012-05-02 11:23:19larrysetmessages: + msg159785
2012-05-02 10:39:44serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg159783
2012-05-02 09:29:19larrycreate