Message 416814 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	tfish2
Recipients	tfish2
Date	2022-04-05.20:26:08
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1649190368.71.0.65167976325.issue47234@roundup.psfhosted.org>
In-reply-to

Content
Here is a major general problem with python-static-typing as it is described by PEP-484: The approach described in https://peps.python.org/pep-0484/#the-numeric-tower negatively impacts our ability to reason about the behavior of code with stringency. I would like to clarify one thing in advance: this is a real problem if we subscribe to some of the important ideas that Dijkstra articulated in his classic article "On the role of scientific thought" (e.g.: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD447.html). Specifically, this part: """ Let me try to explain to you, what to my taste is characteristic for all intelligent thinking. It is, that one is willing to study in depth an aspect of one's subject matter in isolation for the sake of its own consistency, all the time knowing that one is occupying oneself only with one of the aspects. We know that a program must be correct and we can study it from that viewpoint only; we also know that it should be efficient and we can study its efficiency on another day, so to speak. In another mood we may ask ourselves whether, and if so: why, the program is desirable. But nothing is gained —on the contrary!— by tackling these various aspects simultaneously. It is what I sometimes have called "the separation of concerns", which, even if not perfectly possible, is yet the only available technique for effective ordering of one's thoughts, that I know of. This is what I mean by "focussing one's attention upon some aspect": it does not mean ignoring the other aspects, it is just doing justice to the fact that from this aspect's point of view, the other is irrelevant. It is being one- and multiple-track minded simultaneously. """ So, "code should be easy to reason about". Now, let us look at this function - I am here (mostly) following the Google Python style guide (https://google.github.io/styleguide/pyguide.html) for now: === Example 1, original form === def middle_mean(xs): """Compute the average of the nonterminal elements of `xs`. Args: `xs`: a list of floating point numbers. Returns: A float, the mean of the elements in `xs[1:-1]`. Raises: ValueError: If `len(xs) < 3`. """ if len(xs) < 3: raise ValueError('Need at least 3 elements to compute middle mean.') return sum(xs[1:-1]) / (len(xs) - 2) ====== Let's not discuss performance, or whether it makes sense to readily generalize this to operate on other sequences than lists, but focus, following Dijkstra, on one specific concern here: Guaranteed properties. Given the function as it is above, I can make statements that are found to be correct when reasoning with mathematical rigor, such as this specific one that we will come back to: === Theorem 1 === If we have an object X that satisfies these properties...: 1. type(X) is list 2. len(X) == 4 3. all(type(x) is float for x in X) ...then we are guaranteed that `middle_mean(X)` evaluates to a value Y which satisfies: - type(Y) is float - Y == (X[1] + X[2]) * 0.5 or math.isnan(Y) === Now, following PEP-484, we would want to re-write our function, adding type annotations. Doing this mechanically would give us: === Example 1, with mechanically added type information === def middle_mean(xs: List[float]) -> float: """Compute the average of the nonterminal elements of `xs`. Args: `xs`: a list of floating point numbers. Returns: A float, the mean of the elements in `xs[1:-1]`. Raises: ValueError: If `len(xs) < 3`. """ if len(xs) < 3: raise ValueError('Need at least 3 elements to compute middle mean.') return sum(xs[1:-1]) / (len(xs) - 2) ====== (We are also deliberately not discussing another question here: given this documentation and type annotation, should the callee be considered to be permitted to mutate the input list?) So, given the above form, we now find that there seems to be quite a bit of redundancy here. After all, we have the type annotation but also repeat some typing information in the docstring. Hence, the obvious proposal here is to re-write the above definition again, obtaining: === Example 1, "cleaned up" === def middle_mean(xs: List[float]) -> float: """Compute the average of the nonterminal elements of `xs`. Args: `xs`: numbers to average, with terminals ignored. Returns: The mean of the elements in `xs[1:-1]`. Raises: ValueError: If `len(xs) < 3`. """ if len(xs) < 3: raise ValueError('Need at least 3 elements to compute middle mean.') return sum(xs[1:-1]) / (len(xs) - 2) ====== But now, what does this change mean for the contract? Part of the "If arguments have these properties, then these are the guarantees" contract now is no longer spelled out by the docstring, but via type annotations. We naturally would expect this to be straightforward, so, we would like to have: === Theorem 1b === If we have an object X that satisfies these properties...: 1. Static type analysis tells us that X has the appropriate type for being passed into middle_mean(). 2. len(X) == 4 ...then we are guaranteed that `middle_mean(X)` evaluates to a value Y which satisfies: - Static type analysis can tell us that this value is a float - Y == (X[1] + X[2]) * 0.5 or math.isnan(Y) === ...but the actual situation is: a) This is not guaranteed. b) PEP-484 does not seem to leave us any option to amend the type annotation so that we can obtain a corresponding guarantee. Specifically, this Python3+ example breaks it - note that PEP-484 states that: "when an argument is annotated as having type float, an argument of type int is acceptable". >>> middle_mean([0.0, 1.25, 10**1000, 0.0]) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 15, in middle_mean OverflowError: int too large to convert to float So, even the "this function call evaluates to..." is violated here - it does not evaluate to anything, but raises an exception instead. One option to address this would be to stick with the code as in [Example 1, with mechanically added type information], but I would argue that it is highly confusing that `float` in the static type annotation and `float` in the docstring then refer to two very different concepts. This is certainly not desirable (and we should not expect Python practitioners to do it that way)! Now, while this may superficially look like a weird edge case, that is actually not so - the problem is that such mismatches in some situations "allow us to drive in a wedge" and create conditions where there is a deviation between actual and expected program behavior. This is a common starting point for writing exploits, as for example these classic articles nicely illustrate: "Smashing The Stack For Fun And Profit" (aleph1@underground.org, Phrack #49) (E.g.: https://inst.eecs.berkeley.edu/~cs161/fa08/papers/stack_smashing.pdf) "Delivering Signals for Fun and Profit" (Michal Zalewski) (E.g.: https://lcamtuf.coredump.cx/signals.txt) CAPEC-71: Using Unicode Encoding to Bypass Validation Logic https://capec.mitre.org/data/definitions/71.html So, clearly, we want to not only maintain the ability to reason about the behavior of Python code with mathematical stringency, but also make this reasonably simple and straightforward. I would also argue that, given the popularity of Python as a CS education language, it matters that professors have a viable substrate for teaching the skill to stringently reason about the behavior of code.

Here is a major general problem with python-static-typing as it is
described by PEP-484: The approach described in
https://peps.python.org/pep-0484/#the-numeric-tower negatively impacts
our ability to reason about the behavior of code with stringency.

I would like to clarify one thing in advance: this is a real problem
if we subscribe to some of the important ideas that Dijkstra
articulated in his classic article "On the role of scientific thought"
(e.g.: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD447.html).
Specifically, this part:

"""
Let me try to explain to you, what to my taste is characteristic
for all intelligent thinking. It is, that one is willing to study in
depth an aspect of one's subject matter in isolation for the sake of
its own consistency, all the time knowing that one is occupying
oneself only with one of the aspects. We know that a program must be
correct and we can study it from that viewpoint only; we also know
that it should be efficient and we can study its efficiency on another
day, so to speak. In another mood we may ask ourselves whether, and if
so: why, the program is desirable. But nothing is gained —on the
contrary!— by tackling these various aspects simultaneously. It is
what I sometimes have called "the separation of concerns", which, even
if not perfectly possible, is yet the only available technique for
effective ordering of one's thoughts, that I know of. This is what I
mean by "focussing one's attention upon some aspect": it does not mean
ignoring the other aspects, it is just doing justice to the fact that
from this aspect's point of view, the other is irrelevant. It is being
one- and multiple-track minded simultaneously.
"""

So, "code should be easy to reason about".

Now, let us look at this function - I am here (mostly) following the
Google Python style guide (https://google.github.io/styleguide/pyguide.html) for now:

=== Example 1, original form ===

def middle_mean(xs):
  """Compute the average of the nonterminal elements of `xs`.

  Args:
    `xs`: a list of floating point numbers.

  Returns:
    A float, the mean of the elements in `xs[1:-1]`.

  Raises:
    ValueError: If `len(xs) < 3`.
  """
  if len(xs) < 3:
    raise ValueError('Need at least 3 elements to compute middle mean.')
  return sum(xs[1:-1]) / (len(xs) - 2)
======

Let's not discuss performance, or whether it makes sense to readily
generalize this to operate on other sequences than lists, but focus,
following Dijkstra, on one specific concern here: Guaranteed properties.

Given the function as it is above, I can make statements that are
found to be correct when reasoning with mathematical rigor, such as
this specific one that we will come back to:

=== Theorem 1 ===
  If we have an object X that satisfies these properties...:

  1. type(X) is list
  2. len(X) == 4
  3. all(type(x) is float for x in X)

  ...then we are guaranteed that `middle_mean(X)` evaluates to a value Y
  which satisfies:

  - type(Y) is float
  - Y == (X[1] + X[2]) * 0.5 or math.isnan(Y)
===

Now, following PEP-484, we would want to re-write our function, adding type annotations.
Doing this mechanically would give us:

=== Example 1, with mechanically added type information ===
def middle_mean(xs: List[float]) -> float:
  """Compute the average of the nonterminal elements of `xs`.

  Args:
    `xs`: a list of floating point numbers.

  Returns:
    A float, the mean of the elements in `xs[1:-1]`.

  Raises:
    ValueError: If `len(xs) < 3`.
  """
  if len(xs) < 3:
    raise ValueError('Need at least 3 elements to compute middle mean.')
  return sum(xs[1:-1]) / (len(xs) - 2)
======

(We are also deliberately not discussing another question here: given
this documentation and type annotation, should the callee be
considered to be permitted to mutate the input list?)


So, given the above form, we now find that there seems to be quite a
bit of redundancy here. After all, we have the type annotation but
also repeat some typing information in the docstring. Hence, the
obvious proposal here is to re-write the above definition again, obtaining:

=== Example 1, "cleaned up" ===
def middle_mean(xs: List[float]) -> float:
  """Compute the average of the nonterminal elements of `xs`.

  Args:
    `xs`: numbers to average, with terminals ignored.

  Returns:
    The mean of the elements in `xs[1:-1]`.

  Raises:
    ValueError: If `len(xs) < 3`.
  """
  if len(xs) < 3:
    raise ValueError('Need at least 3 elements to compute middle mean.')
  return sum(xs[1:-1]) / (len(xs) - 2)
======

But now, what does this change mean for the contract? Part of the "If
arguments have these properties, then these are the guarantees"
contract now is no longer spelled out by the docstring, but via type
annotations. We naturally would expect this to be straightforward,
so, we would like to have:

=== Theorem 1b ===
  If we have an object X that satisfies these properties...:

  1. Static type analysis tells us that X has the appropriate type for
     being passed into middle_mean().
  2. len(X) == 4

  ...then we are guaranteed that `middle_mean(X)` evaluates to a value Y
  which satisfies:

  - Static type analysis can tell us that this value is a float
  - Y == (X[1] + X[2]) * 0.5 or math.isnan(Y)
===

...but the actual situation is:

  a) This is not guaranteed.
  b) PEP-484 does not seem to leave us any option to amend
     the type annotation so that we can obtain a corresponding guarantee.

Specifically, this Python3+ example breaks it - note that PEP-484
states that: "when an argument is annotated as having type float, an
argument of type int is acceptable".

 >>> middle_mean([0.0, 1.25, 10**1000, 0.0])
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "<stdin>", line 15, in middle_mean
 OverflowError: int too large to convert to float

So, even the "this function call evaluates to..." is violated here -
it does not evaluate to anything, but raises an exception instead.

One option to address this would be to stick with the code as in
[Example 1, with mechanically added type information], but I would
argue that it is highly confusing that `float` in the static type
annotation and `float` in the docstring then refer to two very
different concepts. This is certainly not desirable (and we should not
expect Python practitioners to do it that way)!


Now, while this may superficially look like a weird edge case, that is
actually not so - the problem is that such mismatches in some
situations "allow us to drive in a wedge" and create conditions where
there is a deviation between actual and expected program
behavior. This is a common starting point for writing exploits, as for
example these classic articles nicely illustrate:

"Smashing The Stack For Fun And Profit" (aleph1@underground.org, Phrack #49)
(E.g.: https://inst.eecs.berkeley.edu/~cs161/fa08/papers/stack_smashing.pdf)

"Delivering Signals for Fun and Profit" (Michal Zalewski)
(E.g.: https://lcamtuf.coredump.cx/signals.txt)

CAPEC-71: Using Unicode Encoding to Bypass Validation Logic
https://capec.mitre.org/data/definitions/71.html

So, clearly, we want to not only maintain the ability to reason about
the behavior of Python code with mathematical stringency, but also
make this reasonably simple and straightforward. I would also argue
that, given the popularity of Python as a CS education language, it
matters that professors have a viable substrate for teaching the skill
to stringently reason about the behavior of code.

History
Date	User	Action	Args
2022-04-05 20:26:08	tfish2	set	recipients: + tfish2
2022-04-05 20:26:08	tfish2	set	messageid: <1649190368.71.0.65167976325.issue47234@roundup.psfhosted.org>
2022-04-05 20:26:08	tfish2	link	issue47234 messages
2022-04-05 20:26:08	tfish2	create