classification
Title: Addition of getattr_static for inspect module
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.2
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: michael.foord Nosy List: Trundle, benjamin.peterson, michael.foord, ncoghlan, pitrou
Priority: normal Keywords: needs review, patch

Created on 2010-09-01 13:21 by michael.foord, last changed 2010-11-20 15:09 by michael.foord. This issue is now closed.

Files
File name Uploaded Description
static.py michael.foord, 2010-11-04 12:20
test_static.py michael.foord, 2010-11-04 12:21
getattr_static.patch michael.foord, 2010-11-20 14:37
Messages (11)
msg115298 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2010-09-01 13:23
Tests require Python 3. Implementation works with Python 2 as well.
msg115299 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-01 13:27
+1 on the principle. This could help things like pydoc, especially on modern Web frameworks which do incredibly ugly things (per-thread global variables, descriptors executing tons of code etc.).
msg115300 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2010-09-01 13:30
(Reposted as text was entirely duplicated - oops.)

As discussed on python-dev, this is a version of getattr that does static lookups, bypassing the descriptor protocol, __getattr__, and __getattribute__. Initial implementation by Nick Coghlan, amended and tests added by me.

Phillip Eby objects to this code existing at all as it doesn't play well with proxy objects.

The purpose of getattr_static is for "passive introspection" without triggering code execution (as hasattr and getattr both do). Use cases include debugging and fetching docstrings from objects.
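
A minimal sketch of the difference, using the function under the name it was committed with (`inspect.getattr_static`, Python 3.2+). The `Lazy` class and the `calls` list are hypothetical names made up for illustration; they are not part of the patch.

```python
import inspect

calls = []  # records every time the property body runs


class Lazy:
    @property
    def value(self):
        """Expensive value."""
        calls.append("ran")  # side effect we want to avoid while introspecting
        return 42


obj = Lazy()

# Static lookup: returns the property object itself, no code on Lazy runs.
static = inspect.getattr_static(obj, "value")
calls_after_static = len(calls)

# Dynamic lookup: invokes the property, so the body executes.
dynamic = getattr(obj, "value")
calls_after_dynamic = len(calls)

# The docstring is reachable without ever calling the property.
doc = static.__doc__
```

This is the "fetching docstrings" use case: the property object carries the docstring, so nothing on the instance has to run to read it.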

Caveats with the current implementation are:

Cases that will break `getattr_static`, all pathological enough not
to worry about (i.e. if you do any of these then you deserve to
have everything break anyway):

* `__dict__` existing (e.g. as a property) but not returning a
  dictionary
* classes created with `__slots__` that then have the `__slots__`
  member deleted from the class (or otherwise monkeyed with)

Cases handled incorrectly:

1. where a descriptor with a `__set__` method is shadowed by an
   instance member we return the instance member in preference to
   the descriptor, unlike `getattr`
2. types implemented in C may have neither `__dict__` nor `__slots__`;
   in this case we will be unable to find instance members and return
   the attribute descriptor instead
3. classes that inherit from a class with `__slots__` (whether or not
   they use `__slots__` themselves) will return the slot descriptor
   for instance members 'owned' by a slot on a base class
4. objects that lie about being a type by having __class__ as a
   descriptor (we traverse the mro of whatever type `obj.__class__`
   returns instead of the real type)
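
For reference, the `getattr` precedence rule that item 1 deviates from can be sketched as follows. `DataDescriptor` and `Holder` are hypothetical names; the block only shows what plain `getattr` does and makes no claim about the patch under review.

```python
class DataDescriptor:
    """Defines both __get__ and __set__, so it is a data descriptor."""

    def __get__(self, obj, owner=None):
        return "from descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class Holder:
    attr = DataDescriptor()


h = Holder()
# Write straight into the instance dict, bypassing __set__.
h.__dict__["attr"] = "from instance"

# getattr gives a data descriptor on the type priority over the instance dict.
via_getattr = getattr(h, "attr")
in_instance_dict = vars(h)["attr"]
```

A static lookup that checks the instance dict first would report the instance member here, which is the mismatch item 1 describes.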

1 could be fixed but the code would be annoying. Is it worth fixing?

2 could be detected: where fetching an attribute from an instance
fails but an attribute descriptor is found on the type, we could try
its __get__ method. Worth it?

3 could be detected: if we find a slot descriptor on a type, we could
try its __get__ method. Worth it?

4 could be fixed by using type everywhere instead of __class__.
We also couldn't use isinstance, since it consults __class__. If an object
is lying about __class__ then it obviously *intends* us to look
at the 'faked' version. However, this breaks the 'no code
execution' purpose of getattr_static and is inconsistent with
the rest of our behaviour. Worth fixing?
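
A small sketch of what "lying about `__class__`" looks like, and why `isinstance` is affected. `Fake` and `Liar` are hypothetical names used only for illustration.

```python
class Fake:
    pass


class Liar:
    # __class__ as a descriptor: reading obj.__class__ executes this code.
    @property
    def __class__(self):
        return Fake


obj = Liar()

reported = obj.__class__   # what the object claims to be
real = type(obj)           # the actual type; no user code runs for this

# isinstance falls back to obj.__class__ when type(obj) does not match,
# so the faked class passes the check as well.
passes_isinstance = isinstance(obj, Fake)
```

Traversing `type(obj).__mro__` avoids running the property; traversing `obj.__class__.__mro__` honours the object's claim but executes its code.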

Fixing *all* of these (or any) will significantly complicate the
implementation.

Fetching an uninitialized instance member from an instance of a class
with __slots__ returns the slot descriptor rather than raising an
AttributeError as the descriptor does. As the slot descriptor is
a Python implementation detail perhaps we are better off propagating
the exception here. (?) On the other hand, the descriptor is
available on the class and the job of this function is to fetch
members when they are available...
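
The uninitialized-slot behaviour can be seen with the committed `inspect.getattr_static`. This is a sketch that assumes CPython, where slot descriptors are instances of `types.MemberDescriptorType`; `Point` is a hypothetical class.

```python
import inspect
import types


class Point:
    __slots__ = ("x",)


p = Point()  # slot "x" is never assigned

# Dynamic lookup: the slot descriptor raises AttributeError for an empty slot.
try:
    getattr(p, "x")
    dynamic_raised = False
except AttributeError:
    dynamic_raised = True

# Static lookup: the slot descriptor found on the class is returned instead.
static = inspect.getattr_static(p, "x")
is_slot_descriptor = isinstance(static, types.MemberDescriptorType)
```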

I'm not aware of any other caveats / potential pitfalls. Please
point them out to me. :-)
msg115308 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2010-09-01 14:54
Just jumping in here with commentary from the side bench... I noticed you say "this does not always return the same results as dir(x)". But since dir(x) exists, perhaps it would make sense to match dir(x) as closely as possible? I.e. if dir(x) doesn't know about it, don't return it, if dir(x) does know about it, do return it? Those would be easy rules to remember for sure.
msg115310 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2010-09-01 15:06
Since the addition of __dir__, dir(obj) can return arbitrary values. Typically (I guess) this will be used to add dynamically created attributes that this function will fail to find - so it is *more* likely that we will fail to find something in dir than the reverse.

__dir__ could also be used to filter non-public members that getattr(...) would find. I would find it odd that getattr finds a member that exists but this function fails. I think this function is more akin to getattr than dir.

Perhaps a better warning would be that this function may fail to find members that getattr finds?
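
A sketch of that case: an attribute that exists only through `__getattr__` and is advertised by `__dir__`. `Dynamic` and the `missing` sentinel are hypothetical names; the static lookup uses the committed `inspect.getattr_static` with its optional default argument.

```python
import inspect


class Dynamic:
    def __getattr__(self, name):
        # Only called when normal lookup fails; creates "virtual" on the fly.
        if name == "virtual":
            return "computed"
        raise AttributeError(name)

    def __dir__(self):
        # Advertise the dynamically created attribute.
        return ["virtual"]


d = Dynamic()

found_by_getattr = getattr(d, "virtual")
listed_by_dir = "virtual" in dir(d)

# Nothing named "virtual" is stored on the instance or any class in the
# MRO, so the static lookup falls back to the default.
missing = object()
found_static = inspect.getattr_static(d, "virtual", missing)
```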
msg115311 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2010-09-01 15:10
(Or vice versa - getattr_static may succeed in finding members - like descriptors that raise AttributeError when fetched - when getattr fails.)
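
The reverse case, sketched with a hypothetical descriptor `Raiser` that always raises AttributeError when fetched:

```python
import inspect


class Raiser:
    def __get__(self, obj, owner=None):
        raise AttributeError("no underlying value")


class C:
    attr = Raiser()


c = C()

# Dynamic lookup fails because the descriptor raises.
visible_to_hasattr = hasattr(c, "attr")

# Static lookup never calls __get__, so it finds the descriptor object.
static = inspect.getattr_static(c, "attr")
```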
msg115314 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2010-09-01 15:17
> Since the addition of __dir__, dir(obj) can return arbitrary values. Typically (I guess) this will be used to add dynamically created attributes that this function will fail to find - so it is *more* likely that we will fail to find something in dir than the reverse.
>
> __dir__ could also be  used to filter non-public members that getattr(...) would find. I would find it odd that getattr finds a member that exists but this function fails. I think this function is more akin to getattr than dir.

Gotcha.

> Perhaps a better warning would be that this function may fail to find members that getattr finds?

Ah, yes, and vice versa (well, just yesterday I wrote a descriptor
that always raises AttributeError :-).
msg120366 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2010-11-04 00:57
Updated implementation that handles instances with inherited __slots__ members and attributes from C descriptors correctly.

I think this is both "good enough" and useful enough to add to inspect. (The remaining constraints are rare or pathological.)
msg120396 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2010-11-04 12:20
Further updated implementation. Now handles data descriptors correctly but removes the code that resolves the builtin descriptors (calling __get__ on slot and attribute descriptors).

As it was resolving some descriptors but not all, and resolving getset descriptors could still trigger execution in C extensions, Benjamin felt it was more consistent and cleaner to return descriptor objects rather than resolving them. As a bonus it makes the code shorter too.

I would add to the documentation some example code showing how to handle the descriptors if the user wants to resolve them herself. (Example code shown in the tests.)
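
One way such user-side resolution might look; this is a sketch, not the example from the tests or the documentation. `resolve_static` and `Slotted` are hypothetical names, the helper is meant for instances rather than classes, and the choice of descriptor types assumes CPython (`types.MemberDescriptorType` for slots, `types.GetSetDescriptorType` for C getsets).

```python
import inspect
import types

# Assumption: these are the builtin descriptor types worth resolving.
BUILTIN_DESCRIPTORS = (types.MemberDescriptorType, types.GetSetDescriptorType)


def resolve_static(obj, name):
    """Static lookup, then resolve builtin descriptors by hand."""
    result = inspect.getattr_static(obj, name)
    if isinstance(result, BUILTIN_DESCRIPTORS):
        try:
            return result.__get__(obj, type(obj))
        except AttributeError:
            # No underlying value (e.g. an unset slot): keep the descriptor.
            return result
    return result


class Slotted:
    __slots__ = ("filled", "empty")

    def __init__(self):
        self.filled = 5


s = Slotted()
filled = resolve_static(s, "filled")
empty = resolve_static(s, "empty")
```

As noted above, resolving getset descriptors may run code in C extensions, so this trades away part of the "no code execution" guarantee.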

The only remaining cases that are handled incorrectly are pathological ones. (See the notes in the tests.)
msg121650 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2010-11-20 14:13
Reworked as a patch, including documentation and NEWS update.
msg121661 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2010-11-20 15:09
Committed revision 86566.
History
Date User Action Args
2010-11-20 15:09:13 michael.foord set status: open -> closed
    resolution: accepted
    messages: + msg121661
    stage: resolved
2010-11-20 14:48:32 ncoghlan set versions: + Python 3.2, - Python 2.5
2010-11-20 14:37:27 michael.foord set files: + getattr_static.patch
    versions: + Python 2.5, - Python 3.2
2010-11-20 14:37:01 michael.foord set files: - getattr_static.patch
2010-11-20 14:13:44 michael.foord set files: + getattr_static.patch
    keywords: + patch
    messages: + msg121650
2010-11-04 12:21:14 michael.foord set files: + test_static.py
2010-11-04 12:20:56 michael.foord set files: + static.py
    messages: + msg120396
2010-11-04 12:15:34 michael.foord set files: - test_static.py
2010-11-04 12:15:31 michael.foord set files: - static.py
2010-11-04 00:57:43 michael.foord set files: + test_static.py
2010-11-04 00:57:17 michael.foord set files: + static.py
    messages: + msg120366
2010-11-04 00:55:01 michael.foord set files: - test_static.py
2010-11-04 00:54:43 michael.foord set files: - static.py
2010-09-04 17:16:03 gvanrossum set nosy: - gvanrossum
2010-09-04 13:16:01 Trundle set nosy: + Trundle
2010-09-01 15:17:52 gvanrossum set messages: + msg115314
2010-09-01 15:10:25 michael.foord set messages: + msg115311
2010-09-01 15:06:02 michael.foord set messages: + msg115310
2010-09-01 14:54:53 gvanrossum set nosy: + gvanrossum
    messages: + msg115308
2010-09-01 13:30:45 michael.foord set messages: + msg115300
2010-09-01 13:30:19 michael.foord set messages: - msg115296
2010-09-01 13:27:51 pitrou set nosy: + pitrou
    messages: + msg115299
2010-09-01 13:23:05 michael.foord set files: + test_static.py
    messages: + msg115298
2010-09-01 13:22:48 michael.foord set messages: - msg115297
2010-09-01 13:22:37 michael.foord set files: - test_static.py
2010-09-01 13:21:55 michael.foord set files: + test_static.py
    messages: + msg115297
2010-09-01 13:21:08 michael.foord create