
classification
Title: Document that CPython accepts "invalid" identifiers
Type: Stage: patch review
Components: Documentation Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Windson Yang, andrei.avk, chris.jerdonek, docs@python, nedbat, orlnub123, pablogsal, rhettinger, roysmith, serhiy.storchaka, steven.daprano, terry.reedy, xtreak
Priority: normal Keywords: patch

Created on 2018-10-29 12:24 by vstinner, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 11263 open Windson Yang, 2018-12-20 13:54
Messages (20)
msg328816 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-10-29 12:24
The Python 3 language has a strict definition for an identifier:
https://docs.python.org/dev/reference/lexical_analysis.html#identifiers

... but in practice, it's possible to create "invalid" identifiers using setattr().

Example with PyPy:

$ pypy
Python 2.7.13 (0e7ea4fe15e82d5124e805e2e4a37cae1a402d4b, Apr 12 2018, 14:50:12)
>>>> class A: pass
>>>> 
>>>> a=A()
>>>> setattr(a, "1", 2)
>>>> getattr(a, "1")
2

>>>> vars(a)
{'1': 2}
>>>> a.__dict__
{'1': 2}

>>>> a.1
  File "<stdin>", line 1
    a.1
    ^
SyntaxError: invalid syntax


The exact definition of "identifiers" is a common question. Recent examples:

* bpo-25205
* [Python-Dev] Arbitrary non-identifier string keys when using **kwargs
  https://mail.python.org/pipermail/python-dev/2018-October/155435.html

It would be nice to document the answer. Maybe in the Language Specification, maybe in the setattr() documentation, maybe in a FAQ, maybe everywhere?
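
For reference, a minimal sketch of the **kwargs case raised in the Python-Dev thread, assuming CPython 3. The helper name `collect` is hypothetical and exists only for this illustration. Unpacking a dict accepts string keys that are not identifiers, while non-string keys are rejected:

```python
def collect(**kwargs):
    """Return the keyword arguments unchanged (hypothetical helper)."""
    return kwargs


# String keys that are not valid identifiers are accepted via ** unpacking.
result = collect(**{"1": 2, "a b": 3})

# Non-string keys are rejected with TypeError.
try:
    collect(**{1: 2})
except TypeError as exc:
    non_string_error = exc
else:
    non_string_error = None
```

The same keys could never be written as `collect(1=2)` in source code, since that form goes through the parser.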
msg329034 - (view) Author: (orlnub123) * Date: 2018-11-01 02:06
I'd argue that it's an implementation detail. Documenting it might be nice as some projects such as pytest do use it but I don't think it would make sense in setattr() or getattr() since all they do (at least in this case) is assign/retrieve from the __dict__. One thing to note is that __slots__ doesn't accept them.
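
A small sketch of the __slots__ difference mentioned above, assuming CPython 3 (the class names `Plain` and `Slotted` are hypothetical). An instance with a regular __dict__ takes the name "1", but the same name is refused when the class body declares it as a slot:

```python
class Plain:
    pass


p = Plain()
setattr(p, "1", 2)  # stored in p.__dict__

# Declaring a slot whose name is not an identifier fails at class creation.
try:
    class Slotted:
        __slots__ = ("1",)
except TypeError as exc:
    slots_error = exc
else:
    slots_error = None
```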
msg329038 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-01 02:42
> I'd argue that it's an implementation detail. Documenting it might be nice as some projects such as pytest do use it but I don't think it would make sense in setattr() or getattr() since all they do (at least in this case) is assign/retrieve from the __dict__. One thing to note is that __slots__ doesn't accept them.

Maybe it's obvious to you, but the question is asked again and again, at least once per year. So it seems like we need to document it somewhere.
msg329039 - (view) Author: Windson Yang (Windson Yang) * Date: 2018-11-01 02:45
I agree we should document it; it's not obvious to me at least.
msg329040 - (view) Author: (orlnub123) * Date: 2018-11-01 03:19
The customizing attribute access section of the data model might be a suitable place.
msg329161 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-11-02 20:22
It is an implementation detail that some people need to know, and that is very unlikely to change.  In the pydev thread, Guido said
"
My feeling is that limiting it to strings is fine, but checking those
strings for resembling identifiers is pointless and wasteful."

We occasionally document such things in a 'CPython implementation detail' note.  I don't know the proper markup for these.  At present, I think the note should be in setattr and **kwargs docs.
msg329169 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-02 22:51
> I don't know the proper markup for these.

It's ".. impl-detail::". See for example:
https://docs.python.org/dev/library/codecs.html#standard-encodings
msg329208 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2018-11-03 21:31
This seems like a confusion of two things: identifiers are lexical elements of the language.  Attributes are not limited to identifiers.

We could add to the docs for setattr: "The attribute name does not have to be a valid identifier."   I don't know what the language guarantees about what strings are valid as attribute names.
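
To illustrate the distinction between lexical identifiers and attribute names, here is a sketch assuming CPython 3 (the class `A` and the sample names are hypothetical). `str.isidentifier()` and `keyword.iskeyword()` describe what the parser accepts after a dot; setattr() does not consult either:

```python
import keyword


class A:
    pass


a = A()
names = ["1", "a b", "class"]

# setattr() stores all three names, including the keyword "class".
for name in names:
    setattr(a, name, name.upper())

# Only identifiers that are not keywords can be written after a dot in source.
usable_with_dot = [
    n for n in names if n.isidentifier() and not keyword.iskeyword(n)
]

# The dotted form is rejected by the parser before any attribute lookup happens.
try:
    compile("a.1", "<demo>", "eval")
except SyntaxError as exc:
    dot_error = exc
else:
    dot_error = None
```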
msg329209 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2018-11-03 21:49
> In the pydev thread, Guido said "My feeling is that limiting it to strings is fine, but checking those strings for resembling identifiers is pointless and wasteful."

But in a later message, after additional discussion, he acknowledged there could be reasons to change and said, "we needn't rush this."

So if the docs do describe the current implementation, I think it should warn people that this behavior might not be subject to the same backwards compatibility guarantees as other documented behavior.
msg329215 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-11-04 02:20
Documenting something as an 'implementation detail' denies that it is a language feature and does not offer stability guarantees.
msg329289 - (view) Author: (orlnub123) * Date: 2018-11-05 10:31
I take back my previous suggestion, I agree that documenting it in setattr() (and **kwargs) is the way to go. It's obvious that you can assign anything to the __dict__, since it represents a dict, but setattr() is more ambiguous.
'Anything' was the key word for me here. For example you can assign ints to __dict__ and it won't complain but try to do the same with setattr()/getattr() and it results in an error.
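
A sketch of that difference, assuming CPython 3 (the class `A` is hypothetical). The instance dictionary takes an int key, while setattr() and getattr() both reject a non-string name:

```python
class A:
    pass


a = A()
a.__dict__[1] = "set via __dict__"  # the dict accepts any hashable key

# Both builtins insist on a string name and raise TypeError otherwise.
errors = {}
for label, call in (
    ("setattr", lambda: setattr(a, 1, "x")),
    ("getattr", lambda: getattr(a, 1)),
):
    try:
        call()
    except TypeError as exc:
        errors[label] = exc
```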
msg329496 - (view) Author: Windson Yang (Windson Yang) * Date: 2018-11-09 05:00
I will try to create a PR for it. Should we add a 'CPython implementation detail' note to the documentation? I ask because this happens in CPython as well as in PyPy. Also, where should we add the text? I see two choices.

* https://docs.python.org/3/reference/datamodel.html#object.__setattr__
* https://docs.python.org/3/library/functions.html#setattr
msg331404 - (view) Author: Windson Yang (Windson Yang) * Date: 2018-12-09 02:01
Any ideas? Or I will create a PR in a week without 'CPython implementation detail'
msg331405 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2018-12-09 02:18
> Any ideas? Or I will create a PR in a week without 'CPython implementation detail'

I don't think we want to give any stability guarantees for this. Perhaps 
we should explicitly state that this is not guaranteed behaviour and may 
change in the future.

I would be happy for it to be stated as a CPython implementation 
detail. If PyPy or any other implementation happen to duplicate it, 
we're not responsible for documenting that fact.

Please go ahead and make a PR.
msg332418 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-12-24 07:42
I don't think we can mark this as an implementation detail for setattr(). The details are downstream and determined by the target object, not by setattr() itself.

Suggested wording:

'''
Note, setattr() attempts to update the object with the given attr/value pair.
Whether this succeeds and what its effect is are determined by the target object.
If an object's class defines `__slots__`, the attribute may not be writeable.
If an object's class defines a property with a setter method, the *setattr()*
call will trigger the setter method, which may or may not actually write the attribute.
For objects that have a regular dictionary (which is the typical case), the
*setattr()* call can make any string-keyed update allowed by the dictionary,
including keys that aren't valid identifiers -- for example setattr(a, '1', 'one')
will be the equivalent of vars(a)['1'] = 'one'.
This issue has little to do with setattr() and is more related to the fact that instance dictionaries can hold any valid key. In a way, it is no different than a user writing a.__dict__['1'] = 'one'. That has always been allowed and the __dict__ attribute is documented as writeable, so a user is also allowed to write `a.__dict__ = {'1': 'one'}`.
'''

In short, we can talk about this in the setattr() docs but it isn't really a setattr() issue. Also, the behavior is effectively guaranteed by the other things users are allowed to do, so there is no merit in marking this as an implementation detail. Non-identifier keys can make it into an instance dictionary via multiple paths that are guaranteed to work.
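
The three cases in the suggested wording can be sketched as follows, assuming CPython 3. All class names here are hypothetical, and the clamping setter is only an example of a setter that does not store what it is given:

```python
class WithProperty:
    def __init__(self):
        self._value = 0

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        # The setter decides what is stored; here it clamps to non-negative.
        self._value = max(0, new)


class Slotted:
    __slots__ = ("x",)


class Plain:
    pass


w = WithProperty()
setattr(w, "value", -5)  # routed through the property setter

s = Slotted()
try:
    setattr(s, "y", 1)  # no slot named "y" and no instance __dict__
except AttributeError as exc:
    slot_error = exc
else:
    slot_error = None

p = Plain()
setattr(p, "1", "one")  # lands in the instance dictionary
```

In each case setattr() itself only checks that the name is a string; the outcome depends on the target object.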
msg332420 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-12-24 08:02
FWIW, the only restriction added by setattr() is that *name* must be a string.
msg333033 - (view) Author: Windson Yang (Windson Yang) * Date: 2019-01-05 02:23
I agree with @Raymond Hettinger. I will update the PR based on his suggestion if there are no other ideas in the next week.
msg372885 - (view) Author: Roy Smith (roysmith) Date: 2020-07-02 21:26
Just as another edge case, type() can do the same thing:

Foo = type("Foo", (object,), {"a b": 1})
f = Foo()

for example, will create a class attribute named "a b".  Maybe this actually calls setattr() under the covers, but if it's going to be documented, it should be noted in both places.
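
Extending the snippet above into a self-contained check, assuming CPython 3 (the variable names are hypothetical). The name "a b" ends up in the class namespace and is reachable through getattr() on both the class and its instances:

```python
Foo = type("Foo", (object,), {"a b": 1})
f = Foo()

class_value = getattr(Foo, "a b")
instance_value = getattr(f, "a b")  # found on the class through normal lookup
in_class_dict = "a b" in vars(Foo)
```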
msg399549 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-08-13 17:47
It seems like the documentation is lacking and perhaps misleading in regard to attributes.

- anything hashable can be used as a key in an obj.__dict__: an int, a tuple, etc. According to __dict__ docs, all of those are attributes. There's no warning that this isn't recommended. Various tooling may break, for example dir() will break if keys are not comparable. inspect.getmembers() will break because it relies on dir(). Probably other tooling will break in some way -- these are just the first two things I tried.

Is performance the only reason for allowing that (obviously that's a strong enough reason in this case)? Or can this be useful in some other corner cases?

- setattr() requires a string but a string can be '1 2', '(1,2)', etc. The docs for setattr strongly imply that it's the same as dotted notation, but it's not. The non-identifier string attrs don't break dir() or anything in inspect module AFAICT. Should the docs discourage this usage? Should they suggest some use cases where this is useful?

In addition to just being confusing, I think this can create an impression for users that setattr() allows you to set 'private' or 'hidden' attrs, and setting attrs via __dict__ allows you to set even more 'private', 'top secret' attrs.

Since attributes are such a core concept in Python, it might be good to have a section that lays out all of these corner cases and reasons for them, so that it can be linked from docs for setattr(), __dict__, dir(), inspect.getmembers(), etc.
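
A sketch of the dir() behaviour described above, assuming CPython 3 (the class `A` is hypothetical). A non-identifier string name is listed normally, while a non-string key placed directly in __dict__ makes dir() fail when it sorts the names:

```python
class A:
    pass


a = A()
setattr(a, "1 2", "string key")
string_key_listed = "1 2" in dir(a)  # strings sort fine alongside other names

a.__dict__[(1, 2)] = "tuple key"  # bypasses the string check in setattr()
try:
    dir(a)  # sorting a tuple against str names is not supported
except TypeError as exc:
    dir_error = exc
else:
    dir_error = None
```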
msg399566 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-08-13 22:18
In the last message I've said that according to __dict__ docs, anything in __dict__ is an attribute of respective obj. That's a bit too-strongly worded, the docs can be understood in the sense that anything that ends up in __dict__ via other mechanisms, such as dotted notation or setattr(), is an attribute.

Since direct manipulation of __dict__ is not prohibited, and no limits are set, AFAIK, on keys that can be used for __dict__, the more natural reading of the docs is that anything that can be directly set in __dict__ is also an attribute.

The only thing that would make a user doubt this reading is if he or she finds that getattr() cannot get non-string attrs, and going by its name, user would assume you can get any valid attrs using getattr().
History
Date User Action Args
2022-04-11 14:59:07 admin set github: 79286
2021-08-13 22:18:37 andrei.avk set messages: + msg399566
2021-08-13 17:47:17 andrei.avk set nosy: + andrei.avk
    messages: + msg399549
2020-07-03 09:34:58 vstinner set nosy: - vstinner
2020-07-02 21:26:59 roysmith set nosy: + roysmith
    messages: + msg372885
2019-01-05 02:23:10 Windson Yang set messages: + msg333033
2018-12-24 08:02:34 rhettinger set messages: + msg332420
2018-12-24 07:42:11 rhettinger set nosy: + rhettinger
    messages: + msg332418
2018-12-20 13:54:16 Windson Yang set keywords: + patch
    stage: patch review
    pull_requests: + pull_request10494
2018-12-09 02:18:53 steven.daprano set messages: + msg331405
2018-12-09 02:01:23 Windson Yang set messages: + msg331404
2018-11-09 05:00:15 Windson Yang set messages: + msg329496
2018-11-05 10:31:59 orlnub123 set messages: + msg329289
2018-11-04 02:20:13 terry.reedy set messages: + msg329215
2018-11-03 21:49:15 chris.jerdonek set nosy: + chris.jerdonek
    messages: + msg329209
2018-11-03 21:31:25 nedbat set nosy: + nedbat
    messages: + msg329208
2018-11-02 22:51:57 vstinner set messages: + msg329169
2018-11-02 20:22:01 terry.reedy set nosy: + terry.reedy
    messages: + msg329161
2018-11-01 03:19:54 orlnub123 set messages: + msg329040
2018-11-01 02:45:14 Windson Yang set nosy: + Windson Yang
    messages: + msg329039
2018-11-01 02:42:41 vstinner set messages: + msg329038
2018-11-01 02:06:58 orlnub123 set nosy: + orlnub123
    messages: + msg329034
2018-10-31 20:22:17 pablogsal set nosy: + pablogsal
2018-10-29 12:34:16 xtreak set nosy: + xtreak
2018-10-29 12:24:51 vstinner create