Title: setattr accepts invalid identifiers
Type: behavior Stage: resolved
Components: Documentation, Interpreter Core Versions: Python 3.6, Python 3.4, Python 3.5, Python 2.7
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: docs@python Nosy List: W deW, docs@python, eryksun, martin.panter, r.david.murray, vstinner
Priority: low Keywords:

Created on 2015-09-21 18:59 by W deW, last changed 2018-10-29 12:25 by vstinner. This issue is now closed.

Messages (12)
msg251248 - (view) Author: W deW (W deW) * Date: 2015-09-21 18:59
An identifier is defined by 

identifier ::=  (letter|"_") (letter | digit | "_")*

setattr accepts identifiers that do not meet this criterion:

Python 2.7.8 (default, Jun 30 2014, 16:08:48) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class C(object): pass
>>> o=C()
>>> setattr(o, "$$$", True)
>>> dir(o)
['$$$', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__']
>>> o.$$$
  File "<stdin>", line 1
SyntaxError: invalid syntax
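
For readers on Python 3, the session above can be sketched roughly as follows. This is only an illustration: the class name C mirrors the report, str.isidentifier() exists on Python 3 only (not on 2.7 str), and compile() is used here merely to show that the dotted form is rejected by the parser.

```python
class C(object):
    pass

o = C()
setattr(o, "$$$", True)                      # accepted: the name is just a dict key

name_is_identifier = "$$$".isidentifier()    # "$$$" does not match the identifier grammar
value = getattr(o, "$$$")                    # the value round-trips through getattr()
stored_in_dict = "$$$" in vars(o)            # the name lives in o.__dict__

# Dotted access fails in the parser, before any attribute lookup happens.
try:
    compile("o.$$$", "<example>", "eval")
    dotted_access_compiles = True
except SyntaxError:
    dotted_access_compiles = False
```

So the attribute exists and is reachable, just not through the `o.name` syntax.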
msg251257 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-21 20:46
Even if it's not well documented, it's legitimate and supported by CPython, but it may not be supported by other Python implementations (e.g. PyPy).
msg251263 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2015-09-21 21:41
The name can be any str/unicode string, including language keywords: 

    >>> setattr(o, 'def', 'allowed')
    >>> getattr(o, 'def')
    'allowed'
    >>> o.def
      File "<stdin>", line 1
    SyntaxError: invalid syntax

and even an empty string:

    >>> setattr(o, '', 'mu')
    >>> getattr(o, '')
    'mu'

This includes instances of str and unicode subclasses, at least in CPython.
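
A small Python 3 sketch of the three cases above (keyword, empty string, str subclass), tried on CPython only. MyStr is a hypothetical subclass introduced for illustration; other implementations may behave differently.

```python
import keyword


class C(object):
    pass


class MyStr(str):
    """Hypothetical str subclass, used only for this illustration."""


o = C()
setattr(o, "def", "allowed")     # a language keyword
setattr(o, "", "mu")             # the empty string
setattr(o, MyStr("sub"), 1)      # an instance of a str subclass

results = {
    "def": getattr(o, "def"),
    "": getattr(o, ""),
    "sub": getattr(o, "sub"),    # a plain str finds the MyStr key (same hash, equal)
}
def_is_keyword = keyword.iskeyword("def")
```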
msg251265 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2015-09-21 21:51
To clarify: when using a unicode name in 2.x, it has to be encodable using the default encoding, sys.getdefaultencoding(). Normally this is ASCII.
msg251275 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-09-21 22:49
Previous report: Issue 14029
msg251334 - (view) Author: W deW (W deW) * Date: 2015-09-22 17:38
Thanks for the ref to issue14029. I think I see how it works. As long as the object's __dict__ accepts the attribute name as a key, it need not be a valid identifier, nor even a string at all. The latter *is* checked for, though, and that check in turn can be circumvented by adding the attribute to the __dict__ directly. An object can even be made an attribute of itself.

However, the documentation falls short here. So far, I haven't found where it defines "attribute". Is there any point in defining an attribute that cannot be addressed as an attribute if the parser doesn't allow it?

It seems to me that in order to catch programming errors early, the default behaviour should include checking that the attribute's name is syntactically valid.
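
To make the two points above concrete, here is a minimal Python 3 sketch. It is not a proposed change to the interpreter: Strict is a hypothetical opt-in class that validates names in its own __setattr__, and the last lines show how writing to __dict__ directly sidesteps both that validation and the built-in string check (on CPython).

```python
import keyword


class Strict(object):
    """Hypothetical class that only accepts valid, non-keyword identifiers."""

    def __setattr__(self, name, value):
        if not name.isidentifier() or keyword.iskeyword(name):
            raise AttributeError("invalid attribute name: %r" % (name,))
        object.__setattr__(self, name, value)


s = Strict()
s.ok = 1                          # valid identifier: accepted

try:
    setattr(s, "$$$", True)
    rejected = False
except AttributeError:
    rejected = True               # the custom check refuses the name

# Direct __dict__ access bypasses __setattr__ entirely.
s.__dict__[42] = "non-string key"
s.__dict__["self"] = s            # the object as an attribute of itself
```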
msg251337 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2015-09-22 18:08
This is a documentation issue and not specific to a particular version of Python. What's the rule on version tagging in this case?
msg251345 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-09-22 21:32
Eryksun: If you mean tagging the bug report, I think we usually tag all the applicable open branches that are due for another release.

I’m not sure anything needs to be documented regarding setattr(). At best it is an implementation detail that should not be relied on, although making the implementation stricter could be a compatibility problem.

There are other places where troublesome names are allowed. One that caught my eye recently: os.sendfile(in=..., ...) is a syntax error, but you can still pass the “in” keyword via os.sendfile(**{"in": ...}, ...).
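
The same effect can be sketched without os.sendfile (which is platform-specific). Here send() is a hypothetical stand-in that simply returns the keyword arguments it receives.

```python
def send(**options):
    """Hypothetical stand-in for a function that takes an "in" keyword."""
    return options


# Writing send(in=3) is rejected by the parser ...
try:
    compile("send(in=3)", "<example>", "eval")
    literal_form_compiles = True
except SyntaxError:
    literal_form_compiles = False

# ... but unpacking a dict passes the same keyword without involving the parser.
received = send(**{"in": 3, "out": 4})
```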
msg251437 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-09-23 18:10
I wouldn't call the sendfile case troublesome.  'in' is a keyword, so if you want to use it in function arguments, you have to pass it as a string.  Perfectly logical :)

IIRC pypy uses an optimized dictionary if there are no non-identifier keywords in the attribute __dict__.  I *think* it supports non-identifiers by falling back to a slower implementation, but I could be wrong.  I seem to remember a discussion where it was ruled that the fact that CPython's default __dict__ accepts non-identifiers is a CPython implementation detail and code should not rely on it working...but of course some code does, so we can't "fix" it :).

If I'm remembering right, and if __dict__'s permissiveness is not noted as a CPython implementation detail in the language reference, it should be, but I would expect that it is, since that discussion was one of the ones that triggered the introduction of such documentation notes.

But, as MvL pointed out, setattr does *not* have this restriction, even if the Python implementation rejects it for default __dict__s, because an object can do anything it wants in its __setattr__ method, and this is an important (and used in the wild!) feature of the language.
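
As a rough illustration of that last point, the sketch below uses a hypothetical Headers class whose __setattr__ stores values in its own mapping under a normalised key, so the name passed to setattr() never has to be an identifier and never reaches the instance __dict__ as given.

```python
class Headers(object):
    """Hypothetical class: attribute names are case-insensitive header names."""

    def __init__(self):
        # Bypass our own __setattr__ to create the backing store.
        object.__setattr__(self, "_store", {})

    def __setattr__(self, name, value):
        self._store[name.lower()] = value

    def __getattr__(self, name):
        # Only called when normal lookup fails, so "_store" itself is found first.
        try:
            return self._store[name.lower()]
        except KeyError:
            raise AttributeError(name)


h = Headers()
setattr(h, "Content-Type", "text/plain")
content_type = getattr(h, "content-type")
```

Whether such names work therefore depends on the object, not on the default __dict__ of the implementation.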
msg251438 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2015-09-23 18:48
Even MvL appears to slip up when he states "some may accept non-strings as attribute names". That would be pointless in a __setattr__ method since setattr / PyObject_SetAttr reject non-string values as 'names'.
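
A quick Python 3 sketch of that check, as observed on CPython. Spy is a hypothetical class that records every name its __setattr__ receives; calling the method directly is shown only to contrast it with the setattr() built-in.

```python
class Spy(object):
    """Hypothetical class that records the names passed to __setattr__."""

    calls = []

    def __setattr__(self, name, value):
        Spy.calls.append(name)


spy = Spy()

# setattr() rejects the non-string name before __setattr__ is ever invoked.
try:
    setattr(spy, 42, "value")
    error = None
except TypeError as exc:
    error = exc
calls_after_setattr = list(Spy.calls)

# Calling the method directly skips the built-in's type check.
spy.__setattr__(42, "value")
calls_after_direct_call = list(Spy.calls)
```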
msg251447 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-09-23 20:27
I did however make the same mistake without checking the docs or the behavior.  But the fact that I didn't look at it doesn't make the current documentation wrong :)

What change is it that you think would be beneficial?
msg328817 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-10-29 12:25
I created bpo-35105 to propose to document the behavior.
History
Date                 User              Action  Args
2018-10-29 12:25:09  vstinner          set     messages: + msg328817
2018-10-29 06:54:40  serhiy.storchaka  set     status: pending -> closed; resolution: not a bug; stage: resolved
2018-04-22 11:00:54  serhiy.storchaka  set     status: open -> pending
2015-09-23 20:27:03  r.david.murray    set     messages: + msg251447
2015-09-23 18:48:56  eryksun           set     messages: + msg251438
2015-09-23 18:10:41  r.david.murray    set     nosy: + r.david.murray; messages: + msg251437
2015-09-22 21:32:48  martin.panter     set     messages: + msg251345; versions: + Python 3.4, Python 3.5, Python 3.6
2015-09-22 18:08:44  eryksun           set     messages: + msg251337; components: + Documentation
2015-09-22 17:38:53  W deW             set     messages: + msg251334; components: - Documentation; versions: + Python 2.7
2015-09-21 22:49:59  martin.panter     set     nosy: + martin.panter; messages: + msg251275
2015-09-21 21:51:15  eryksun           set     messages: + msg251265
2015-09-21 21:41:18  eryksun           set     priority: normal -> low; assignee: docs@python; components: + Documentation; versions: - Python 2.7; nosy: + docs@python, eryksun; messages: + msg251263
2015-09-21 20:46:44  vstinner          set     nosy: + vstinner; messages: + msg251257
2015-09-21 18:59:20  W deW             create