classification
Title: Unexpected behavior with * and arrays
Type: behavior Stage: resolved
Components: Versions: Python 3.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: mark.dickinson, nanthil, r.david.murray, steven.daprano
Priority: normal Keywords:

Created on 2018-05-24 14:12 by nanthil, last changed 2018-05-24 17:54 by steven.daprano. This issue is now closed.

Messages (7)
msg317572 - (view) Author: nathan rogers (nanthil) Date: 2018-05-24 14:12
https://repl.it/repls/ColorfulFlusteredPercent

Here you can see the unexpected behavior I was speaking of. This behavior is NOT useful compared to the expected behavior. If I reference position 0 in the array, I expect position 0 to be appended. The sensible behavior, from my view, would be to make n unique values, not n duplicates.
msg317580 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2018-05-24 14:38
This is not a bug, it is the documented behaviour: the * operator does not copy the lists, it duplicates references to the same list. There's even a FAQ for it:

https://docs.python.org/3/faq/programming.html#how-do-i-create-a-multidimensional-list
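
A minimal sketch of the behaviour described above, followed by the list-comprehension approach that the FAQ recommends. The variable names `shared` and `independent` are chosen only for illustration:

```python
# Replicating a list that holds a mutable object copies only the reference.
shared = [[]] * 3
shared[0].append("x")
print(shared)                   # [['x'], ['x'], ['x']]
print(shared[0] is shared[1])   # True: every slot refers to the same inner list

# Building each inner list separately yields independent objects.
independent = [[] for _ in range(3)]
independent[0].append("x")
print(independent)                       # [['x'], [], []]
print(independent[0] is independent[1])  # False
```

The comprehension evaluates `[]` once per iteration, so each slot gets its own list object.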
msg317581 - (view) Author: nathan rogers (nanthil) Date: 2018-05-24 14:49
Can anyone give me a legitimate answer as to why this would be expected behavior? When at any point would you ever need that? 

If the list is local, you already have the thing. If it isn't local, you can pass it to a function by reference. So then, why would you ever need N references to the same thing?

Are you going to run out? 

Are your functions buying tickets to the reference of my thing show, and you're afraid those tickets will run out?

What is this?
msg317583 - (view) Author: nathan rogers (nanthil) Date: 2018-05-24 15:05
[[], [], [], [], []] 

How is it expected behavior  in python, that

when I update position 0, 

it decides to update positions 1-infinity as well?

That is nonsense, and there is not a use case for this behavior. If you have already created the value, you have the value locally, and don't need N-REFERENCES to that thing. When calling functions as well, there will never be a time when you need more than 1 reference to the thing. 

How is this useful, and in what context could this ever be intuitive? If this is not a bug, it countermands the zen of python on almost every alternate line.
msg317590 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2018-05-24 17:25
@nanthil: If you want to discuss the reasons behind this design decision further, I'd suggest asking on one of the mailing lists, e.g. https://mail.python.org/mailman/listinfo/python-list

This is not the right forum for this discussion. Please don't re-open this issue.
msg317593 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2018-05-24 17:42
I wrote up a response before Mark closed the issue, so despite his excellent no-discussion suggestion I'm going to post it for the edification of anyone reading the issue later rather than waste the work :)

Nathan: this is *long* established behavior of Python.  It is baked into the language.  Even if we thought it was a good idea to change it, that cannot be done for backward compatibility reasons.

As for why it works the way it does, consider the following (potentially useful!) cases:

  > [1, 2] * 5
  [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
  > [(1, 2)] * 5
  [(1, 2), (1, 2), (1, 2), (1, 2), (1, 2)]

What Python is doing is filling the created list with *copies of the pointers to the listed values*, which is much more sensible than creating, say, multiple copies in memory of the integers 1 and 2.  That is, you are observing a specific, non-intuitive and rarely useful result of a general rule that *is* useful and intuitive.  Also note that *even if* we wanted to try to make exceptions for "mutable" versus "non-mutable" elements when doing this replication, we can't, because there's a difference between 'copy' and 'deepcopy' in Python, and Python refuses to guess.  So, if you want copies of a list, *you* have to make the correct kind of copies for your application.  Python just copies the pointers like it does for every other type of object multiplied into a list.
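
To illustrate why the interpreter cannot pick a copy strategy on the caller's behalf, here is a small sketch using the standard `copy` module. The nested `template` list and the other variable names are assumptions made up for the example:

```python
import copy

template = [[1, 2], [3, 4]]

# Deep copies: the outer list and every inner list are duplicated.
deep = [copy.deepcopy(template) for _ in range(3)]
deep[0][0].append("inner")
print(deep[1])    # [[1, 2], [3, 4]]
print(template)   # [[1, 2], [3, 4]]

# Shallow copies: each outer list is new, but the inner lists are still shared.
shallow = [copy.copy(template) for _ in range(3)]
shallow[0].append("outer")      # changes only shallow[0]
shallow[0][0].append("inner")   # visible through every shallow copy and template
print(shallow[1])   # [[1, 2, 'inner'], [3, 4]]
print(template)     # [[1, 2, 'inner'], [3, 4]]
```

Both results are legitimate "copies"; which one is correct depends on the application, so the choice is left to the programmer.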

By the way, when a core dev closes an issue, the convention is that you can present an argument for it being reopened, but you don't reopen it yourself.  (No way for you to know that that is our convention, which is why I'm telling you :)  

But as should be clear by now, this is a closed issue and further discussion here would be counter-productive for all of us.
msg317596 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2018-05-24 17:54
Nathan, the bug tracker is not the place to debate Python behaviour. For 
the purposes of the bug tracker, all we need say is that it is 
documented behaviour and not a bug. If you want to change that 
behaviour, there is a process to follow, and asking snarky questions on 
the tracker isn't part of it.

The principle of having multiple references to the same object is 
fundamental to Python, and very often useful. It's how objects are 
passed to functions, it is used for many forms of shared data. Your 
description of object sharing as "nonsense" and having no use-case is 
way off the mark.
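
As a rough illustration of shared references being useful, the sketch below passes a list to a function that mutates it in place. The function `record` and the names `events` and `alias` are hypothetical, invented for this example:

```python
def record(event, log):
    """Append to the caller's list; passing `log` does not copy it."""
    log.append(event)


events = []
record("start", events)
record("stop", events)
print(events)            # ['start', 'stop']

# Assignment also creates a second reference, not a second list.
alias = events
alias.append("done")
print(events is alias)   # True
print(events)            # ['start', 'stop', 'done']
```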

But if it makes you feel better, the SPECIFIC example you ran into:

    [[]]*5  # makes 5 references to the same [] object

is rarely directly useful itself. It is certainly a "gotcha" that most 
Python programmers will stumble against at one time or another. But the 
behaviour follows from some fundamental designs of the language.

Copying objects is expensive, and often unnecessary. The Python 
interpreter does not automatically make copies of objects. The 
list.__mul__ method cannot know whether you require shallow copies, or 
deep copies, and for the majority of use-cases for list replication, 
copying would be unnecessary. So the * operator simply duplicates 
references. If you want copies, you have to copy the objects yourself.
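
A short sketch of the common case where duplicating references is harmless: when the elements are immutable, item assignment rebinds one slot and leaves the others alone. The names `row` and `pairs` are illustrative only:

```python
row = [0] * 5
print(all(item is row[0] for item in row))  # True: five references to one int object

row[0] = 99      # rebinds slot 0 to a different object; other slots are untouched
print(row)       # [99, 0, 0, 0, 0]

pairs = [(1, 2)] * 3
pairs[0] = (9, 9)
print(pairs)     # [(9, 9), (1, 2), (1, 2)]
```

No copy is needed here because nothing can modify the shared `0` or `(1, 2)` objects in place.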
History
Date User Action Args
2018-05-24 17:54:17 steven.daprano set messages: + msg317596
2018-05-24 17:42:07 r.david.murray set nosy: + r.david.murray; messages: + msg317593
2018-05-24 17:25:26 mark.dickinson set status: open -> closed; messages: + msg317590; nosy: + mark.dickinson
2018-05-24 15:05:43 nanthil set status: closed -> open; messages: + msg317583
2018-05-24 14:49:37 nanthil set messages: + msg317581
2018-05-24 14:38:38 steven.daprano set status: open -> closed; nosy: + steven.daprano; messages: + msg317580; resolution: not a bug; stage: resolved
2018-05-24 14:12:24 nanthil create