
classification
Title: random.shuffle loses most of the elements
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.9
process
Status: closed Resolution: third party
Dependencies: Superseder:
Assigned To: Nosy List: eric.smith, rhettinger, rowan.bradley, scoder, tim.peters
Priority: normal Keywords:

Created on 2021-03-24 18:30 by rowan.bradley, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (7)
msg389482 - (view) Author: Rowan Sylvester-Bradley (rowan.bradley) Date: 2021-03-24 18:30
This issue is probably related to issue ??? but I have created it as a separate issue. When shuffle doesn't crash, it sometimes (or maybe always; I haven't fully analysed this yet) loses most of the elements in the list that it is supposed to be shuffling. Here is an extract of the code that I'm using:

import io
from io import StringIO 
from lxml import etree 
import random  

filename_xml = 'MockExam5.xml'
with io.open(filename_xml, mode="r", encoding="utf-8") as xml_file:
    xml_to_check = xml_file.read()
doc = etree.parse(StringIO(xml_to_check))
exams = doc.getroot()
questions_element = exams.find("questions")
logmsg(L_TRACE, "There are now " + str(len(questions_element.findall("question"))) + " questions")
logmsg(L_TRACE, "Randomising order of questions in this exam")
random.shuffle(questions_element)
logmsg(L_TRACE, "Finished randomise")
logmsg(L_TRACE, "There are now " + str(len(questions_element.findall("question"))) + " questions")

And here is the log produced by this code:

21-03-24 18:10:11.989 line:  2057 file: D:\XPS_8700 Extended Files\Users\RowanB\Documents\My_Scripts NEW\mockexam\put_exam.py 2 There are now 79 questions
21-03-24 18:10:11.991 line:  2065 file: D:\XPS_8700 Extended Files\Users\RowanB\Documents\My_Scripts NEW\mockexam\put_exam.py 2 Randomising order of questions in this exam
21-03-24 18:10:11.992 line:  2067 file: D:\XPS_8700 Extended Files\Users\RowanB\Documents\My_Scripts NEW\mockexam\put_exam.py 2 Finished randomise
21-03-24 18:10:11.993 line:  2068 file: D:\XPS_8700 Extended Files\Users\RowanB\Documents\My_Scripts NEW\mockexam\put_exam.py 2 There are now 6 questions

How come the shuffle starts off with 79 elements, and finishes with 6?

Thanks - Rowan
msg389483 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-03-24 18:44
Same advice as issue 43616: please provide an example we can run.

This is most likely a problem with how you're using lxml, and not a bug in Python. But since we can't test it, we can't know for sure.
msg389485 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2021-03-24 19:30
Are you sure it's "a list"? At least print out `type(questions_element)`. `random.shuffle()` doesn't contain any code _capable_ of changing a list's length. It only does indexed accessing of the list:

...
    for i in reversed(range(1, len(x))):
        # pick an element in x[:i+1] with which to exchange x[i]
        j = randbelow(i + 1)
        x[i], x[j] = x[j], x[i]

That's about all there is to it. Note that, for this purpose, it doesn't matter what `randbelow()` does, because that function never even sees `x`.
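To make the point above concrete, here is a minimal sketch of the same loop applied to an ordinary Python list. The function name `shuffle_in_place` is made up for illustration, and `random.randrange` stands in for the internal `randbelow()` helper; this is not the actual `random.shuffle()` source. Because the loop only swaps `x[i]` and `x[j]`, a real list keeps its length and its contents.

```python
import random


def shuffle_in_place(x):
    """Illustrative Fisher-Yates shuffle; the name is hypothetical."""
    for i in reversed(range(1, len(x))):
        # Pick an index in 0..i and exchange the two positions.
        j = random.randrange(i + 1)  # stand-in for randbelow(i + 1)
        x[i], x[j] = x[j], x[i]


items = list(range(79))
shuffle_in_place(items)

# Only the order changes; no element is added or lost.
print(len(items))
print(sorted(items) == list(range(79)))
```

So if elements disappear, the container's own item assignment must be doing more than a plain overwrite.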
msg389509 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-03-25 17:30
The standard library isn't at fault here.  Please file this as an LXML bug.

Reproducer:

    from lxml.etree import Element

    root = Element('outer')
    root.append(Element('zero'))
    root.append(Element('one'))
    root.append(Element('two'))
    print([e.tag for e in root])
    root[1], root[0] = root[0], root[1]
    print([e.tag for e in root])

This outputs:

   ['zero', 'one', 'two']
   ['one', 'two']

Replacing the import with:

   from xml.etree.ElementTree import Element

Gives the expected result:

   ['zero', 'one', 'two']
   ['one', 'zero', 'two']
msg389510 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-03-25 17:45
Interestingly, this isn't an LXML bug.  It is a documented difference from how the standard library works:

    https://lxml.de/tutorial.html#elements-are-lists

So, if you want to use random.shuffle(), you need the standard library ElementTree instead of lxml.

This:

    from lxml.etree import Element

    root = Element('outer')
    root.append(Element('zero'))
    root.append(Element('one'))
    root.append(Element('two'))
    root[0] = root[1]
    print([e.tag for e in root])

Produces:

   ['one', 'two']
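
For readers without lxml installed, the effect can be imitated with a toy container. The class `MovingChildren` below is purely hypothetical and is not how lxml is implemented. It only assumes the rule described in the tutorial linked above: an item can sit in one position at a time, so assigning it to a slot moves it there instead of copying it.

```python
class MovingChildren:
    """Toy model (hypothetical): each item can occupy only one position."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __iter__(self):
        return iter(self._items)

    def __setitem__(self, index, value):
        target = self._items[index]
        if value == target:
            return
        # "Move" semantics: detach the value from wherever it currently sits.
        if value in self._items:
            self._items.remove(value)
        # Then put it in the slot the target occupied, dropping the target.
        self._items[self._items.index(target)] = value


plain = ["zero", "one", "two"]
plain[1], plain[0] = plain[0], plain[1]
print(plain)

moving = MovingChildren(["zero", "one", "two"])
moving[1], moving[0] = moving[0], moving[1]
print(list(moving))
```

Under this assumed rule, the same tuple swap that reorders a plain list leaves the toy container one item short, which is the pattern in the reproducer output above.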
msg389518 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2021-03-25 18:45
Yes, this is neither a bug in CPython (or its stdlib) nor in lxml. It's how things work. Don't use these two together.
msg389530 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-03-25 21:22
As an immediate fix to your problem, replace this line:

    random.shuffle(questions_element)

with:

    questions = list(questions_element)
    random.shuffle(questions)
    questions_element[:] = questions
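
A self-contained sketch of that copy, shuffle, and assign-back pattern follows. It uses the standard library's `xml.etree.ElementTree` so that it runs without third-party packages. The 79 `question` elements and their `id` attributes are made up for illustration, and the behaviour with lxml is as described in the message above, not verified here.

```python
import random
import xml.etree.ElementTree as ET

# Build a stand-in for the parsed document (hypothetical data).
questions_element = ET.Element("questions")
for n in range(79):
    ET.SubElement(questions_element, "question", id=str(n))

# Shuffle a plain list copy, then write the new order back in one step.
questions = list(questions_element)
random.shuffle(questions)
questions_element[:] = questions

ids = sorted(int(q.get("id")) for q in questions_element)
print(len(questions_element))
print(ids == list(range(79)))
```

The shuffle itself only ever touches the plain list, so the swaps behave as ordinary list swaps regardless of how the element container handles item assignment.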
History
Date                 User           Action  Args
2022-04-11 14:59:43  admin          set     github: 87784
2021-03-25 21:22:24  rhettinger     set     messages: + msg389530
2021-03-25 18:45:21  scoder         set     status: open -> closed; resolution: third party; messages: + msg389518; stage: resolved
2021-03-25 17:45:04  rhettinger     set     messages: + msg389510
2021-03-25 17:30:54  rhettinger     set     nosy: + scoder, - skrah; messages: + msg389509
2021-03-24 22:14:18  rhettinger     set     nosy: + skrah
2021-03-24 22:12:54  rhettinger     set     nosy: + rhettinger
2021-03-24 19:30:00  tim.peters     set     nosy: + tim.peters; messages: + msg389485
2021-03-24 18:44:11  eric.smith     set     nosy: + eric.smith; messages: + msg389483
2021-03-24 18:34:01  iritkatriel    set     components: + Library (Lib)
2021-03-24 18:30:46  rowan.bradley  create