Issue40682
Created on 2020-05-19 14:32 by mrled, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Messages (3) | |||
---|---|---|---|
msg369356 - (view) | Author: Micah R Ledbetter (mrled) | Date: 2020-05-19 14:32 | |
When using the random.Random class, using the .seed() method with version=1 does not always reproduce the same results as the .seed() method did in Python 2. From the docs, I expected that it would, but on closer inspection, I can't tell whether I made a bad assumption or whether there is a bug in the module.

The docs state an intention of compatibility with older versions of Python: https://docs.python.org/3.9/library/random.html#notes-on-reproducibility

> Most of the random module’s algorithms and seeding functions are subject to change across Python versions, but two aspects are guaranteed not to change:
>
> If a new seeding method is added, then a backward compatible seeder will be offered.
>
> The generator’s random() method will continue to produce the same sequence when the compatible seeder is given the same seed.

It's not clear from the docstring in the code whether this is intended to cover Python 2.7 behavior: https://github.com/python/cpython/blob/3.9/Lib/random.py#L134

> For version 2 (the default), all of the bits are used if *a* is a str,
> bytes, or bytearray. For version 1 (provided for reproducing random
> sequences from older versions of Python), the algorithm for str and
> bytes generates a narrower range of seeds.

But the results I've spot checked sometimes do match the Python 2 results, and sometimes are the Python 2 result +1.

I wrote a Python script that calls the .seed() method with version=1 under Python 3, and without a version= argument under Python 2. It uses a wordlist I happen to have in /usr/share/dict that I copied to $PWD.
#!/usr/bin/env python
import os, random, sys
mydir = os.path.dirname(os.path.abspath(__file__))
r = random.Random()
maxidx = None
with open('{}/web2'.format(mydir)) as webdict:
    for idx, raw_word in enumerate(webdict.readlines()):
        word = raw_word.strip()
        if sys.version_info[0] == 2:
            r.seed(word)
        elif sys.version_info[0] == 3:
            r.seed(word, version=1)
        else:
            raise Exception("Unexpected python version")
        print("{}: {}".format(word, r.randrange(0, 65535, 1)))
        if maxidx != None and idx >= maxidx:
            break

I also wrote a shell script to run my Python script with the Python versions I happen to have installed locally, along with Python 2.7 and 3.4-3.9 in the ci-image Docker container linked from the Python download page.

#!/bin/sh
set -eux
mkdir -p results
/usr/bin/python test.py > results/macos10.15.4.system.python2.7.16
/Library/Frameworks/Python.framework/Versions/3.8/bin/python3 test.py > results/macos10.15.4.system.python3.8.2
docker run -v $PWD:/testpy:rw -u root -it --rm quay.io/python-devs/ci-image sh -c 'python3.9 /testpy/test.py > /testpy/results/ci-image.python3.9'
docker run -v $PWD:/testpy:rw -u root -it --rm quay.io/python-devs/ci-image sh -c 'python3.8 /testpy/test.py > /testpy/results/ci-image.python3.8'
docker run -v $PWD:/testpy:rw -u root -it --rm quay.io/python-devs/ci-image sh -c 'python3.7 /testpy/test.py > /testpy/results/ci-image.python3.7'
docker run -v $PWD:/testpy:rw -u root -it --rm quay.io/python-devs/ci-image sh -c 'python3.6 /testpy/test.py > /testpy/results/ci-image.python3.6'
docker run -v $PWD:/testpy:rw -u root -it --rm quay.io/python-devs/ci-image sh -c 'python3.5 /testpy/test.py > /testpy/results/ci-image.python3.5'
docker run -v $PWD:/testpy:rw -u root -it --rm quay.io/python-devs/ci-image sh -c 'python2.7 /testpy/test.py > /testpy/results/ci-image.python2.7'

I've made a github repo that contains both scripts and the results: https://github.com/mrled/random-Random-seed-version-testing

I ran the script on my Mac, which means I used the system installed Python binaries that came with macOS x86_64, but the ci-image Python versions are running under an x86_64 Linux virtual machine (because of how Docker for Mac works).

To summarize the results:

* The Python 2.7 on my Mac works the same as the Python 2.7 on the ci-image
* The Python 3.8 on my Mac works the same as Pythons 3.5-3.9 on the ci-image
* Python 3.4 is different from both (although it is now unsupported anyway)

A sample of the results. I haven't programmatically analyzed them, but from my spot checks, they all appear to be like this:

> head results.ci-image.python2.7 | > head results.ci-image.python3.9
A: 8866 | A: 8867
a: 56458 | a: 56459
aa: 29724 | aa: 29724
aal: 11248 | aal: 11248
aalii: 16623 | aalii: 16623
aam: 62302 | aam: 62303
Aani: 31381 | Aani: 31381
aardvark: 6397 | aardvark: 6397
aardwolf: 32525 | aardwolf: 32526
Aaron: 32019 | Aaron: 32019 |
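The spot check described above could also be done programmatically. The following is a minimal sketch, assuming each results file holds one `word: value` line per word in the same order, as in the sample. The function name `tally_offsets` and the inline sample lists are illustrative and are not part of the linked repository.

```python
from collections import Counter


def tally_offsets(py2_lines, py3_lines):
    """Count how often each (Python 3 value - Python 2 value) difference occurs.

    Both arguments are iterables of "word: value" lines in the same order;
    that layout is an assumption about the results files.
    """
    offsets = Counter()
    for old, new in zip(py2_lines, py3_lines):
        old_word, _, old_value = old.strip().rpartition(": ")
        new_word, _, new_value = new.strip().rpartition(": ")
        if old_word != new_word:
            raise ValueError(
                "files out of step: {!r} vs {!r}".format(old_word, new_word)
            )
        offsets[int(new_value) - int(old_value)] += 1
    return offsets


# The ten sample rows quoted in the report.
py27_sample = [
    "A: 8866", "a: 56458", "aa: 29724", "aal: 11248", "aalii: 16623",
    "aam: 62302", "Aani: 31381", "aardvark: 6397", "aardwolf: 32525",
    "Aaron: 32019",
]
py39_sample = [
    "A: 8867", "a: 56459", "aa: 29724", "aal: 11248", "aalii: 16623",
    "aam: 62303", "Aani: 31381", "aardvark: 6397", "aardwolf: 32526",
    "Aaron: 32019",
]

sample_offsets = tally_offsets(py27_sample, py39_sample)
print(sorted(sample_offsets.items()))
```

For the full files, two open file objects can be passed in place of the lists. A tally containing only the keys 0 and 1 would support the "same or +1" observation across the whole wordlist.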
|||
msg369394 - (view) | Author: Steven D'Aprano (steven.daprano) * | Date: 2020-05-19 21:15 | |
3.5 and 3.6 are now only accepting security fixes.

Only the stability of random.random is guaranteed across versions, but you are calling randrange: https://docs.python.org/3/library/random.html#notes-on-reproducibility

So I am pretty sure that this will not be considered a bug (unless it is a design bug).

Personally I think that the lack of reproducibility of the full range of random methods is a rather large annoyance: if you care about reproducibility, including doctests, you cannot use anything in the module except random.random, but have to write your own implementation (possibly by copying and pasting). I don't have a good solution for this though. |
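One way to "write your own implementation" is to derive the integer from random() directly, so the result depends only on the two guaranteed pieces (the compatible seeder and random()). The following is a minimal sketch. The name `stable_randrange` is made up for illustration, and the multiply-and-truncate step reflects my understanding of what Python 2.7 did for small ranges, which is an assumption rather than something confirmed in this thread. It is only intended for small ranges and is slightly non-uniform.

```python
import random


def stable_randrange(rng, start, stop):
    """Return an int in [start, stop) using only rng.random().

    Hypothetical helper: it relies on nothing but random(), so its output
    should follow the documented reproducibility guarantee.
    """
    width = stop - start
    if width <= 0:
        raise ValueError("empty range")
    return start + int(rng.random() * width)


rng = random.Random()

# Seed with the backward compatible seeder, then draw once.
rng.seed("superman123", version=1)
first = stable_randrange(rng, 0, 65535)

# Re-seeding with the same value reproduces the same draw.
rng.seed("superman123", version=1)
again = stable_randrange(rng, 0, 65535)

print(first, again)
```

Running the reporter's wordlist through a helper like this and comparing against the Python 2.7 results files would show whether the assumption about the old algorithm holds.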
|||
msg369409 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2020-05-20 00:28 | |
The parts that are supposed to be stable are the seeding and the output of calls to random(). The sessions shown below show that this is working as intended.

The downstream algorithms such as randrange() are not protected by the reproducibility guarantees. While we try not to change them unnecessarily, they are allowed to change and to generate different sequences.

At some point in Python 3's history, we changed randrange() so that it often gives different results than before. The reason for the change is that the old algorithm wasn't as evenly distributed as it should have been.

------ Sessions showing that the output of random() is stable ------

Python 2.7.17 (v2.7.17:c2f86d86e6, Oct 19 2019, 16:24:34)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license()" for more information.
>>> import random
>>> random.seed('superman123')
>>> [random.random() for i in range(5)]
[0.6740635277890739, 0.3455289115553195, 0.6883176146073614, 0.3824266890084288, 0.9839811707434662]

Python 3.8.3 (v3.8.3:6f8c8320e9, May 13 2020, 16:29:34)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license()" for more information.
>>> import random
>>> random.seed('superman123', version=1)
>>> [random.random() for i in range(5)]
[0.6740635277890739, 0.3455289115553195, 0.6883176146073614, 0.3824266890084288, 0.9839811707434662] |
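The "not as evenly distributed" point can be illustrated with a toy model rather than the real generator. Suppose random() could return only 8 equally likely values (k/8) and these were mapped onto 5 outcomes by multiplying and truncating. The function names below are illustrative, and the rejection loop is only a sketch in the spirit of a getrandbits-based approach; the actual CPython code is not quoted in this thread.

```python
import random
from collections import Counter


def scaled_counts(bits, n):
    """Exhaustively count the outcomes of floor(k * n / 2**bits) for every k."""
    return Counter((k * n) >> bits for k in range(1 << bits))


def randbelow_by_rejection(rng, n):
    """Draw just enough bits to cover n and retry until the value is below n."""
    k = n.bit_length()
    value = rng.getrandbits(k)
    while value >= n:
        value = rng.getrandbits(k)
    return value


# Toy model: 8 equally likely inputs squeezed onto 5 outcomes.
toy = scaled_counts(bits=3, n=5)
print(sorted(toy.items()))  # outcomes 0, 1 and 3 occur twice; 2 and 4 once

# Rejection sampling discards out-of-range draws instead of squeezing them,
# so every accepted outcome is equally likely.
rng = random.Random(12345)
draws = [randbelow_by_rejection(rng, 5) for _ in range(1000)]
print(sorted(Counter(draws).items()))
```

With 53 bits instead of 3 the unevenness of the scaled approach is far smaller for small ranges, but it does not vanish, which is consistent with the motivation given above for changing randrange().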
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:31 | admin | set | github: 84859 |
2020-05-20 00:28:33 | rhettinger | set | status: open -> closed resolution: not a bug messages: + msg369409 stage: resolved |
2020-05-19 21:15:04 | steven.daprano | set | nosy: + steven.daprano messages: + msg369394 versions: - Python 3.5, Python 3.6 |
2020-05-19 14:42:28 | serhiy.storchaka | set | nosy: + rhettinger, mark.dickinson |
2020-05-19 14:32:46 | mrled | create |