This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: random produces different output on different architectures
Type: behavior
Stage:
Components: Documentation
Versions: Python 3.1
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: rhettinger
Nosy List: loewis, mark.dickinson, michael.foord, pitrou, rhettinger, terrence
Priority: normal
Keywords:

Created on 2010-02-09 00:15 by terrence, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (12)
msg99078 - Author: Terrence Cole (terrence) Date: 2010-02-09 00:15
This code:
>>> import random
>>> random.seed(b'foo')
>>> random.getrandbits(8)
...repeated 7 more times...

Yields the sequence of values:
amd64: 227, 199,  34, 218,  83, 115, 236, 254
x86:   245, 198, 204,  66, 219,   4, 168,  93

Comments in the source seem to indicate random should produce the same results on all platforms.

I first thought that seed() was not resetting the state correctly. However, if I call 'random.setstate( (3,(0,)*625,None) )' before seeding the generator, the results do not change from what is given above. Also, calls to getrandbits() after the setstate() but before another seed() correctly return 0. From this it seems that seed() is resetting the state properly, but some of the internals are not consistent between 32-bit and 64-bit builds.
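
The diagnostic described above can be reproduced with a short script. This is only a sketch: it assumes CPython's state tuple layout of (version, 625 integers, gauss_next), it uses a private Random instance so the module-level generator is left untouched, and the variable names are illustrative rather than taken from the report.

```python
import random

# Assumed CPython layout: (version, 624 state words + position index, gauss_next).
ZERO_STATE = (3, (0,) * 625, None)

rng = random.Random()
rng.setstate(ZERO_STATE)
# With an all-zero state the generator is expected to emit only zeros.
zero_draws = [rng.getrandbits(8) for _ in range(4)]

# Seeding afterwards should fully replace the zeroed state.
rng.seed(b'foo')
after_reset = [rng.getrandbits(8) for _ in range(8)]

# A fresh generator seeded the same way serves as the reference.
reference = random.Random(b'foo')
expected = [reference.getrandbits(8) for _ in range(8)]

print(zero_draws)
print(after_reset == expected)
```

If the second print shows True, seed() is overwriting the earlier state, which points the investigation at how the seed value itself is derived.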
msg99085 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2010-02-09 03:17
Would you like to work on a patch?
msg99087 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2010-02-09 06:52
Hmm, this may be difficult to fix without breaking somebody's expectation of repeating sequences they've already generated.

The code is in random_getrandbits():

http://svn.python.org/view/python/trunk/Modules/_randommodule.c?revision=72344&view=markup
msg99105 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2010-02-09 11:16
It's not only getrandbits():

** x86 **
>>> random.seed(b'foo')
>>> random.random()
0.95824312997798622

** x86_64 **
>>> random.seed(b'foo')
>>> random.random()
0.88694660946819537
msg99106 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2010-02-09 11:18
It works when seeding from a single integer, though:
>>> import random
>>> random.seed(123)
>>> random.random()
0.052363598850944326

So I guess it's the seeding-from-an-array which is buggy.
msg99107 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2010-02-09 11:25
OK, it's simple really. When seeding from something other than an integer, seed() takes the hash of the object instead of considering all its bytes. That might be considered a weakness, since you lose entropy; also, Python's hash() is not supposed to be cryptographically strong. The hash differs between 32-bit and 64-bit builds (although the lower 32 bits are the same, at least for a bytes object), and since all of its bits are taken into account, the initial state differs as well.

So the easy workaround for the OP is to seed with an integer rather than a bytes object.
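
A minimal sketch of that workaround follows. The helper name first_floats is hypothetical, and a single run can only show that an integer seed is repeatable on one machine; the cross-architecture agreement is the claim made in this thread, not something the snippet can prove.

```python
import random

def first_floats(seed_value, count=3):
    """Return the first `count` floats from a generator seeded with an int."""
    if not isinstance(seed_value, int):
        raise TypeError("this sketch only accepts integer seeds")
    rng = random.Random(seed_value)
    return [rng.random() for _ in range(count)]

run_a = first_floats(123)
run_b = first_floats(123)
print(run_a == run_b)
```

Because the integer is used directly, no hash() call is involved and the pointer width of the build does not enter the seeding path.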
msg99109 - Author: Michael Foord (michael.foord) (Python committer) Date: 2010-02-09 11:31
If we aren't going to fix it, should we document the limitation?
msg99110 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2010-02-09 11:43
Well, ideally we should drop the automatic hash() and only accept:
1) ints/longs
2) buffer-like objects

(and tell people to hash() explicitly if they want to)

If that's too disruptive, we should document it.
And, for 3.x, provide the following recipe to seed from a bytes object without losing entropy, while getting the same results on 32-bit and 64-bit builds:

>>> import random
>>> random.seed(int.from_bytes(b'foo', 'little'))
>>> random.random()
0.08384169414918807
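
The stricter policy proposed above (accept only integers and buffer-like objects, never fall back to hash()) could be sketched as a wrapper. The function make_rng is hypothetical and is not part of the random module; it simply applies the int.from_bytes recipe to bytes-like seeds.

```python
import random

def make_rng(seed_value):
    """Hypothetical helper: build a Random from an int or a bytes-like seed."""
    if isinstance(seed_value, int):
        return random.Random(seed_value)
    if isinstance(seed_value, (bytes, bytearray, memoryview)):
        # Every byte of the buffer contributes to the integer seed.
        as_int = int.from_bytes(bytes(seed_value), 'little')
        return random.Random(as_int)
    raise TypeError("seed must be an int or a bytes-like object")

from_bytes_seed = make_rng(b'foo').random()
from_int_seed = make_rng(int.from_bytes(b'foo', 'little')).random()
print(from_bytes_seed == from_int_seed)
```

One caveat of the little-endian conversion is that trailing zero bytes do not change the integer, so b'foo' and b'foo\x00' yield the same seed.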
msg99111 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2010-02-09 12:02
[Antoine]
> >>> random.seed(int.from_bytes(b'foo', 'little'))

+1 for either documenting this useful trick, or modifying init_by_array to do this automatically for buffer-like objects.

Disallowing other types of input for the seed sounds dangerous, though.
msg99123 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2010-02-09 16:40
I will update the documentation.
msg99157 - Author: Terrence Cole (terrence) Date: 2010-02-10 07:53
Thank you for all the help!  I'm glad to see that the use of hash() on buffer compatible objects is an actual gotcha and not just me being obtuse.

Also, for those googling who, like me, won't be able to use 3.2's from_bytes until 3.2 is released in December, here is code to convert a bytes object to an int (most significant byte first):
>>> n = 0
>>> off = 0
>>> data = b'\xfc\x00'
>>> for c in data[::-1]:
...     n += c << off
...     off += 8
... 
>>> print(n)
64512

Or, if you prefer the functional style:
>>> sum(c << off for c, off in zip(data[::-1], range(0, len(data) * 8, 8)))
64512
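
Note that the loop above treats the first byte as the most significant one, whereas the earlier int.from_bytes(b'foo', 'little') recipe treats it as the least significant. A sketch of a conversion that supports both orders is shown here; bytes_to_int is a hypothetical name, and the comparison against int.from_bytes only works on interpreters that provide it.

```python
def bytes_to_int(data, byteorder='big'):
    """Convert a bytes object to a non-negative int without int.from_bytes."""
    if byteorder == 'little':
        # Reverse so the most significant byte is processed first.
        data = data[::-1]
    elif byteorder != 'big':
        raise ValueError("byteorder must be 'big' or 'little'")
    n = 0
    for byte in data:
        # Shift the accumulated value up one byte, then add the next byte.
        n = (n << 8) | byte
    return n

data = b'\xfc\x00'
print(bytes_to_int(data, 'big') == int.from_bytes(data, 'big'))
print(bytes_to_int(data, 'little') == int.from_bytes(data, 'little'))
```

Whichever order is chosen, it has to be used consistently, because the two orders produce different seeds for the same bytes.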
msg115740 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2010-09-07 04:51
Fixed in r84574 and r84576.  The seed function no longer uses hash() for str, bytes, or bytearray arguments.
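
On interpreters that include this change (3.2 and later, where seed() accepts a version argument that defaults to 2), a bytes seed can be passed directly. The following is a sketch under that assumption; the helper name draws is illustrative, and a single run demonstrates repeatability on one build only, not agreement across architectures.

```python
import random

def draws(seed_value, count=8):
    """Seed a private generator with the version 2 scheme and draw bytes."""
    rng = random.Random()
    rng.seed(seed_value, version=2)
    return [rng.getrandbits(8) for _ in range(count)]

first = draws(b'foo')
second = draws(b'foo')
other = draws(b'bar')

print(first == second)
print(first == other)
```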
History
Date                 User            Action  Args
2022-04-11 14:56:57  admin           set     github: 52137
2010-09-07 04:51:36  rhettinger      set     status: open -> closed
                                             resolution: fixed
                                             messages: + msg115740
2010-02-10 07:53:18  terrence        set     messages: + msg99157
2010-02-09 16:40:09  rhettinger      set     nosy: loewis, rhettinger, mark.dickinson, pitrou, michael.foord, terrence
                                             messages: + msg99123
                                             components: + Documentation, - Library (Lib)
                                             versions: - Python 2.6, Python 2.7, Python 3.2
2010-02-09 12:02:32  mark.dickinson  set     messages: + msg99111
2010-02-09 11:43:49  pitrou          set     messages: + msg99110
                                             versions: + Python 2.6, Python 2.7, Python 3.2
2010-02-09 11:31:54  michael.foord   set     nosy: + michael.foord
                                             messages: + msg99109
2010-02-09 11:25:23  pitrou          set     messages: + msg99107
2010-02-09 11:18:46  pitrou          set     messages: + msg99106
2010-02-09 11:16:39  pitrou          set     nosy: + pitrou
                                             messages: + msg99105
2010-02-09 10:57:38  mark.dickinson  set     nosy: + mark.dickinson
2010-02-09 06:52:12  rhettinger      set     assignee: rhettinger
                                             messages: + msg99087
                                             nosy: + rhettinger
2010-02-09 03:17:13  loewis          set     nosy: + loewis
                                             messages: + msg99085
2010-02-09 00:15:51  terrence        create