This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Using getrandbits() in uuid.uuid4() is faster and more readable
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.3
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, mattchaput, pitrou, rhettinger, serhiy.storchaka, terry.reedy, vstinner
Priority: normal Keywords: patch

Created on 2011-09-15 16:30 by mattchaput, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
fastuuid4.patch mattchaput, 2011-09-15 16:30 A patch to use random.getrandbits() instead of os.urandom in uuid.uuid4() review
Messages (8)
msg144087 - (view) Author: Matt Chaput (mattchaput) Date: 2011-09-15 16:30
Currently the 'uuid' module uses os.urandom (in the absence of a system UUID generation function) to generate random UUIDs in the uuid.uudi4() function. This patch changes the implementation of uuid4() to use random.getrandbits() as the source of randomness instead, for the following reasons:

* In my quick tests, using getrandbits() is much faster on Windows and Linux. Some applications do need to generate UUIDs quickly.

  >>> setup = "import uuid, os, random"
  >>> ur = "uuid.UUID(bytes=os.urandom(16), version=4)"
  >>> grb = "uuid.UUID(int=random.getrandbits(128), version=4)"
  >>> # Windows --------
  >>> timeit.Timer(ur, setup).timeit()
  22.861042160383903
  >>> timeit.Timer(grb, setup).timeit()
  3.8689128309085135
  >>> # Linux --------
  >>> timeit.Timer(ur, setup).timeit()
  29.32686185836792
  >> timeit.Timer(grb, setup).timeit()
  3.7429409027099609

* The patched code is cleaner. It avoids the try...finally required by the possibly unavailable os.urandom function, and the fallback to generating random bytes.
msg144156 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-09-16 20:32
You could help this along by both running Lib.test.test_uuid with your patch applied and reporting that it passes.

Raymond, I added you because this is about changing random functions.

Side note: This code in test_uuid.test_uuid4()

        uuids = {}
        for u in [uuid.uuid4() for i in range(1000)]:
            uuids[u] = 1
        equal(len(uuids.keys()), 1000)

could be updated to use sets rather than a fake dict:

        uuids = set()
        for u in [uuid.uuid4() for i in range(1000)]:
            uuids.add(u)
        equal(len(uuids), 1000)
msg144159 - (view) Author: Matt Chaput (mattchaput) Date: 2011-09-16 20:53
Passed all tests OK.
msg157744 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-04-07 17:56
>         uuids = set()
>         for u in [uuid.uuid4() for i in range(1000)]:
>             uuids.add(u)

        uuids = {uuid.uuid4() for i in range(1000)}

However, I'm not sure of the legitimacy of replacement suitable for cryptographic use `os.urandom` on fast pseudo-random `random.getrandbits`. Especially for applications that need to generate a lot of uuids.
msg157755 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-04-07 20:06
> However, I'm not sure of the legitimacy of replacement suitable for
> cryptographic use `os.urandom` on fast pseudo-random
> `random.getrandbits`. Especially for applications that need to generate 
> a lot of uuids.

Agreed. urandom() is supposed to incorporate "real" random, while getrandbits() uses a PRNG.
Also, as the OP shows, it's easy to inject your own random source:

  >>> grb = "uuid.UUID(int=random.getrandbits(128), version=4)"

if you really need the speed.
msg172216 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-10-06 16:28
I suggest to reject this proposal because of a deterioration in security.
msg172259 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2012-10-06 22:58
If somebody is going to implement uuid based on the random module that somebody must take care of fork. Currently the PRGN in random isn't reseeded during fork(). This would lead to 100% collisions. The tempfile module contains a workaround for the issue.

I'm -1, see #15206.
msg172260 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-10-06 23:00
Ok, let's reject the issue then.
History
Date User Action Args
2022-04-11 14:57:21adminsetgithub: 57195
2012-10-06 23:00:22pitrousetstatus: open -> closed
resolution: rejected
messages: + msg172260

stage: patch review -> resolved
2012-10-06 22:59:00christian.heimessetnosy: + christian.heimes
messages: + msg172259
2012-10-06 16:28:22serhiy.storchakasetmessages: + msg172216
2012-04-07 20:06:38pitrousetversions: + Python 3.3, - Python 3.2
nosy: + pitrou

messages: + msg157755

stage: patch review
2012-04-07 17:56:03serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg157744
2011-09-16 20:53:18mattchaputsetmessages: + msg144159
2011-09-16 20:32:59terry.reedysetnosy: + rhettinger, terry.reedy

messages: + msg144156
versions: + Python 3.2
2011-09-15 17:13:19vstinnersetnosy: + vstinner
2011-09-15 16:30:16mattchaputcreate