uuid.uuid1() is too slow #50135

wangchun · 2009-04-30T10:13:35Z

BPO	5885
Nosy	@vstinner, @avassalotti, @bitdancer, @serhiy-storchaka
PRs	bpo-5885: fix slow uuid initialization #3684
Superseder	bpo-11063: Rework uuid module: lazy initialization and add a new C extension
Files	uuid_c_module.patch issue_5885.patch: Faster patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2017-09-28.13:26:05.102>
created_at = <Date 2009-04-30.10:13:35.222>
labels = ['library', 'performance']
title = 'uuid.uuid1() is too slow'
updated_at = <Date 2017-09-28.13:26:05.100>
user = 'https://bugs.python.org/wangchun'

bugs.python.org fields:

activity = <Date 2017-09-28.13:26:05.100>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2017-09-28.13:26:05.102>
closer = 'vstinner'
components = ['Library (Lib)']
creation = <Date 2009-04-30.10:13:35.222>
creator = 'wangchun'
dependencies = []
files = ['14814', '20485']
hgrepos = []
issue_num = 5885
keywords = ['patch']
message_count = 12.0
messages = ['86840', '86841', '90201', '92137', '92138', '126825', '204893', '204964', '240355', '240358', '240359', '303229']
nosy_count = 9.0
nosy_names = ['vstinner', 'dstanek', 'alexandre.vassalotti', 'wangchun', 'r.david.murray', 'thijs', 'grooverdan', 'rosslagerwall', 'serhiy.storchaka']
pr_nums = ['3684']
priority = 'normal'
resolution = 'duplicate'
stage = 'resolved'
status = 'closed'
superseder = '11063'
type = 'performance'
url = 'https://bugs.python.org/issue5885'
versions = ['Python 3.5']

wangchun · 2009-04-30T10:13:33Z

uuid.uuid1() currently generates a UUID in one of two ways. If the
libuuid function "uuid_generate_time" is available, uuid1() calls it
through the ctypes interface; otherwise, it falls back to pure Python
code. The problem is that the ctypes route to "uuid_generate_time" is
even slower than the Python code. According to my test, it took 55
microseconds to generate a UUID via the ctypes interface but only 45
microseconds via the Python code. I also tested the performance of the
"uuid_generate_time" C API itself: it takes 12 microseconds from C
code. Most of the time is wasted in ctypes. I believe we need to drop
ctypes and write a Python extension in C for this job.
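
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch using the standard `timeit` module on Python 3. The helper name `per_call_microseconds` is hypothetical and not part of the report. The uuid module was later reworked (see the superseder, bpo-11063), so absolute numbers on a current interpreter will differ from those quoted in this thread.

```python
import timeit
import uuid


def per_call_microseconds(func, number=100_000):
    """Return the average wall-clock cost of one call to func, in microseconds."""
    total_seconds = timeit.timeit(func, number=number)
    return total_seconds * 1_000_000.0 / number


if __name__ == "__main__":
    # Average cost of one uuid.uuid1() call on this machine and interpreter.
    print('%.3f microseconds' % per_call_microseconds(uuid.uuid1))
```

`timeit` disables garbage collection during the run and uses a high-resolution clock, which tends to give steadier figures than a hand-written `time.time()` loop.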

wangchun · 2009-04-30T10:42:53Z

This is my test on another faster machine.

$ cat test.py
import sys, time, uuid
N = int(sys.argv[1])
t = time.time()
for x in xrange(N):
    uuid.uuid1()
print('%.3f microseconds' % ((time.time() - t) * 1000000.0 / N))
$ cat test.c
#include <stdio.h>
#include <sys/time.h>
#include <uuid/uuid.h>

int main(int argc, char *argv[])
{
	int i, n;
	double t1, t2;
	uuid_t uuid;
	struct timeval t;
	struct timezone tz;
	sscanf(argv[1], "%d", &n);
	gettimeofday(&t, &tz);
	t1 = (double)t.tv_sec + (double)t.tv_usec / 1000000.0;
	for (i = 0; i < n; i++) {
		uuid_generate_time(uuid);
	}
	gettimeofday(&t, &tz);
	t2 = (double)t.tv_sec + (double)t.tv_usec / 1000000.0;
	printf("%.3f microseconds\n", (t2 - t1) * 1000000.0 / n);
	return 0;
}
$ gcc -o test test.c -luuid
$ python test.py 50000
25.944 microseconds
$ python test.py 200000
25.810 microseconds
$ python test.py 1000000
25.865 microseconds
$ ./test 50000
0.214 microseconds
$ ./test 200000
0.214 microseconds
$ ./test 1000000
0.212 microseconds
$

avassalotti · 2009-07-06T23:48:15Z

Can you provide a patch?

grooverdan · 2009-09-01T09:12:40Z

This is a slightly crude module version. The speedup was only 10%.

Python 3.2a0 (py3k:74612M, Sep 1 2009, 18:11:58)
[GCC 4.3.2] on linux2

Using the same test from Wang Chun:
before:
uuid1(1000000)
101.759 microseconds

after:
uuid1(1000000)
91.663 microseconds

The delays are clearly in the _byte array copying as indicated by the
test below:
>>> import sys, time, uuid
>>> def uu(n):
...      t = time.time()
...      for x in range(n):
...         uuid._uuid_generate_time_fast()
...      print('%.3f microseconds' % ((time.time() - t) * 1000000.0 / n))
...
[72265 refs]
>>> uu(1000000)
13.157 microseconds
[72267 refs]

I would expect fixing this for the ctypes version would have a similar
speedup.

grooverdan · 2009-09-01T09:29:27Z

To prove it a bit more, here is a ctypes benchmark:
>>> import ctypes, ctypes.util, time, uuid
>>> def uu1(n):
...      t = time.time()
...      _buffer = ctypes.create_string_buffer(16)
...      for x in range(n):
...         uuid._uuid_generate_time(_buffer)
...      print('%.3f microseconds' % ((time.time() - t) * 1000000.0 / n))
...
>>> uu1(1000000)
15.819 microseconds

rosslagerwall · 2011-01-22T08:56:17Z

Attached is a patch based on the original patch, meant to have better performance.

On my PC, this:

import sys, time, uuid

def uu(n):
    t = time.time()
    for x in range(n):
        uuid.uuid1()
    print('%.3f microseconds' % ((time.time() - t) * 1000000.0 / n))

uu(50000)

records a time of 38.5 microseconds unpatched (still using ctypes/libuuid) and a time of 16.5 microseconds afterwards.
uuid4() results in an improvement from 30 microseconds to 9 microseconds. From what I could see, what took the most time was the call to UUID() with a bytes object. That's why this patch passes in the uuid as a long.

It also fixes setup.py to check for the uuid.h header.
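
The observation that constructing UUID() from a bytes object dominated the time can be checked with a small comparison. This is only a sketch: `RAW`, `AS_INT`, and `compare` are hypothetical names chosen for illustration, and the relative cost of the two constructors depends on the Python version, so it may not reproduce the gap seen in 2011.

```python
import timeit
import uuid

RAW = bytes(range(16))               # arbitrary 16-byte value, for illustration only
AS_INT = int.from_bytes(RAW, "big")  # the same value as a 128-bit integer


def compare(number=100_000):
    """Time UUID construction from bytes and from int; return seconds for each."""
    from_bytes = timeit.timeit(lambda: uuid.UUID(bytes=RAW), number=number)
    from_int = timeit.timeit(lambda: uuid.UUID(int=AS_INT), number=number)
    return from_bytes, from_int


if __name__ == "__main__":
    bytes_seconds, int_seconds = compare()
    print('UUID(bytes=...): %.3f s' % bytes_seconds)
    print('UUID(int=...):   %.3f s' % int_seconds)
```

Both constructors produce the same UUID object, so a C helper is free to hand over whichever representation is cheaper to build.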

serhiy-storchaka · 2013-12-01T09:16:31Z

Instead of going through hexadecimals in _long_from_uuid_t, you can use _PyLong_FromByteArray.

However, adding a new C-implemented module has a high cost. I doubt that the speedup of UUID generation is worth this cost.
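
As a Python-level analogy only (the patch itself is C, and `_PyLong_FromByteArray` is a private C API), the same choice exists between parsing a hexadecimal string and converting the raw bytes directly. The sample value and the function names below are made up for illustration.

```python
def int_via_hex(raw):
    """Convert 16 raw bytes to an int by formatting and re-parsing hexadecimal text."""
    return int(raw.hex(), 16)


def int_via_bytes(raw):
    """Convert 16 raw bytes to an int directly, with no intermediate string."""
    return int.from_bytes(raw, "big")


if __name__ == "__main__":
    sample = bytes.fromhex("00112233445566778899aabbccddeeff")
    # Both routes yield the same integer; the second skips the string round trip.
    print(int_via_hex(sample) == int_via_bytes(sample))
```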

avassalotti · 2013-12-01T21:22:22Z

I agree that there is a maintenance cost associated with C extension modules. However, I would certainly be glad if it allowed us to eliminate uses of ctypes in this module because ctypes is quite unsafe and doesn't work well across platforms (though it is admittedly very convenient).

bitdancer · 2015-04-09T17:26:41Z

The original report says the ctypes call is slower than the python code used as a fallback. Would it not, then, be a performance improvement just to drop the ctypes call, without creating a new C module? Creating a C module would then be a separate enhancement issue if someone thought the performance improvement was enough to justify the module. Or maybe it could live in the os module?
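
To make the "pure Python fallback" concrete, here is a rough sketch of building a version-1 UUID without calling libuuid, following the RFC 4122 field layout. It is not CPython's actual code: the function name `uuid1_pure_python` is hypothetical, it assumes Python 3.7+ for `time.time_ns()`, and it omits the guard against duplicate timestamps that a real implementation needs.

```python
import random
import time
import uuid

# 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def uuid1_pure_python(node=None, clock_seq=None):
    """Build a version-1 UUID from the clock, a clock sequence, and a node ID."""
    timestamp = time.time_ns() // 100 + UUID_EPOCH_OFFSET
    if clock_seq is None:
        clock_seq = random.getrandbits(14)  # 14-bit clock sequence
    if node is None:
        node = uuid.getnode()               # 48-bit hardware address (or random)
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = (timestamp >> 48) & 0x0FFF
    clock_seq_low = clock_seq & 0xFF
    clock_seq_hi_variant = (clock_seq >> 8) & 0x3F
    # version=1 makes UUID() set the version and RFC 4122 variant bits.
    return uuid.UUID(
        fields=(time_low, time_mid, time_hi_version,
                clock_seq_hi_variant, clock_seq_low, node),
        version=1,
    )


if __name__ == "__main__":
    print(uuid1_pure_python())
```

Everything here is integer arithmetic plus one UUID() construction, which is why dropping the ctypes call is plausible as a change separate from adding a C module.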

serhiy-storchaka · 2015-04-09T17:34:59Z
