msg175812 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-11-17 21:32 |
Now random_seed() use an ineffective quadratic-time algorithm. It can reuse _PyLong_AsByteArray(), decreasing code size and increasing performance.
|
msg175820 - (view) |
Author: Martin v. Löwis (loewis) * |
Date: 2012-11-17 22:42 |
Why is this a problem?
|
msg175821 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-11-17 22:48 |
This is not a problem at all. This just a code cleanup. Removing a code duplicate. Performance boost is a side effect.
I confused by an old code comment about _PyLong_AsByteArray(). I don't see any difficulties.
|
msg175827 - (view) |
Author: Martin v. Löwis (loewis) * |
Date: 2012-11-17 23:16 |
Ah. The cleanup looks good, from a shallow glance. I haven't verified yet that the change is actually correct.
|
msg175846 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-11-18 08:32 |
Patch looks correct and looks good to me, modulo a couple of nitpicks (see Rietveld comments). This seems like a nice cleanup.
The patch introduces a new dependence on PY_UINT32_T, which is something we haven't so far used elsewhere in Python beyond the float<->string conversions. That's quite a big change: it means that it's conceivable that the random module could now fail to build on platforms where it used to work.
It also changes the signature of 'init_by_array', which is one of the core routines taken directly from the MT implementation.
|
msg175847 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-11-18 08:41 |
Are there any platforms without 32-bit integers (PY_UINT32_T can be uint32_t, unsigned int or long)? PyUCS4 also should be 32-bit, therefore Python requires such type.
|
msg175848 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-11-18 09:00 |
Patch updated to conform with Mark's nitpicks.
What I really doubt is that now same integer seed on little-endian and big-endian give different random sequences. Is this important? If yes, I can add bytes-swapping. On other hand, non-integer seed already give platform-depending results (the hashing used).
What you think about using a buffer instead a hash for bytes and bytearrays (and all other seeds supported buffer protocol)? It can increase entropy. Of course, it should be another issue if you approve it.
|
msg175849 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-11-18 09:08 |
> PyUCS4 also should be 32-bit, therefore Python requires such type.
Hmm, okay. I wonder whether PY_UINT32_T should have been used there, to avoid doing the same checks in multiple places.
> What I really doubt is that now same integer seed on little-endian and
> big-endian give different random sequences. Is this important?
Yes, I think it's important that this code change doesn't change the random sequence if the seed is unchanged, on any platform (32 / 64-bit, big versus little endian).
|
msg175850 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-11-18 09:14 |
Thanks for the updated patch. A couple more comments:
- you've got potential overflow when computing keysize from bits, on platforms where sizeof(size_t) > sizeof(unsigned long).
- please could you move the check for PY_UINT32_T nearer the top of the file, along with the other #defines.
|
msg175851 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-11-18 09:36 |
Patch updated. Now random_seed() is platform-independed for integer arguments.
|
msg175852 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-11-18 09:39 |
Oops, typo.
|
msg175881 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-11-18 15:30 |
I'm still uncomfortable with the init_by_array signature changes and the use of PY_UINT32_T. How about something like the attached instead? It keeps the central idea (use _PyLong_NumBits and _PyLong_AsByteArray) but doesn't require any signature changes or special casing for bigendian machines.
|
msg175883 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-11-18 15:55 |
I don't think that the preservation of the signature of the auxiliary private static function is worth it. I'm uncomfortable with such patch. But do as you feel comfortable.
|
msg175884 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-11-18 16:19 |
> I'm uncomfortable with such patch.
Any particular reason? It's direct and straightforward, and eliminates the quadratic behaviour.
|
msg175886 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-11-18 16:49 |
The code is larger. There is one additional allocation. CPU tacts wasted for uint32->ulong conversion (and in any case all numbers in the generator are 32-bit). One additional ValeError/OverflowError. Apparently my feeling of comfort is different from your own. ;)
Hmm, reviewing your code I found errors in my code too. I guess I'm more captious as a critic than as an author.
|
msg175887 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-11-18 17:06 |
> Apparently my feeling of comfort is different from your own. ;)
Yes: I tend to favour direct, readable, and portable code over unnecessarily optimized code. To address the specific points:
> The code is larger.
Very slightly. It's (IMO) more readable and comprehensible though.
> There is one additional allocation.
Yep. Is this a problem?
> CPU tacts wasted for uint32->ulong conversion.
random.seed is hardly going to be a bottleneck in most applications. Again, I don't see that as a problem.
> One additional ValeError/OverflowError.
That's not really additional: it should really have already been there in the original code.
|
msg175888 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-11-18 17:07 |
I agree with Mark, we don't need to micro-optimize here. Also, +1 for not needing PY_UINT32_T.
|
msg175897 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-11-18 17:45 |
> I tend to favour direct, readable, and portable code over unnecessarily optimized code.
And my feeling of directness, readability, and portability also slightly differs. I agree that code size and additional operations not of importance here. I say only that it makes me feel less comfortable. It doesn't matter.
I were left some nitpicks on Rietveld.
|
msg175899 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2012-11-18 18:09 |
Is PY_UINT32_T a big problem? I hope that one day we can use the
C99 types directly. Visual Studio finally supports stdint.h, and
I'm not aware of any compiler that does not.
Consider cdecimal as a trial balloon: It compiles on all obscure
snakebite platforms, and I'm almost willing to bet that there won't
be any reports due to the C99 types.
|
msg176341 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-11-25 11:40 |
Updated patch: adds in the missing checks for PyMem_Malloc errors.
|
msg177900 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2012-12-21 21:53 |
New changeset db75553ff333 by Mark Dickinson in branch 'default':
Simplify random_seed to use _PyLong_AsByteArray. Closes issue #16496.
http://hg.python.org/cpython/rev/db75553ff333
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:38 | admin | set | github: 60700 |
2012-12-21 21:53:37 | mark.dickinson | set | status: open -> closed resolution: fixed |
2012-12-21 21:53:08 | python-dev | set | nosy:
+ python-dev messages:
+ msg177900
|
2012-12-03 15:39:03 | asvetlov | set | nosy:
+ asvetlov
|
2012-11-26 17:13:59 | jcea | set | nosy:
+ jcea
|
2012-11-25 11:40:29 | mark.dickinson | set | files:
+ random_seed_metd2.patch
messages:
+ msg176341 |
2012-11-19 18:18:01 | mark.dickinson | set | assignee: mark.dickinson |
2012-11-18 18:09:44 | skrah | set | nosy:
+ skrah messages:
+ msg175899
|
2012-11-18 17:45:10 | serhiy.storchaka | set | messages:
+ msg175897 |
2012-11-18 17:07:38 | pitrou | set | nosy:
+ pitrou messages:
+ msg175888
|
2012-11-18 17:06:24 | mark.dickinson | set | messages:
+ msg175887 |
2012-11-18 16:49:03 | serhiy.storchaka | set | messages:
+ msg175886 |
2012-11-18 16:19:21 | mark.dickinson | set | messages:
+ msg175884 |
2012-11-18 15:55:53 | serhiy.storchaka | set | messages:
+ msg175883 |
2012-11-18 15:30:24 | mark.dickinson | set | files:
+ random_seed_metd.patch
messages:
+ msg175881 |
2012-11-18 09:39:28 | serhiy.storchaka | set | files:
+ random_seed_3.patch
messages:
+ msg175852 |
2012-11-18 09:38:26 | serhiy.storchaka | set | files:
- random_seed_3.patch |
2012-11-18 09:36:17 | serhiy.storchaka | set | files:
+ random_seed_3.patch
messages:
+ msg175851 |
2012-11-18 09:14:42 | mark.dickinson | set | messages:
+ msg175850 |
2012-11-18 09:08:38 | mark.dickinson | set | messages:
+ msg175849 |
2012-11-18 09:00:10 | serhiy.storchaka | set | files:
+ random_seed_2.patch
messages:
+ msg175848 |
2012-11-18 08:41:38 | serhiy.storchaka | set | messages:
+ msg175847 |
2012-11-18 08:32:41 | mark.dickinson | set | messages:
+ msg175846 |
2012-11-17 23:16:47 | loewis | set | messages:
+ msg175827 |
2012-11-17 22:48:38 | serhiy.storchaka | set | messages:
+ msg175821 |
2012-11-17 22:42:55 | loewis | set | nosy:
+ loewis messages:
+ msg175820
|
2012-11-17 21:32:44 | serhiy.storchaka | create | |