classification
Title: Simplify _RandomNameSequence
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: haypo, pitrou, rhettinger, serhiy.storchaka
Priority: normal Keywords:

Created on 2017-04-10 09:17 by serhiy.storchaka, last changed 2017-08-10 23:44 by haypo. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 1075 merged serhiy.storchaka, 2017-04-10 09:18
PR 1187 merged haypo, 2017-04-19 20:31
PR 1191 closed haypo, 2017-04-19 22:00
Messages (17)
msg291425 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-10 09:17
_RandomNameSequence was added in issue589982 when generator functions were new optional feature. _RandomNameSequence is implemented as an iterator class. Proposed patch simplifies _RandomNameSequence by implementing it as a generator function.

This is a private name, all uses of _RandomNameSequence() need only the support of the iterator protocol.
msg291429 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-10 16:16
The downside of this change:

1. The _RandomNameSequence iterator no longer has the rng field. But it isn't used in the stdlib. And since _RandomNameSequence is a private name, it shouldn't be used in third-party code.

2. The result of no longer copyable neither pickleable. This could cause a problem if it is set as an attribute of copyable of pickleable object. But it is used only as function local or module global instances in tempfile and class attribute of in multiprocessing.synchronize.SemLock and multiprocessing.heap.Arena. All these cases don't involved in copying or pickling. And in general copying and pickling the _RandomNameSequence object is a doubtful idea.

3. This makes iterating the _RandomNameSequence iterator slightly (about 6%) slower. Not a big deal, it isn't used in performance critical code. It is possible to speed up the _RandomNameSequence iterator by the same 6% by using functools.partial (functools already is imported in tempfile), but this makes the code slightly less clear, and I choose the simplicity of the code.
msg291515 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-11 20:40
Merged in f50354adaaafebe95ad09d09b825804a686ea843.
msg291535 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-12 09:25
After reviewing the change, I created the issue #30051: "Document that the random module doesn't support fork".
msg291657 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-14 12:52
The following builbot failure is probably related:

http://buildbot.python.org/all/builders/AMD64%20Windows8.1%20Non-Debug%203.x/builds/623/steps/test/logs/stdio

======================================================================
FAIL: test_main (test.test_threadedtempfile.ThreadedTempFileTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\buildarea\3.x.ware-win81-release\build\lib\test\test_threadedtempfile.py", line 58, in test_main
    self.assertEqual(errors, [], msg)
AssertionError: Lists differ: ['Thread-813Traceback (most recent call la[2133 chars]g\n'] != []

First list contains 4 additional elements.
First extra element 0:
'Thread-813Traceback (most recent call last):\n  File "D:\\buildarea\\3.x.ware-win81-release\\build\\lib\\test\\test_threadedtempfile.py", line 38, in run\n    f = tempfile.TemporaryFile("w+b")\n  File "D:\\buildarea\\3.x.ware-win81-release\\build\\lib\\tempfile.py", line 538, in NamedTemporaryFile\n    (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)\n  File "D:\\buildarea\\3.x.ware-win81-release\\build\\lib\\tempfile.py", line 246, in _mkstemp_inner\n    name = next(names)\nValueError: generator already executing\n'

Diff is 2461 characters long. Set self.maxDiff to None to see it. : Errors: errors 4 ok 996
Thread-813Traceback (most recent call last):
  File "D:\buildarea\3.x.ware-win81-release\build\lib\test\test_threadedtempfile.py", line 38, in run
    f = tempfile.TemporaryFile("w+b")
  File "D:\buildarea\3.x.ware-win81-release\build\lib\tempfile.py", line 538, in NamedTemporaryFile
    (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
  File "D:\buildarea\3.x.ware-win81-release\build\lib\tempfile.py", line 246, in _mkstemp_inner
    name = next(names)
ValueError: generator already executing

Thread-814Traceback (most recent call last):
  File "D:\buildarea\3.x.ware-win81-release\build\lib\test\test_threadedtempfile.py", line 38, in run
    f = tempfile.TemporaryFile("w+b")
  File "D:\buildarea\3.x.ware-win81-release\build\lib\tempfile.py", line 538, in NamedTemporaryFile
    (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
  File "D:\buildarea\3.x.ware-win81-release\build\lib\tempfile.py", line 246, in _mkstemp_inner
    name = next(names)
ValueError: generator already executing

Thread-819Traceback (most recent call last):
  File "D:\buildarea\3.x.ware-win81-release\build\lib\test\test_threadedtempfile.py", line 38, in run
    f = tempfile.TemporaryFile("w+b")
  File "D:\buildarea\3.x.ware-win81-release\build\lib\tempfile.py", line 538, in NamedTemporaryFile
    (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
  File "D:\buildarea\3.x.ware-win81-release\build\lib\tempfile.py", line 246, in _mkstemp_inner
    name = next(names)
ValueError: generator already executing

Thread-823Traceback (most recent call last):
  File "D:\buildarea\3.x.ware-win81-release\build\lib\test\test_threadedtempfile.py", line 38, in run
    f = tempfile.TemporaryFile("w+b")
  File "D:\buildarea\3.x.ware-win81-release\build\lib\tempfile.py", line 538, in NamedTemporaryFile
    (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
  File "D:\buildarea\3.x.ware-win81-release\build\lib\tempfile.py", line 246, in _mkstemp_inner
    name = next(names)
ValueError: generator already executing
msg291658 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-14 13:26
Yes, it is related. :(

If there is no easy way to make a generator working in multithread environment we need to revert this change.
msg291660 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-14 13:30
Why does _get_candidate_names() need a lock? Can't we produce enough entropy to reduce the risk of duplicate names in two threads? Maybe we should use SystemRandom instead, which would also avoid the need to testing if the pid changed.

With one generator per thread, it would prevent the bug no?
msg291902 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-19 20:41
Until a solution is found, I proposed to revert the change to "repair" buildbots.

See also the python-ideas thread: [Python-ideas] Thread-safe generators
https://mail.python.org/pipermail/python-ideas/2017-April/045427.html
msg291905 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-19 20:59
New changeset 1e62bf145b4865d03a29a5720a4eb84c321a9829 by Victor Stinner in branch 'master':
bpo-30030: Revert f50354ad (tempfile) (#1187)
https://github.com/python/cpython/commit/1e62bf145b4865d03a29a5720a4eb84c321a9829
msg291906 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-19 21:02
Thank you Victor. I was going to do this but was unable to do this because I was out of my developing environment.
msg291913 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-19 22:02
I reopen the issue to propose a different approach: https://github.com/python/cpython/pull/1191 replaces Random with SystemRandom and drops get_candidate_names().

@Serhiy: I hesitated to add you as a co-author, but I'm not sure that you like my approach :-) Tell me what do you think. At least, I reused your docstring and your updated unit test!
msg291914 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-19 22:07
This isn't a different approach and this doesn't work because generators are not thread-safe.
msg291916 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-19 22:11
> Remoe _get_candidate_names() and the global _name_sequence

I understood the rationale behind _get_candidate_names() is to prevent generating the same name when _RandomNameSequence() uses random.Random. Since my change uses random.SystemRandom, I consider that it's now ok to create multiple generators in paralle. Is that rationale correct?
msg291917 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-19 22:15
test_threadedtempfile passed on Linux, but randomly failed on Windows. :( And _RandomNameSequence is used not only in tempfile.

I suggest to wait until add a wrapper that makes generators thread-safe. Since it added we could restore the generator implementation of _RandomNameSequence.
msg291950 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-20 06:37
On Linux creating a temporary file or directory usually consumes only one random name. But due to a bug on Windows (issue22107) it can consume up to TMP_MAX (2147483647 on Windows) names when called with read-only directory. Generating every name consumes about 16 random bytes. This can exhaust the system entropy and slowdown other applications.
msg291968 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-20 12:18
"Generating every name consumes about 16 random bytes. This can exhaust the system entropy and slowdown other applications."

Crys and Alex_Gaynor confirmed me on IRC that these two assumptions are both wrong.

See for example https://www.2uo.de/myths-about-urandom/

Q: But that's good! /dev/random gives out exactly as much randomness as it has entropy in its pool. /dev/urandom will give you insecure random numbers, even though it has long run out of entropy.

A:  Fact: No. Even disregarding issues like availability and subsequent manipulation by users, the issue of entropy “running low” is a straw man. About 256 bits of entropy are enough to get computationally secure numbers for a long, long time. 

--

About performance, well, it's not exactly "wrong" but "inaccurate". Abusing /dev/urandom only hurt other applications which also abuse /dev/urandom. Such use case is very unlikely.

* The bad performance of concurrent /dev/urandom reader was analyzed by an old article of 2014, but see comments:
  http://drsnyder.us/2014/04/16/linux-dev-urandom-and-concurrency.html
* The performance issue was fixed in Linux 4.8, https://github.com/torvalds/linux/commit/1e7f583af67be4ff091d0aeb863c649efd7a9112
msg300129 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-08-10 23:44
No consensus was found on this issue how to "simplify" _RandomNameSequence, so I close the issue.
History
Date User Action Args
2017-08-10 23:44:35hayposetstatus: open -> closed
resolution: rejected
messages: + msg300129
2017-04-20 12:18:33hayposetmessages: + msg291968
2017-04-20 06:37:55serhiy.storchakasetmessages: + msg291950
2017-04-19 22:15:53serhiy.storchakasetmessages: + msg291917
2017-04-19 22:11:20hayposetmessages: + msg291916
2017-04-19 22:07:28serhiy.storchakasetmessages: + msg291914
2017-04-19 22:02:09hayposetstatus: closed -> open
resolution: rejected -> (no value)
messages: + msg291913
2017-04-19 22:00:02hayposetpull_requests: + pull_request1319
2017-04-19 21:02:35serhiy.storchakasetstatus: open -> closed
resolution: rejected
messages: + msg291906
2017-04-19 20:59:54hayposetmessages: + msg291905
2017-04-19 20:41:18hayposetmessages: + msg291902
2017-04-19 20:31:41hayposetpull_requests: + pull_request1316
2017-04-14 13:30:51hayposetmessages: + msg291660
2017-04-14 13:26:49serhiy.storchakasetmessages: + msg291658
2017-04-14 12:52:46hayposetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg291657
2017-04-12 09:25:27hayposetmessages: + msg291535
2017-04-11 20:40:51serhiy.storchakasetstatus: open -> closed
resolution: fixed
messages: + msg291515

stage: patch review -> resolved
2017-04-10 16:16:40serhiy.storchakasetmessages: + msg291429
2017-04-10 09:18:19serhiy.storchakasetpull_requests: + pull_request1219
2017-04-10 09:17:38serhiy.storchakacreate