
random.py/os.urandom robustness #41809

Closed
majid mannequin opened this issue Apr 6, 2005 · 21 comments

Labels
stdlib Python modules in the Lib dir

Comments

@majid
Mannequin

majid mannequin commented Apr 6, 2005

BPO 1177468
Nosy @gvanrossum, @loewis, @birkenfeld, @birkenfeld, @rhettinger
Files
  • os-urandomfd.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/birkenfeld'
    closed_at = <Date 2005-07-04.17:17:09.000>
    created_at = <Date 2005-04-06.01:03:31.000>
    labels = ['library']
    title = 'random.py/os.urandom robustness'
    updated_at = <Date 2005-07-04.17:17:09.000>
    user = 'https://bugs.python.org/majid'

    bugs.python.org fields:

    activity = <Date 2005-07-04.17:17:09.000>
    actor = 'georg.brandl'
    assignee = 'georg.brandl'
    closed = True
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2005-04-06.01:03:31.000>
    creator = 'majid'
    dependencies = []
    files = ['1665']
    hgrepos = []
    issue_num = 1177468
    keywords = []
    message_count = 21.0
    messages = ['24890', '24891', '24892', '24893', '24894', '24895', '24896', '24897', '24898', '24899', '24900', '24901', '24902', '24903', '24904', '24905', '24906', '24907', '24908', '24909', '24910']
    nosy_count = 10.0
    nosy_names = ['gvanrossum', 'loewis', 'georg.brandl', 'georg.brandl', 'rhettinger', 'jafo', 'majid', 'lcaamano', 'astrand', 'pocket_aces']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue1177468'
    versions = ['Python 2.4']

    @majid
    Mannequin Author

    majid mannequin commented Apr 6, 2005

    Python 2.4.1 now uses os.urandom() to seed the random
    number generator. This is mostly an improvement, but
    can lead to subtle regression bugs.

    os.urandom() will open /dev/urandom on demand, e.g.
    when random.Random.seed() is called, and keep it alive
    as os._urandomfd.

    It is standard programming practice for a daemon
    process to close file descriptors it has inherited from
    its parent process, and if it closes the file
    descriptor corresponding to os._urandomfd, the os
    module is blissfully unaware and the next time
    os.urandom() is called, it will try to read from a
    closed file descriptor (or worse, a new one opened
    since), with unpredictable results.

    My recommendation would be to make os.urandom() open
    /dev/urandom each time and not keep a persistent file
    descriptor. This will be slightly slower, but more
    robust. I am not sure how I feel about a standard
    library function stealing a file descriptor slot forever,
    especially when os.urandom() is probably going to be
    called only once in the lifetime of a program, when the
    random module is seeded.
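
    (To make the failure mode concrete, here is a minimal sketch, assuming
    Python 2.4.1's Unix implementation where the cached descriptor is exposed
    as os._urandomfd; the os.close() call stands in for a daemon's blanket
    descriptor cleanup.)

    import os

    os.urandom(4)               # opens /dev/urandom on demand and caches the
                                # descriptor as os._urandomfd

    # a daemon's "close everything I inherited" loop closes the cached fd
    # without the os module noticing:
    os.close(os._urandomfd)

    os.urandom(4)               # OSError: [Errno 9] Bad file descriptor, or a
                                # silent read from whatever file reused that slot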

    @majid majid mannequin closed this as completed Apr 6, 2005
    @majid majid mannequin assigned birkenfeld Apr 6, 2005
    @majid majid mannequin added the stdlib Python modules in the Lib dir label Apr 6, 2005
    @majid
    Mannequin Author

    majid mannequin commented Apr 6, 2005

    There are many modules that have a dependency on random, for
    instance os.tempnam(), and a program could well
    inadvertently use it before closing file descriptors.

    @jafo
    Mannequin

    jafo mannequin commented Apr 6, 2005

    Just providing some feedback:

    I'm able to reproduce this. Importing random will cause
    this file descriptor to be opened. Opening urandom on every
    call could lead to unacceptable syscall overhead for some.
    Perhaps there should be a "urandomcleanup" method that
    closes the file descriptor, and then random could get the
    bytes from urandom(), and clean up after itself?
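
    (For illustration, a rough sketch of what such a hypothetical
    urandomcleanup() could look like from user code; it assumes the cached
    descriptor is reachable as os._urandomfd and that os.urandom() re-opens
    /dev/urandom once the cache is cleared. Neither the function nor that
    behaviour is promised by the stdlib.)

    import os

    def urandomcleanup():
        # hypothetical helper: close the descriptor that os.urandom() cached,
        # so nothing is left behind for daemonizing code to stumble over
        fd = getattr(os, '_urandomfd', None)
        if fd is not None:
            try:
                os.close(fd)
            except OSError:
                pass
            # assumes os.urandom() re-opens /dev/urandom when the cache is cleared
            os._urandomfd = None

    seed_bytes = os.urandom(16)   # grab the bytes...
    urandomcleanup()              # ...then immediately release the descriptor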

    Personally, I only clean up the file descriptors I have
    allocated when I fork a new process. On the one hand I
    agree with you about sucking up a fd in the standard
    library, but on the other hand I'm thinking that you just
    shouldn't be closing file descriptors for stuff you'll be
    needing. That's my two cents on this bug.

    @loewis
    Mannequin

    loewis mannequin commented Apr 6, 2005

    To add robustness, it would be possible to catch read errors
    from _urandomfd, and try reopening it if it got somehow closed.
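
    (An illustrative, user-level sketch of that idea, written in the Python
    2-era style used elsewhere in this thread; the caching and retry logic
    here are an assumption about how it could be done, not the patch that was
    eventually applied.)

    import os

    _urandom_fd = None    # illustrative cache, separate from os._urandomfd

    def robust_urandom(n):
        # keep a cached fd, but if reading fails (e.g. EBADF because someone
        # closed it behind our back), drop the cache, reopen and retry once
        global _urandom_fd
        for attempt in (1, 2):
            if _urandom_fd is None:
                _urandom_fd = os.open("/dev/urandom", os.O_RDONLY)
            try:
                data = ""
                while len(data) < n:
                    data = data + os.read(_urandom_fd, n - len(data))
                return data
            except OSError:
                _urandom_fd = None    # descriptor is unusable; forget it
        raise OSError("cannot read from /dev/urandom")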

    @jafo
    Mannequin

    jafo mannequin commented Apr 6, 2005

    Yeah, I was thinking the same thing. It doesn't address the
    consumed file handle, but it does address the "robustness"
    issue. It complicates the code, but should work.

    @majid
    Mannequin Author

    majid mannequin commented Apr 6, 2005

    Unfortunately, catching exceptions is not sufficient - the
    file descriptor may have been reassigned. Fortunately in my
    case, to a socket which raised ENOSYS, but if it had been a
    normal file, this would have been much harder to trace
    because reading from it would cause weird errors for readers
    of the reassigned fd without triggering an exception in
    os.urandom() itself.

    As for not closing file descriptors you haven't opened
    yourself, if the process is the result of a vfork/exec (in
    my case Python processes started by a cluster manager, sort
    of like init), the child process has no clue what file
    descriptors, sockets or the like it has inherited from its
    parent process, and the safest course is to close them all.
    Indeed, that's what W. Richard Stevens recommends in
    "Advanced Programming for the UNIX environment".

    As far as I can tell, os.urandom() is used mostly to seed
    the RNG in random.py, and thus is not a high-frequency
    operation. It is going to be very hard to document this
    adequately for coders to defend against - in my case, the
    problem was being triggered by os.tempnam() from within
    Webware's PSP compiler. There are so many functions that
    depend on random (sometimes in non-obvious ways), you can't
    flag them all so their users know they should use
    urandomcleanup.

    One possible solution would be for os.py to offer a
    go_daemon() function that implements the fd closing, signal
    masking, process group and terminal disassociation required
    by true daemons. This function could take care of internal
    book-keeping like calling urandomcleanup.
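
    (A sketch of what such a hypothetical go_daemon() could look like; the
    name and the keep_fds hook are inventions for illustration, following the
    usual double-fork recipe, and this is not an existing os function.)

    import os
    import signal

    def go_daemon(keep_fds=()):
        # classic Unix daemonization: detach from the controlling terminal,
        # then close inherited descriptors, skipping any the caller needs;
        # library book-keeping (like the urandomcleanup() idea above) would
        # also hook in here
        if os.fork() > 0:
            os._exit(0)            # original parent exits
        os.setsid()                # new session, no controlling terminal
        if os.fork() > 0:
            os._exit(0)            # first child exits; the daemon continues
        os.chdir("/")
        os.umask(0)
        signal.signal(signal.SIGHUP, signal.SIG_IGN)
        for fd in range(3, 256):   # keep stdin/stdout/stderr (0-2)
            if fd in keep_fds:
                continue
            try:
                os.close(fd)
            except OSError:
                pass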

    @jafo
    Mannequin

    jafo mannequin commented Apr 6, 2005

    The child is a copy of the parent. Therefore, if in the
    parent you open a few file descriptors, those are the ones
    you should close in the child. That is exactly what I've
    done in the past when I forked a child, and it has worked
    very well.

    I suspect Stevens would make an exception to his guideline
    in the event that closing a file descriptor results in
    library routine failures.

    @lcaamano
    Mannequin

    lcaamano mannequin commented Apr 19, 2005

    We're facing this problem. We're thinking of patching our os.py
    module to always open /dev/urandom on every call. Does
    anybody know if this would have any bad consequences other
    than the obvious system call overhead?

    BTW, here's the traceback we get. As you probably can guess,
    something called os.urandom before we closed all file descriptors
    in the daemonizing code and it then failed when os.urandom
    tried to use the cached fd.

     Traceback (most recent call last):
       File "/opt/race/share/sw/common/bin/dpmd", line 27, in ?
         dpmd().run()
       File "Linux/CommandLineApp.py", line 336, in run
       File "Linux/daemonbase.py", line 324, in main
       File "Linux/server.py", line 61, in addServices
       File "Linux/dpmd.py", line 293, in __init__
       File "Linux/threadutils.py", line 44, in start
       File "Linux/xmlrpcd.py", line 165, in createThread
       File "Linux/threadutils.py", line 126, in __init__
       File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/tempfile.py", line 423, in NamedTemporaryFile
         dir = gettempdir()
       File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/tempfile.py", line 262, in gettempdir
         tempdir = _get_default_tempdir()
       File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/tempfile.py", line 185, in _get_default_tempdir
         namer = _RandomNameSequence()
       File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/tempfile.py", line 121, in __init__
         self.rng = _Random()
       File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/random.py", line 96, in __init__
         self.seed(x)
       File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/random.py", line 110, in seed
         a = long(_hexlify(_urandom(16)), 16)
       File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/os.py", line 728, in urandom
         bytes += read(_urandomfd, n - len(bytes))
     OSError: [Errno 9] Bad file descriptor

    @majid
    Mannequin Author

    majid mannequin commented Apr 19, 2005

    Modifying the system os.py is not a good idea. A better
    work-around is to skip the /dev/urandom fd when you are
    closing all fds. This is the code we use:

    import os
    import sys

    def close_fd():
        # close all inherited file descriptors
        start_fd = 3
        # Python 2.4.1 and later use /dev/urandom to seed the random module's RNG;
        # a file descriptor is kept internally as os._urandomfd (created on demand
        # the first time os.urandom() is called), and should not be closed
        try:
            os.urandom(4)
            urandom_fd = getattr(os, '_urandomfd', None)
        except AttributeError:
            urandom_fd = None
        if '-close_fd' in sys.argv:
            start_fd = int(sys.argv[sys.argv.index('-close_fd') + 1])
        for fd in range(start_fd, 256):
            if fd == urandom_fd:
                continue
            try:
                os.close(fd)
            except OSError:
                pass

    @gvanrossum
    Member

    I recommend closing this as invalid. The daemonization code
    is clearly broken.

    @jafo
    Mannequin

    jafo mannequin commented Apr 19, 2005

    Perhaps the best way to resolve this would be for the
    standard library to provide code that either performs the
    daemonization, or at least does the closing of the
    sockets that may be done as part of daemonizing; that way
    it's clear what the "right" way to do it is. Thoughts?

    @lcaamano
    Mannequin

    lcaamano mannequin commented Apr 20, 2005

    Clearly broken? Hardly.

    Daemonization code is not the only place where it's recommended
    and standard practice to close file descriptors.

    It's unreasonable to expect Python programs to keep track of all
    the possible file descriptors the Python library might cache, to
    make sure they don't close them in their daemonization
    routines ... which is, btw, contrary to standard unix programming practices.

    Are there any other file descriptors we should know about?

    @jafo
    Mannequin

    jafo mannequin commented Apr 20, 2005

    Conversely, I would say that it's unreasonable to expect
    other things not to break if you go through and close file
    descriptors that the standard library has opened.

    @lcaamano
    Mannequin

    lcaamano mannequin commented Apr 20, 2005

    I don't think so. One of the rules in libc, the standard C library,
    is that it cannot cache file descriptors for that same reason. This
    is not new.

    @lcaamano
    Mannequin

    lcaamano mannequin commented Apr 20, 2005

    Here's a reference:

    http://tinyurl.com/b8mk3

    The relevant post:

    ============================================

    On 25 Feb 2001 10:48:22 GMT Casper H.S. Dik - Network
    Security Engineer <Casper....@holland.sun.com> wrote:

    | Solaris at various times used a cached /dev/zero fd both for
    | mapping thread stacks and even one for the runtime linker.
    | The runtime linker was mostly fine, but the thread library did
    | have problems with people closing fds. We since added
    | MAP_ANON and no longer require open("/dev/zero"). The caching
    | of fds was gotten rid of before that.
    |
    | There are valid reasons to close all fds; e.g., if you really don't
    | want to inherit any (you're a daemon and don't care).
    |
    | In most cases, though, the "close all" stuff performed by shells
    | and such at startup serves no purpose. (Other than causing
    | more bugs.)

    So the dilemma is that closing fds can cause problems and
    leaving them open can cause problems, when a forked child
    does this. This seems to tell me that hiding fds in libraries
    and objects is a bad idea because processes need to know what
    is safe to close and/or what needs to be left open.

    ======================================

    If the python library had some module or global list of opened file
    descriptors, then it would be OK to expect programs to keep
    those open across fork/exec. Something like:

     from os import opened_fds

    And then it would be no problem to skip those when closing fds.
    Otherwise, your nice daemon code that deals with os._urandomfd
    will break later on when somebody caches another fd
    somewhere else in the standard library.

    Also, the proposed os.daemonize() function that knows about its
    own fds would also work.

    Still, the most robust solution is not to cache open fds in the
    library or perhaps catch the EBADF exception and reopen.

    There are several solutions but closing this bug as invalid doesn't
    seem an appropriate one.

    @pocketaces
    Mannequin

    pocketaces mannequin commented Jun 8, 2005

    We ran across this problem when we upgraded to 2.4. We use
    python embedded in a multi-threaded C++ process and use
    multiple subinterpreters. When a subinterpreter shuts down
    and the os module unloads, os._urandomfd is not closed
    because it is not a file object but rather just an integer.
    As such, after a while, our process had hundreds of
    dangling open file descriptors to /dev/urandom.

    I would think, at the very least, if this were a file
    object, it would be closed when the module was unloaded (the
    deallocator for fileobject closes the file). However, that
    doesn't make it any easier for those who are forking
    processes. Probably the best bet is to close it after
    reading the data. If you need a "high performance, multiple
    seek" urandom, just open /dev/urandom yourself.

    Either way, this bug is not invalid and needs to be addressed.

    My 2 cents..
    --J

    @astrand
    Mannequin

    astrand mannequin commented Jul 4, 2005

    This bug is a major problem for us as well. This bug also
    breaks the subprocess module. Try, for example:

    subprocess.Popen(["ls"], close_fds=1, preexec_fn=lambda:
    os.urandom(4))

    I agree with lcaamano; the library should NOT cache a file
    descriptor by default. Correctness and robustness are more
    important than speed.

    Has anyone really been able to verify that the performance
    cost of opening /dev/urandom each time is a problem? If it is,
    we could add a new function to the os module that activates
    the fd caching. If you have been calling this function, you
    have indicated that you are aware of the problem and will
    not close the cached fd. Legacy code will continue to function.

    @birkenfeld
    Member

    Attaching a patch for Lib/os.py, fixing this on Unix.

    On Windows, a completely different method is used for
    urandom, so I think it is not relevant here.

    Please review.

    @rhettinger
    Contributor

    I'm on Windows so cannot be of much use on the patch review.
    It looks fine to me and I agree that something ought to be
    done. MvL reviewed and posted the original patch so he may
    be better able to comment on this patch. Alternatively,
    Peter A. can review/approve it.

    @loewis
    Mannequin

    loewis mannequin commented Jul 4, 2005

    The patch is fine, please apply - both to the trunk and to 2.4.

    @birkenfeld
    Member

    Committed as Lib/os.py r1.87, r1.83.2.3.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022