Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add stdout redirection tool to contextlib #60009

Closed
rhettinger opened this issue Aug 29, 2012 · 36 comments
Closed

Add stdout redirection tool to contextlib #60009

rhettinger opened this issue Aug 29, 2012 · 36 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@rhettinger
Copy link
Contributor

BPO 15805
Nosy @warsaw, @birkenfeld, @rhettinger, @ncoghlan, @abalkin, @pitrou, @vstinner, @ezio-melotti, @alex, @msabramo, @vajrasky

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = 'https://github.com/rhettinger'
closed_at = <Date 2013-10-10.07:47:39.379>
created_at = <Date 2012-08-29.04:26:23.827>
labels = ['type-feature', 'library']
title = 'Add stdout redirection tool to contextlib'
updated_at = <Date 2013-10-12.16:48:47.903>
user = 'https://github.com/rhettinger'

bugs.python.org fields:

activity = <Date 2013-10-12.16:48:47.903>
actor = 'georg.brandl'
assignee = 'rhettinger'
closed = True
closed_date = <Date 2013-10-10.07:47:39.379>
closer = 'rhettinger'
components = ['Library (Lib)']
creation = <Date 2012-08-29.04:26:23.827>
creator = 'rhettinger'
dependencies = []
files = []
hgrepos = []
issue_num = 15805
keywords = []
message_count = 36.0
messages = ['169335', '169336', '169338', '169339', '169400', '171327', '171329', '184312', '193375', '193392', '193401', '193459', '193460', '193472', '193475', '193476', '193488', '193489', '193491', '193492', '193495', '193498', '193501', '193502', '193536', '193574', '193575', '193576', '193578', '193583', '193587', '193599', '199366', '199372', '199435', '199598']
nosy_count = 13.0
nosy_names = ['barry', 'georg.brandl', 'rhettinger', 'ncoghlan', 'belopolsky', 'pitrou', 'vstinner', 'ezio.melotti', 'alex', 'nikratio', 'python-dev', 'Marc.Abramowitz', 'vajrasky']
pr_nums = []
priority = 'low'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue15805'
versions = ['Python 3.4']

@rhettinger
Copy link
Contributor Author

The technique of temporarily redirecting stdout could be encapsulated in a context manager.

print('This goes to stdout')
with RedirectStdout(sys.stderr):
     print('This goes to stderr')
     print('So does this')
print('This goes to stdout')

The print function already supports redirection but it much be done for every single call to print(). The context manager let's the redirection apply to a batch of statements.

The context manager is also useful with existing tools that don't currently provide output redirection hooks:

   from collections import namedtuple
   with open('model.py', 'w') as module:
       with RedirectStdout(module):
            namedtuple('Person', ['name', 'age', 'email'], verbose=True)


   import dis
   with open('disassembly.txt', 'w') as f:
       with RedirectStdout(f):
            dis.dis(myfunc)

A possible implementation is:

class RedirectStdout:
    ''' Create a context manager for redirecting sys.stdout
        to another file.
    '''
    def __init__(self, new_target):
        self.new_target = new_target

    def __enter__(self):
        self.old_target = sys.stdout
        sys.stdout = self.new_target
        return self

    def __exit__(self, exctype, excinst, exctb):
        sys.stdout = self.old_target

@rhettinger rhettinger added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Aug 29, 2012
@ncoghlan
Copy link
Contributor

We actually use a variant of this idea in the test suite (http://docs.python.org/dev/library/test#test.support.captured_stdout)

It would be pretty easy to combine the two concepts by defaulting the redirection to a new StringIO instance if no other destination is given:

print('This goes to stdout')
with redirect_stdout(sys.stderr):
     print('This goes to stderr')
     print('So does this')
print('This goes to stdout')
with redirect_stdout() as s:
     print('This goes to the io.StringIO instance "s"')
     print('So does this')
print('This goes to stdout')

@alex
Copy link
Member

alex commented Aug 29, 2012

Sounds like a special case of a small part of mock. Not sure that this observation is significant though.

@ncoghlan
Copy link
Contributor

Sure, but by targeting a specific use case you can make it really trivial to use.

@brettcannon
Copy link
Member

So Alex's point is valid: the examples in the unittest.mock.patch docs shows how to do this (http://docs.python.org/dev/py3k/library/unittest.mock.html#patch). So this could be simplified to:

def redirect_stdout(replacement=None):
  return unittest.mock.patch('sys.stdout', replacement if replacement is not None else io.StringIO())

Now that being said, this is extremely common, so Nick's point of just going ahead and providing the helper makes sense. But I wouldn't put it in contextlib but instead in unittest.mock since it is mocking out sys.stdout.

@rhettinger
Copy link
Contributor Author

I like Nick's proposed variant and think it should go in contextlib, not the mocking library. Redirection has a lot of use cases that has nothing to do with mocking-up test suites. Contextlib is about general purpose context-managers that apply in many situations (the closing() context manager is a good example).

@ncoghlan
Copy link
Contributor

I'd actually be inclined to make it the full trio: redirect_stdin, redirect_stdout, redirect_stderr.

Mostly because I don't see an especially compelling reason to privilege redirecting stdout over the other two standard streams, and the "pass in the stream name" approach is just ugly (e.g. we don't have "sys.stdstream['stdin']", we have sys.stdin).

There are plenty of command line apps that have both -i and -o options (to select input and output files), and "2>1" is a pretty common shell redirection.

Agreed that the general purpose nature of standard stream redirection makes it a good fit for contextlib, though.

@rhettinger rhettinger assigned rhettinger and unassigned ncoghlan Mar 16, 2013
@vstinner
Copy link
Member

If such context manager is added, it should be documented that it does not work with subprocess or C functions writing directly into the file descriptor 1.

For such issues, I'm using dup2(). Example from my pysandbox project:

@contextlib.contextmanager
def capture_stdout():
    import sys
    import tempfile
    stdout_fd = sys.stdout.fileno()
    with tempfile.TemporaryFile(mode='w+b') as tmp:
        stdout_copy = os.dup(stdout_fd)
        try:
            sys.stdout.flush()
            os.dup2(tmp.fileno(), stdout_fd)
            yield tmp
        finally:
            sys.stdout.flush()
            os.dup2(stdout_copy, stdout_fd)
            os.close(stdout_copy)

@pitrou
Copy link
Member

pitrou commented Jul 19, 2013

I don't think this has anything to do in contextlib (which isn't a library of context managers, but a library of tools to work with context managers), it probably belongs in the io module instead.

@msabramo
Copy link
Mannequin

msabramo mannequin commented Jul 20, 2013

As it happens, I wrote a similar context manager to Victor's recently for a setup.py because I wanted to suppress compiler errors that are output to the console by distutils.ccompiler.CCompiler.has_function. As Victor mentioned, for this to work with subprocesses, you need to go a little more low-level and mess around with file descriptors. Here's my function:

http://marc-abramowitz.com/archives/2013/07/19/python-context-manager-for-redirected-stdout-and-stderr/

(Maybe distutils.ccompiler.CCompiler.has_function should redirect its own output automatically, but that's another issue)

But then I got to thinking that it could be made a bit more powerful and the syntax could be a little nicer. So I have this code that I'm experimenting with:

https://gist.github.com/msabramo/6043474

But critiquing my own function, I wonder if it's trying to do too much in one function and it's using keyword arguments where it could be using the with statement better. So I might like Nick's API better.

@ncoghlan
Copy link
Contributor

+1 for io as an intuitive home for a basic version that redirects the
current process output only (and is documented as doing so). While I
like the idea of offering a more sophisticated version that affects
subprocesses as well, I think that would be a very surprising thing to
do by default and should be a separate issue (perhaps proposing such
an addition to the subprocess module).

@msabramo
Copy link
Mannequin

msabramo mannequin commented Jul 21, 2013

I agree also that io is a good place for the basic version that doesn't do file descriptor stuff and maybe the fancy file descriptor stuff should be a separate issue and should go in subprocess.

To move this along and generate more discussion, I took code from Nick earlier in the thread and made a little patch and a test program and put it in a gist:

https://gist.github.com/msabramo/6049140

If folks like that, I can convert the test program into an automated test.

What do you guys think?

@msabramo
Copy link
Mannequin

msabramo mannequin commented Jul 21, 2013

Oops, Nick => Brett.

@ncoghlan
Copy link
Contributor

A good start, but:

  1. io is too low level to depend on unittest (or even contextlib), as
    anything it imports will be imported automatically at interpreter startup.
    The context manager will need to be written out directly as a class with
    the appropriate methods.

  2. The name based API should accept the unqualified name and throw a value
    error if an unexpected name is passed in.

  3. The stdin replacement should have a separate optional keyword-only
    "data" argument to request wrapping with StringIO, rather than duck typing
    the replacement value.

@msabramo
Copy link
Mannequin

msabramo mannequin commented Jul 21, 2013

Thanks Nick! I'll work on applying your suggestions a little later. And I'll add a note about it not working with subprocesses because I realized that I forgot to do that.

Regarding redirect_stdfile, which is presumably what you meant by "name-based API", I started out with unqualified names like 'stdout', etc. I went with the qualified name as an attempt at making it more general (i.e.: if folks perhaps wanted to use this for something other than sys files) - I don't know whether or not this is useful - I don't have a use case in mind (anyone else got one?) - so I don't know if qualified names brings anything useful or not but that was the thinking. Thoughts on this are welcome.

@rhettinger
Copy link
Contributor Author

I'm thinking that PrintRedirect is a better name because it coincides with the intent of the primary use case. The more computer-sciency you make the name, the more it becomes opaque to everyday users.

@ncoghlan
Copy link
Contributor

As Raymond noted, we should resist the temptation to generalise this too much - generalisation almost always comes at the cost of making any *specific* case harder (consider the difference between the complexity of the "support any URL scheme" model in urllib and urllib2 and the relatively API simplicity of the higher level HTTP/HTTPS focused requests module).

I'm wary of calling it "PrintRedirect" though, as I believe that's actively misleading: swapping sys.stdout for something else affects more than just the print builtin, and if a particular print call uses the "file" parameter, then the redirection of sys.stdout will have no effect.

"Redirection" in general is a bit of a misnomer for what the context manager is doing.

So, here's a concrete API suggestion and implementation:

    def replace_stdin(stream=None, *, data=None):
        if data is not None:
            if stream is not None:
                raise ValueError("Cannot specify both stream & data")
            stream = StringIO(data)
        return ReplaceStandardStream('stdin', stream)

    def replace_stdout(stream=None):
        return ReplaceStandardStream('stdout', stream)

    def replace_stderr(stream=None):
        return ReplaceStandardStream('stderr', stream)

    class ReplaceStandardStream:
        """Context manager to temporarily replace a standard stream
    On entry, replaces the specified sys module stream attribute
    ('stdin', 'stdout' or 'stderr') with the supplied IO stream
    object.

    On exit, restores the previous value of the sys module
    attribute.

    Note: as this context manager modifies sys module attributes
    directly, it is NOT thread-safe.
    """
    def __init__(self, attr, stream):
        if attr not in ('stdin', 'stdout', 'stderr'):
            raise ValueError("{.200!r} is not a standard stream name (expected one of: 'stdin', 'stdout', or 'stderr'".format(attr))
        self._attr_to_replace = attr
        self._old_stream = None
        if stream is None:
            self._replacement_stream = StringIO()
        else:
            self._replacement_stream = stream
        def __enter__(self):
            if self._old_stream is not None:
                raise RuntimeError("Cannot reenter {!r}".format(self))
            self._old_stream = getattr(sys, self._attr_to_replace)
            stream = self._replacement_stream
            setattr(sys, self._attr_to_replace, stream)
            return stream

        def __exit__(self):
            stream = self._old_stream
            if stream is None:
                raise RuntimeError("Never entered {!r}".format(self))
            self._old_stream = None
            setattr(sys, self._attr_to_replace, stream)

Cross linking from the print documentation to io.replace_stdout and from the input documentation to io.replace_stdin may also be a good idea.

@ncoghlan
Copy link
Contributor

And reviewing my own draft - the convenience functions would need docstrings too, and the docstrings should mention that a new StringIO instance is created by default and that "as name" can be used to get access to that object.

@rhettinger
Copy link
Contributor Author

Nick, it seems to me that you're going way over-board with an otherwise simple idea. I'm aiming for something that looks much like my original posting. That has been tried out on my students with great success.

@msabramo
Copy link
Mannequin

msabramo mannequin commented Jul 22, 2013

I like Nick's version. I don't know if __exit__ really needs error checking, but I like the API. For me, it strikes a good balance between being intuitive and being general enough to do all the stuff I'd like to do.

Should the docstrings mention that it only works within the process and doesn't affect subprocesses?

I also am having second thoughts about putting it in the io module. Now I'm thinking that sys makes more sense because that's where stdin, stdout, and stderr live.

@ncoghlan
Copy link
Contributor

OK, Raymond and I had a chat on IRC about this, and I've come back around to the idea that the simple "contextlib.redirect_stdout" model as Raymond originally proposed is the way to go. There are a few pieces to that rationale:

  1. Why in contextlib?

The io module isn't something people normally need to think about - we mostly hide it from them since they can just call the open builtin. Similarly, most new users don't need to touch the sys module - it's all hidden behind higher level abstractions (like input, print and argparse).

However, contextlib is something everyone will be exposed to, both to turn generators into context managers, but also for the utility context managers closing() and ignored().

It also comes down to trusting Raymond's opinion as an experienced Python teacher that contextlib is a more discoverable location than io for this particular piece of functionality.

  1. Why only stdout? Why not stdin and stderr?

Raymond pointed out that the case for this kind of redirection helper is much weaker for stdin and stderr. Wanting to control where "print" and similar operations write to is quite common, but writing to stderr is often specifically about bypassing typical redirections and faking input through stdin is typically only done for testing purposes.

Adding stdlib APIs without compelling use cases isn't a good idea, even when they seem like an "obvious" generalisation.

In this case, outside the CPython test suite, I couldn't think of any situation where the limited versions of these helpers would have actually been useful to me, and in the test suite case, the typical answer would be "use a mock" (that just wasn't an option for CPython until recently when unittest.mock was added to the standard library).

Instead, I now think both of those cases would be better handled by the more sophisticated API discussed above that would deal with these things at the file descriptor level. So I suggest we split that API out as a new issue targetting the subprocess API.

  1. Why not include automatic creation of StringIO objects?

Including such a default actually makes the API harder to document and explain. Composing a "you must supply a stream object" API with io.StringIO is easy to get local redirection is easy. If we support implicit creation, then we need to explain that it happens, and that the "as" clause can be used to retrieve the implicitly created object.

Raymond plans to work on a patch for this simple version of the API.

@nikratio
Copy link
Mannequin

nikratio mannequin commented Jul 22, 2013

I think stdout redirection is very useful, and I'm actually have a very similar context manager in my own code.

However, I'd like to raise the question if using a context manager for this purpose should really be "officially blessed" in this way.

To me, the with statement signals that effects are constrained to the managed block. But redirecting sys.stdout is a change with global scope - the redirection is not restricted to execution of the with block, it affects every other thread that's running at the same time. This effect is obvious if you wrote the redirection context manager yourself, but if you're using code from the standard library, this may be surprising.

I don't have a better proposal, but I just wanted to mention this...

@ncoghlan
Copy link
Contributor

Yeah, the docs will need to note that it isn't thread safe. However, non thread-safe constructs are often still incredibly useful for scripting use cases.

@vstinner
Copy link
Member

Can someone propose a patch instead of inline code?

@warsaw
Copy link
Member

warsaw commented Jul 22, 2013

In general, I like where this is going. I agree that a stdout redirector is
probably the most common case, and for that, it almost always (for me)
redirects to an open file. The use case for stderr redirection is usually,
but not always, to redirect stderr to stdout, but I agree that that is much
more limited. stdin redirection is much more rare, mostly a testing device,
and better done with mocking.

A suggestion about the name. Thinking about how it will read in a
with-statement:

    with stdout_to(some_file):
        print('hello')

Since this is all about convenience, I'd mildly suggest an API similar to
built-in open(). I'd rather write this:

    with stdout_to('/tmp/debug.log', 'w', encoding='utf-8'):
        print('hello')

than this:

    with open('/tmp/debug.log', 'w', encoding='utf-8') as tmp_stdout:
        with stdout_to(tmp_stdout):
            print('hello')

stdout_to() could optionally take a single which would be an already open file
object.

Anyway, no doubt you'll paint this bikeshed your own particular color. Mine
only cost $0.02.

@ncoghlan
Copy link
Contributor

I'd prefer to keep the separate stream argument rather than duplicating the
signature of open. Separation of concerns and all that :)

@abalkin
Copy link
Member

abalkin commented Jul 23, 2013

It would be nice if this context manager had an option to redirect the file descriptor 0 rather than just sys.stdout. (For those familiar with py.test, I am asking for an equivalent of --capture=fd functionality.)

Unlike patching sys.stdout, which is within reach to most python users, redirecting fd 0 (and restoring it) is not a trivial task.

@abalkin
Copy link
Member

abalkin commented Jul 23, 2013

In my post "fd 0" should have been "fd 1", of course. (Proving that it is not trivial to get it right:-)

@ncoghlan
Copy link
Contributor

Alexander, please read the earlier comments on the issue: we're deliberately *not* doing that. Such functionality is more advanced, and more appropriate for an API in the subprocess module. Anyone interested in exploring that option further should create a separate issue.

This API is about making a *particular* common use case genuinely trivial. Generalising it *isn't* desirable, as any such generalisations will be sufficiently complex that people are better off just rolling their own solution from first principles.

@abalkin
Copy link
Member

abalkin commented Jul 23, 2013

Yes, I did miss Victor's dup2() comment. (It looks like I did not subscribe to this issue from the start and missed early discussion - sorry.)

The simple feature is not very useful for me. I have to deal with too many cases of misguided code like this:

def write_xyz(output=sys.stdout):
   ...

for which

with RedirectStdout(...):
    write_xyz()

will not work.

I will create a separate issue once I have a concrete proposal, but with respect to this specific issue, I think it is better to provide a recipe in contextlib documentation for something like this:

@contextlib.contextmanager
def redirect_stdout(stream):
    old_stdout = sys.stdout
    sys.stdout = stream
    yield
    sys.stdout = old_stdout

With the proposed RedirectStdout, I think many users will want some tweaks and will copy the "from scratch" implementation instead of discovering contextmanager.

@ncoghlan
Copy link
Contributor

Can we go paint bikesheds somewhere else now, please? Raymond has persuaded me as contextlib maintainer that this small addition is worth making as a way of factoring out repeated calls to print with a file redirection in simple user scripts where thread safety isn't a concern.

I think more sophisticated tools are also potentially worthwhile, but *not here*.

@warsaw
Copy link
Member

warsaw commented Jul 23, 2013

On Jul 23, 2013, at 04:24 AM, Alexander Belopolsky wrote:

@contextlib.contextmanager
def redirect_stdout(stream):
old_stdout = sys.stdout
sys.stdout = stream
yield
sys.stdout = old_stdout

Make that:

@contextlib.contextmanager
def redirect_stdout(stream):
    old_stdout = sys.stdout
    sys.stdout = stream
    try:
        yield
    finally:
        sys.stdout = old_stdout

and I'll be able to remove those 8 lines of code from all my other code bases
:)

@python-dev
Copy link
Mannequin

python-dev mannequin commented Oct 10, 2013

New changeset 63a1ee94b3ed by Raymond Hettinger in branch 'default':
Issue bpo-15805: Add contextlib.redirect_stdout()
http://hg.python.org/cpython/rev/63a1ee94b3ed

@vajrasky
Copy link
Mannequin

vajrasky mannequin commented Oct 10, 2013

Nice.

My only complain is the dis.dis example. We don't have to use redirect_stdout context manager.

+ # How to capture disassembly to a string
+
+ import dis
+ import io
+
+ f = io.StringIO()
+ with redirect_stdout(f):
+ dis.dis('x**2 - y**2')
+ s = f.getvalue()

dis.dis supports file object natively. We can do this instead:
dis.dis('x**2 - y**2', file=f)

@ezio-melotti
Copy link
Member

I think this should also be added to the whatsnew.

Regarding the examples, isn't it easier to say that:
with redirect_stdout(sys.stderr):
print('error')
is equivalent to
print('error', file=sys.stderr)
?

I think that in most of the cases users are redirecting something that is being print()ed, and this example gets the point across (even if the "file" arg can be used for this specific case, it is not always the case if print() is called by a function). Capturing help() and especially did.dis() output don't seem to me realistic/intuitive use cases for redirect_stdout().

@birkenfeld
Copy link
Member

Whatsnew: yes please.

As for your second point, I assume Raymond wanted to exemplify usage with an "unfortunate" API that prints to stderr with no option to change it. It just turned out that dis() is not one of those APIs.

For purposes of print(), you're almost always better off using file=x on each print you do.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

10 participants