classification
Title: corrupted contents of stdout result from subprocess call under a trace
Type: Stage:
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, techtonik
Priority: normal Keywords:

Created on 2012-06-05 14:40 by techtonik, last changed 2013-03-22 22:18 by techtonik. This issue is now closed.

Files
File name Uploaded Description Edit
file2.py techtonik, 2012-06-05 14:40
Messages (17)
msg162347 - (view) Author: anatoly techtonik (techtonik) Date: 2012-06-05 14:40
This code dumps a lot of internal source code info when executed with trace as:

python -m trace --trace file2.py


---[file2.py]
import subprocess

def ret():
  output = subprocess.check_output(['hg', 'id', '-nib'])
  print( output )
  print( output )
  print( output.strip() )
  print( output.strip().split() )

ret()
---

Normally, the last line of the output is:
['e67793ec2995+', '2162+', 'default']

But during trace call it is:
['subprocess.py(1180):', '_dup2(errwrite,', '2)', '---', 'modulename:', 'subprocess,', 'funcname:', '_dup2', 'subprocess.py(1174):', 'if', 'a', '==', 'b:', 'subprocess.py(1176):', 'elif', 'a', 'is', 'not', 'None:', 'subprocess.py(1184):', 'closed', '=', '{', 'None', '}',
...
etc.
msg162348 - (view) Author: anatoly techtonik (techtonik) Date: 2012-06-05 14:45
The behavior repeats with PyPy 1.8.0, and doesn't repeat with Python 3.
msg162350 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2012-06-05 15:06
Python2 has a pure python implementation of subprocess, with separate calls to fork() and exec(); so the output of the subprocess contains the trace of the forked Python interpreter, until the exec() system call.
Python3 has a C implementation: _posixsubprocess.fork_exec() won't call the Python interpreter between fork() and exec(), and no trace is emitted.

Yes, Python3 is better :)
msg162352 - (view) Author: anatoly techtonik (techtonik) Date: 2012-06-05 15:34
OMG. =) Is it possible to fix it somehow? Postpone output collection until the very exec() call? Or provide a different stream for collecting output?
msg162354 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2012-06-05 16:09
It's not possible to delay output collection: output starts being collected just after the call os.dup2(cp2write, 1) and before exec(), we need to os.close it.  The trace module will already have emitted some lines.

Process output by definition goes to the C stdout, but you could redirect sys.stdout (the Python one, used by print) to something else, like a StringIO so that trace.py does not pollute the subprocess output.

In 3.2, suprocess.py states that the pure Python implementation (the one use in 2.7) is not thread safe.  We could add that it's not reentrant or trace-friendly as well... this is not surprising IMO.
msg162361 - (view) Author: anatoly techtonik (techtonik) Date: 2012-06-05 17:46
Is it possible to pause trace and resume after the exec call? Between some missing instructions from subprocess internals and traceability of the Python programs I'd choose the latter.

It can be even more actual for people tracing program execution in the process of porting to Python 3.
msg162362 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2012-06-05 18:05
Sure, just save/restore the trace function around your calls to subprocess.
But I fail to see how a Python2-only issue can affect a Python3 port.
msg162370 - (view) Author: anatoly techtonik (techtonik) Date: 2012-06-05 20:16
The fix for saving/restoring trace function belong to subprocess module. Python2 only issue will be actual when you have to port Python2 only app where it works ok to the Python3 where it doesn't work even if it executes successfully.
msg162375 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2012-06-05 22:04
Sorry, what does not work in Python3?
And how can the trace module help you?
msg178138 - (view) Author: anatoly techtonik (techtonik) Date: 2012-12-25 15:17
The trace module helps to gather program flow statistics and see the differences in patterns for large systems when program evolves. In particular, components ported to Python 3 should still behave the same way on Python 2.

Right now the behavior under the trace is broken.
msg184952 - (view) Author: anatoly techtonik (techtonik) Date: 2013-03-22 10:14
This use case is actual for various kind of asynchronous operations.
msg184953 - (view) Author: anatoly techtonik (techtonik) Date: 2013-03-22 10:20
I can't repeat this on Windows. Looks like it is a Linux issue, because of forks.
msg184956 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013-03-22 10:46
So, does this page report an issue with Python2, or Python3?
msg184960 - (view) Author: anatoly techtonik (techtonik) Date: 2013-03-22 11:26
I can not repeat this neither on Python 2.7 nor on Python 3 on Windows Vista. Need to run this on a Linux system to confirm.
msg184963 - (view) Author: anatoly techtonik (techtonik) Date: 2013-03-22 12:44
Python 3 is not affected. Python 2.7, Linux only.
msg184965 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013-03-22 13:07
So the issue can be closed:

- subprocess was rewritten in C on Python3 for good reasons: it's wrong (and dangerous) to run Python code between fork() and exec(); this bug report is another example.

- This change is too large to be merged to 2.7.

- There is a backport available on pypi: https://pypi.python.org/pypi/subprocess32
Yes it's a distinct module, but it's possible to install it in place of the standard one if needed::

    import sys
    import subprocess32 as subprocess
    sys.modules['subprocess'] = subprocess
msg185015 - (view) Author: anatoly techtonik (techtonik) Date: 2013-03-22 22:18
And how to find all such issues for Python 2 that people need to be aware of in this tracker?
History
Date User Action Args
2013-03-22 22:18:17techtoniksetmessages: + msg185015
2013-03-22 13:07:43amaury.forgeotdarcsetstatus: open -> closed
resolution: wont fix
messages: + msg184965
2013-03-22 12:44:45techtoniksetmessages: + msg184963
2013-03-22 11:26:29techtoniksetmessages: + msg184960
2013-03-22 10:46:21amaury.forgeotdarcsetmessages: + msg184956
2013-03-22 10:20:35techtoniksetmessages: + msg184953
2013-03-22 10:15:03techtoniksettitle: corrupted output from subprocess call under a trace -> corrupted contents of stdout result from subprocess call under a trace
2013-03-22 10:14:10techtoniksetmessages: + msg184952
title: trace corrupts return result on chained execution -> corrupted output from subprocess call under a trace
2012-12-25 15:17:52techtoniksetmessages: + msg178138
2012-06-05 22:04:34amaury.forgeotdarcsetmessages: + msg162375
2012-06-05 20:16:43techtoniksetmessages: + msg162370
2012-06-05 18:05:45amaury.forgeotdarcsetmessages: + msg162362
2012-06-05 17:46:00techtoniksetmessages: + msg162361
2012-06-05 16:09:09amaury.forgeotdarcsetmessages: + msg162354
2012-06-05 15:34:11techtoniksetmessages: + msg162352
2012-06-05 15:06:24amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg162350
2012-06-05 14:45:23techtoniksetmessages: + msg162348
components: + Library (Lib)
2012-06-05 14:40:11techtonikcreate