
classification
Title: Subprocess timeout causes output to be returned as bytes in text mode
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eryksun, giampaolo.rodola, macdjord
Priority: normal Keywords:

Created on 2021-03-08 05:41 by macdjord, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description
test_subprocess.py macdjord, 2021-03-08 05:41 Demonstration of issue
Messages (3)
msg388257 - (view) Author: Jordan Macdonald (macdjord) Date: 2021-03-08 05:41
Passing the argument `text=True` to `subprocess.run()` is supposed to mean that any captured output of the called process is automatically decoded and returned to the user as text instead of bytes.

However, if you give a timeout and that timeout expires, the raised `subprocess.TimeoutExpired` exception will have the captured output as bytes even if text mode is enabled.

Test output:
bash-5.0$ python3 test_subprocess.py
Version and interpreter information: namespace(_multiarch='x86_64-linux-gnu', cache_tag='cpython-37', hexversion=50792432, name='cpython', version=sys.version_info(major=3, minor=7, micro=7, releaselevel='final', serial=0))
Completed STDOUT Type: <class 'str'>
Completed STDOUT Content: 'Start\nDone\n'
Timeout STDOUT Type: <class 'bytes'>
Timeout STDOUT Content: b'Start\n'
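
The attached `test_subprocess.py` is not reproduced in this thread. A minimal sketch of an equivalent check might look like the following; the child program `CHILD` and the helper name `captured_on_timeout` are made up for illustration and are not taken from the attachment.

```python
import subprocess
import sys

# Hypothetical child program: prints a line, stalls, then prints another.
CHILD = "import time; print('Start', flush=True); time.sleep(10); print('Done')"


def captured_on_timeout(timeout=2.0):
    """Run the child in text mode and return the stdout carried by TimeoutExpired.

    Returns None if the child finished before the timeout or nothing was captured.
    """
    try:
        subprocess.run(
            [sys.executable, "-c", CHILD],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        # In the report above this is bytes on Linux even though text=True was passed.
        return exc.stdout
    return None
```

Calling `captured_on_timeout()` and printing `type()` of the result should show which type the exception carries on a given platform and Python version.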
msg388265 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-03-08 12:58
When the timeout expires, the output read by communicate() is incomplete, so decoding it may fail. For example, say the encoding is UTF-8, and the last multibyte character sequence (2-4 bytes) is incomplete. Maybe communicate() should always set `stdout_bytes` and `stderr_bytes` attributes on the timeout exception, and, in text mode, try to decode the output as `stdout` and/or `stderr`. If decoding fails, set the decoded value to None.
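
Until something like this exists in the library, a caller can approximate the "decode, or fall back to None" idea by hand. The sketch below is a user-side helper, not part of `subprocess`; the name `decode_partial` is hypothetical, and it does not perform the newline translation that text mode normally applies.

```python
def decode_partial(data, encoding="utf-8"):
    """Best-effort decoding of output taken from a TimeoutExpired exception.

    Returns str on success, or None if nothing was captured or the
    (possibly truncated) bytes cannot be decoded.
    """
    if data is None:
        return None
    if isinstance(data, str):
        # Already decoded, nothing to do.
        return data
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        # For example, a multibyte sequence cut off by the timeout.
        return None
```

A caller would keep `exc.stdout` as the raw bytes and treat `decode_partial(exc.stdout)` as the optional text view.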

In Windows, run() tries to complete communication, which is dysfunctional in some cases. I created bpo-43346 to propose changing the design in Windows, in order to address three cases that can cause subprocess.run() to ignore the given timeout. The proposed change also sets an incomplete read of stdout and stderr as bytes objects, regardless of text mode, because I was simply matching what POSIX does in this case.
msg388272 - (view) Author: Jordan Macdonald (macdjord) Date: 2021-03-08 17:30
Eryk Sun: Well, I think step 1 should be to update the documentation for Python 3.7 through 3.10 on `subprocess.run()` and `subprocess.TimeoutExpired` to clearly state that `TimeoutExpired.stdout` and `TimeoutExpired.stderr` will be in bytes format even if text mode is set.

If we went with the model of having `stdout_bytes` and attempting to decode into `stdout`, we'd want an option to ignore a trailing decoding error.
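
One way to ignore only a trailing decoding error, sketched here as a user-side helper and not as a proposed `subprocess` API, is an incremental decoder called with `final=False`: an incomplete sequence at the end is buffered instead of raising, while invalid bytes elsewhere still raise. The function name `decode_ignoring_truncated_tail` is hypothetical.

```python
import codecs


def decode_ignoring_truncated_tail(data, encoding="utf-8"):
    """Decode bytes, silently dropping an incomplete multibyte sequence at the end.

    Invalid bytes anywhere else still raise UnicodeDecodeError.
    """
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    # final=False tells the decoder more input may follow, so a truncated
    # trailing sequence is held back rather than reported as an error.
    return decoder.decode(data, final=False)
```

For example, UTF-8 output cut off in the middle of a two-byte character decodes to everything before that character.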
History
Date                 User      Action  Args
2022-04-11 14:59:42  admin     set     github: 87597
2021-03-30 18:57:45  eryksun   set     versions: - Python 3.7
2021-03-08 17:30:10  macdjord  set     messages: + msg388272; versions: + Python 3.7
2021-03-08 12:58:06  eryksun   set     nosy: + giampaolo.rodola, eryksun; messages: + msg388265; versions: + Python 3.8, Python 3.9, Python 3.10, - Python 3.7
2021-03-08 05:41:46  macdjord  create