
classification
Title: Python subprocess not honoring append mode for stdout on Windows
Type: behavior
Stage:
Components: IO, Windows
Versions: Python 3.11, Python 3.10, Python 3.9

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: eryksun, paul.moore, steve.dower, tim.golden, wolfgang-kuehn, zach.ware
Priority: normal
Keywords:

Created on 2021-09-17 21:11 by wolfgang-kuehn, last changed 2022-04-11 14:59 by admin.

Messages (5)
msg402096 - Author: wolfgang kuehn (wolfgang-kuehn) Date: 2021-09-17 21:11
On Windows, if you pass an existing file object in append mode to a subprocess, the subprocess does **not** really append to the file:

1. A file object with `Hello World` content is passed to the subprocess
2. The content is erased
3. The subprocess writes to the file
4. The expected output does not contain `Hello World`

Demo:

    import subprocess, time, pathlib, sys
    print(f'Caller {sys.platform=} {sys.version=}')

    pathlib.Path('sub.py').write_text("""import sys, time
    time.sleep(1)
    print(f'Callee {sys.stdout.buffer.mode=}')""")
    
    file = pathlib.Path('dummy.txt')
    file.write_text('Hello World')
    popen = subprocess.Popen([sys.executable, 'sub.py'], stdout=file.open(mode='a'))
    file.write_text('')
    time.sleep(2)
    print(file.read_text())

Expected output on Linux:

    Caller sys.platform='linux' sys.version='3.8.6'
    Callee sys.stdout.buffer.mode='wb'

Unexpected output on Windows:

    Caller sys.platform='win32' sys.version='3.8.6'
    NULNULNULNULNULNULNULNULNULNULNULCallee sys.stdout.buffer.mode='wb'

Note that the expected output is given on Windows if the file is opened in the subprocess via `sys.stdout = open('dummy.txt', 'a')`. So it is definitely a subprocess thing.
msg402112 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-09-18 00:24
In Windows, the C runtime's append mode doesn't use the native file append mode. The CRT just opens the file in read-write mode and seeks to the end, initially and before each write. 
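
The seek-then-write emulation described above can be sketched in portable Python. This is only an illustration under stated assumptions: `crt_style_append_write` and `demo_emulated_append` are hypothetical names, not CRT or Python APIs, and the descriptor is deliberately opened without `O_APPEND` so that the helper alone is responsible for the appending.

```python
import os
import tempfile


def crt_style_append_write(fd: int, data: bytes) -> int:
    """Hypothetical helper: seek to the end of the file, then write."""
    os.lseek(fd, 0, os.SEEK_END)
    return os.write(fd, data)


def demo_emulated_append() -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dummy.txt")
        with open(path, "wb") as f:
            f.write(b"Hello World")

        # Read-write descriptor without O_APPEND; O_BINARY exists only on Windows.
        fd = os.open(path, os.O_RDWR | getattr(os, "O_BINARY", 0))
        try:
            crt_style_append_write(fd, b"!")        # lands after "Hello World"
            with open(path, "wb"):                  # truncate via another handle
                pass
            crt_style_append_write(fd, b"Callee")   # re-seeks to the new end
        finally:
            os.close(fd)

        with open(path, "rb") as f:
            return f.read()


print(demo_emulated_append())
```

Because the helper repositions before every write, the second write follows the truncation instead of leaving a gap. The seek and the write are two separate operations, so this kind of emulation is not atomic with respect to other writers.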

subprocess.Popen() doesn't implement inheritance of file descriptors, so the CRT append mode gets lost, but the native file pointer is of course retained. After file.write_text('') in the parent overwrites the file, writes to stdout in the child process begin at the file pointer. Null values are written to the initial 11 bytes.
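
The effect can be reproduced in a single process, without subprocess, as a rough model of what the child sees. The sketch below assumes that a plain descriptor with a retained offset behaves like the inherited handle; `write_at_stale_offset` is a hypothetical name used only for illustration.

```python
import os
import tempfile


def write_at_stale_offset(initial: bytes, payload: bytes) -> bytes:
    """Write through a non-append descriptor whose offset predates a truncation."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dummy.txt")
        with open(path, "wb") as f:
            f.write(initial)

        # No O_APPEND: the descriptor is positioned at the end exactly once.
        fd = os.open(path, os.O_WRONLY | getattr(os, "O_BINARY", 0))
        try:
            os.lseek(fd, 0, os.SEEK_END)
            # Stand-in for file.write_text('') in the parent: truncate the file
            # through a separate open while fd keeps its old offset.
            with open(path, "wb"):
                pass
            os.write(fd, payload)
        finally:
            os.close(fd)

        with open(path, "rb") as f:
            return f.read()


data = write_at_stale_offset(b"Hello World", b"Callee")
print(len(data), data)
```

The write starts at the old offset of 11, so the file system fills the gap in front of it with null bytes, matching the `NUL` run in the reported output.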

You could use os.spawnv() to inherit file descriptors. The C runtime uses reserved fields in the process STARTUPINFO in order to marshal the fd data to the child process. This depends on the child implementing a compatible scheme for C file descriptors, which of course is the case for applications that use MSVC. A downside is that this won't be thread-safe since you'll have to temporarily redirect stdout in the current process. For example:

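    import os, sys

    old_stdout = os.dup(1)  # save the current stdout descriptor
    with file.open(mode='a') as f:
        # fd 1 now refers to the append-mode file; the duplicate outlives f
        os.dup2(f.fileno(), 1)
    try:
        os.spawnv(os.P_NOWAIT, sys.executable, ['python', 'sub.py'])
    finally:
        os.dup2(old_stdout, 1)  # restore the original stdout
        os.close(old_stdout)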

Alternatively, you could open the file with native append mode and wrap the OS handle in a file descriptor. For example:

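    import os
    import sys
    import subprocess
    import msvcrt
    from win32file import (CreateFile, FILE_SHARE_READ, FILE_SHARE_WRITE,
                           OPEN_EXISTING)
    from ntsecuritycon import FILE_APPEND_DATA

    # Request only append access, so every write lands at the end of the file.
    h = CreateFile(os.fspath(file), FILE_APPEND_DATA,
            FILE_SHARE_READ | FILE_SHARE_WRITE, None, OPEN_EXISTING, 0, None)
    # Detach the handle from the PyHANDLE so the file descriptor owns it.
    fd = msvcrt.open_osfhandle(h.Detach(), 0)

    with open(fd) as f:
        popen = subprocess.Popen([sys.executable, 'sub.py'], stdout=f)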
msg402159 - Author: wolfgang kuehn (wolfgang-kuehn) Date: 2021-09-19 18:23
The second alternative (wrapping the OS handle in a file descriptor) works like a charm, and is the less invasive workaround code-wise.

Thanks for the magic, which I must respect as such :-)

Still I feel that this is a bug since (a) it shows an unexpected behaviour, and (b) it fails to abstract away the operating system.
msg402169 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-09-19 20:20
There's nothing we could easily change to use the native OS append mode or support inheritance of file descriptors in subprocess. A general solution would be to give up on C file descriptors and CRT functions such as _wopen(), read(), etc, and instead implement our own I/O and filesystem support in the os and io modules, based on native handles and the Windows API. This change has been discussed, but I don't know whether or not it's just a pipe dream.
msg402252 - Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-09-20 16:01
> This change has been discussed, but I don't know whether or not it's just a pipe dream

Still a bit of a pipe dream, but I'll add this issue as something that would be fixed by it (to stack up against the list of things that would be broken...)
History
Date | User | Action | Args
2022-04-11 14:59:50 | admin | set | github: 89400
2021-09-20 16:01:43 | steve.dower | set | messages: + msg402252
2021-09-19 20:20:56 | eryksun | set | messages: + msg402169
2021-09-19 18:23:49 | wolfgang-kuehn | set | messages: + msg402159
2021-09-18 00:24:26 | eryksun | set | versions: + Python 3.9, Python 3.10, Python 3.11, - Python 3.8; nosy: + paul.moore, tim.golden, eryksun, zach.ware, steve.dower; messages: + msg402112; components: + Windows
2021-09-17 21:11:43 | wolfgang-kuehn | create |