classification
Title: subprocess.Popen doesn't support unicode on Windows
Type: behavior Stage: resolved
Components: Library (Lib), Unicode, Windows Versions: Python 2.7
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: Valentin LAB, akira, ezio.melotti, haypo, loewis, peter0, terry.reedy
Priority: normal Keywords:

Created on 2013-10-15 05:43 by peter0, last changed 2017-03-15 10:39 by Valentin LAB. This issue is now closed.

Messages (9)
msg199976 - (view) Author: Peter Graham (peter0) Date: 2013-10-15 05:43
On Windows, subprocess.Popen requires the executable name and working directory to be ascii.  This is because it calls CreateProcess instead of CreateProcessW.
msg200334 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-10-18 23:56
The docs say that args should be a string or sequence of strings. It also says "On Windows, the class uses the Windows CreateProcess() function.", so it is not a bug for it to do that. However, CreateProcessW sounds like a good (and overdue, and overlooked) enhancement, if it indeed is still not used in 3.4. (Have you checked?)

On 2.7, 'string' may or may not include unicode strings. On 3.x, it definitely does, and may or may not include bytes. If 3.x restricts unicode strings to ascii text (at least on Windows), the doc should say so.
msg214497 - (view) Author: Akira Li (akira) * Date: 2014-03-22 17:54
I've checked the source code for 3.4; `subprocess` uses `_winapi.CreateProcess` on Windows [1] that in turn uses `CreateProcessW` [2]. CreateProcessA is not used.

`Popen` should already support Unicode on Windows though I don't see explicit tests for non-ascii arguments or arguments that can't be encoded using `mbcs` character encoding.

[1]: http://hg.python.org/cpython/file/3.4/Lib/subprocess.py#l1063
[2]: http://hg.python.org/cpython/file/3.4/Modules/_winapi.c#l579
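
The missing test mentioned above could look roughly like the following Python 3 sketch. It passes a non-ASCII argument to a child interpreter and checks that it comes back unchanged. The helper name `echo_argument` is hypothetical and not part of `subprocess` or the CPython test suite, and the round trip assumes the parent and child agree on the encoding used for command-line arguments.

```python
import subprocess
import sys


def echo_argument(arg):
    """Run a child Python process that writes its first argument back as UTF-8."""
    # The child encodes explicitly so the result does not depend on the
    # console or pipe encoding of the platform.
    child_code = (
        "import sys; "
        "sys.stdout.buffer.write(sys.argv[1].encode('utf-8'))"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code, arg],
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("utf-8")


if __name__ == "__main__":
    # Hypothetical check: the argument should survive the round trip.
    print(echo_argument("\u0107") == "\u0107")
```

A fuller test would probably also cover the executable name and the `cwd` parameter, which are the two values the original report is about.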
msg214507 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-03-22 19:03
Peter, can you post 1 or more failing examples?  With some, I might be persuaded that this is a bug.

Victor, I know you have worked in this area. Any opinions?
msg214513 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-03-22 20:17
I don't understand this issue. Python 3 uses CreateProcessW() and so
supports the full Unicode range. I suggest closing this issue as invalid.

Popen() parameters are Unicode strings. It may support bytes strings, but
passing bytes to OS functions on Windows is deprecated, especially for
filenames.
msg214517 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2014-03-22 21:06
The issue is really that Terry had removed 2.7 from the list of affected versions, and added 3.4 instead. The original issue was reported against 2.7, where the observation that it uses CreateProcess is correct:

http://hg.python.org/cpython/file/babb9479b79f/PC/_subprocess.c#l463

The OP's observation that this restricts the supported executable names to be ascii is incorrect. Instead, any string in the CP_ACP ("ANSI") encoding of the system would work, which in practice allows access to all directories on a typical installation.
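
To make the CP_ACP point concrete, here is a minimal sketch of checking whether a name can be represented in a given code page. The function name `fits_code_page` and the default code page `cp1252` are assumptions for illustration only. On Windows, Python's `mbcs` codec corresponds to the system's ANSI code page, but it is not available on other platforms, so an explicit code page is used here.

```python
def fits_code_page(text, code_page="cp1252"):
    """Return True if every character of text can be encoded in code_page."""
    try:
        # Strict encoding: any character outside the code page raises.
        text.encode(code_page)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    for name in ("caf\u00e9", "\u0107"):
        print(repr(name), fits_code_page(name), fits_code_page(name, "cp1250"))
```

This strict check is conservative. Windows itself may substitute a similar-looking character instead of failing when it converts to the ANSI code page, so a name that fails here is one that would not reach CreateProcessA() intact.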

I'd close this as "won't fix", except that I recall a recent discussion that lack of Unicode support in some API is considered a bug in 2.7.

So: patches welcome (not really - I wouldn't mind if this stays open until 2.7 is properly retired).
msg214520 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-03-22 21:59
You're right, the bug/enhancement border for 2.7 unicode issues has been shifted a bit (or perhaps made a bit more consistent). A failing case is still needed.
msg214825 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-03-25 13:46
>  The original issue was reported against 2.7

Oh... Ok :-)

It's tricky to fix this issue in Python 2.7 because you have to choose which function is used: CreateProcessA() (bytes) or CreateProcessW() (Unicode). To use CreateProcessW(), you have to decode bytes parameters. Python 3 has the os.fsencode()/os.fsdecode() functions, and similar functions in C. The "mbcs" Python codec is strict by default, but it now supports any Python error handler. This support was improved in each Python 3 release.

Python 2 has PyUnicode_DecodeMBCSStateful() and PyUnicode_EncodeMBCS() which use the default Windows behaviour. I'm not sure that using PyUnicode_DecodeMBCSStateful() (or directly MultiByteToWideChar) + CreateProcessW() is exactly the same as calling CreateProcessA().

Should we support CreateProcessA() and CreateProcessW(), and use one or the other depending on the type of the parameters?
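
The type-based choice raised in this question can be sketched as follows. This is only an illustration written in Python 3 terms (where `str` is text and `bytes` is bytes; in Python 2 the corresponding types would be `unicode` and `str`). The function name `pick_api` and the rule it applies are hypothetical, not something CPython implements, and nothing here calls the Windows API.

```python
import os


def pick_api(args):
    """Return (api_name, normalized_args) for a type-based dispatch.

    Hypothetical rule: if any argument is text, use the wide API and
    decode the remaining bytes arguments; otherwise stay with the ANSI API.
    """
    if any(isinstance(arg, str) for arg in args):
        # os.fsdecode() decodes bytes with the filesystem encoding and
        # returns str arguments unchanged.
        return "CreateProcessW", [os.fsdecode(arg) for arg in args]
    return "CreateProcessA", list(args)


if __name__ == "__main__":
    print(pick_api([b"git", b"log"]))
    print(pick_api(["git", b"log"]))
```

The decoding step is where the concern about matching CreateProcessA() exactly comes in: the result depends on which encoding and error handler are used for the bytes arguments.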

IMO such a change requires too much work, and it is not enough to get full Unicode support for filenames. You have to fix much more code. I already did all this work in Python 3 (in 3.1, 3.2 and then 3.3). I suggest porting your application to Python 3 if you want full Unicode support. Using Unicode in Python 3 is natural and just works fine.

So I still suggest closing this issue as wontfix.

--

Similar discussions on Python 3:
http://bugs.python.org/issue8393#msg103565
http://bugs.python.org/issue8514#msg104224
msg289664 - (view) Author: Valentin LAB (Valentin LAB) Date: 2017-03-15 10:39
For anyone else wanting a workaround, this is the code I used to leverage ``ctypes`` and redo what the latest Python 3 code is doing. Any comments are welcome; this is my first go at ``ctypes``. I haven't extensively tested the code, so use at your own risk. The Gist might evolve if people find issues:

https://gist.github.com/vaab/2ad7051fc193167f15f85ef573e54eb9

Tests/use cases are simple: use ``subprocess.Popen(..)`` to issue a ``git commit -am "ć"`` and display the changelog. You can also call another Python script, which would then need this other recipe (this time to get ``sys.argv`` correctly encoded on Windows):

http://code.activestate.com/recipes/572200/

Hope that helps.
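
The linked recipe is not reproduced here. A common approach to the ``sys.argv`` problem is to ask Windows for the wide-character command line through ``ctypes``, using the ``GetCommandLineW`` and ``CommandLineToArgvW`` functions. The sketch below illustrates that approach under a few assumptions: the function name ``unicode_argv`` is hypothetical, the code is written for Python 3 (where ``sys.argv`` is already text, so it is only illustrative), and it falls back to ``sys.argv`` on other platforms.

```python
import ctypes
import sys


def unicode_argv():
    """Return the script arguments as text, re-read from Windows if possible."""
    if sys.platform != "win32":
        # Assumption: outside Windows, sys.argv is used unchanged.
        return list(sys.argv)

    from ctypes import wintypes

    get_command_line = ctypes.windll.kernel32.GetCommandLineW
    get_command_line.argtypes = []
    get_command_line.restype = wintypes.LPCWSTR

    split_command_line = ctypes.windll.shell32.CommandLineToArgvW
    split_command_line.argtypes = [
        wintypes.LPCWSTR,
        ctypes.POINTER(ctypes.c_int),
    ]
    split_command_line.restype = ctypes.POINTER(wintypes.LPWSTR)

    argc = ctypes.c_int(0)
    argv = split_command_line(get_command_line(), ctypes.byref(argc))
    if not argv:
        # The call failed; fall back to what Python already parsed.
        return list(sys.argv)

    # The full command line also holds the interpreter and its options.
    # Keep only the trailing entries that line up with sys.argv.
    # Note: the returned array is not released with LocalFree() here.
    start = max(argc.value - len(sys.argv), 0)
    return [argv[i] for i in range(start, argc.value)]


if __name__ == "__main__":
    print(unicode_argv())
```

The slicing assumes the script is started as ``python script.py args``; invocations such as ``python -c`` do not line up with ``sys.argv`` in the same way.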
History
Date                 User          Action  Args
2017-03-15 10:39:06  Valentin LAB  set     nosy: + Valentin LAB; messages: + msg289664
2014-03-25 21:33:58  terry.reedy   set     status: open -> closed; resolution: wont fix; stage: test needed -> resolved
2014-03-25 13:46:08  haypo         set     messages: + msg214825
2014-03-22 21:59:56  terry.reedy   set     type: enhancement -> behavior; messages: + msg214520
2014-03-22 21:06:08  loewis        set     nosy: + loewis; messages: + msg214517; versions: + Python 2.7, - Python 3.4, Python 3.5
2014-03-22 20:17:39  haypo         set     messages: + msg214513
2014-03-22 19:03:57  terry.reedy   set     versions: + Python 3.5; nosy: + haypo; messages: + msg214507; stage: test needed
2014-03-22 17:54:30  akira         set     nosy: + akira; messages: + msg214497
2013-10-18 23:56:42  terry.reedy   set     versions: + Python 3.4, - Python 2.7; nosy: + terry.reedy; messages: + msg200334; type: behavior -> enhancement
2013-10-15 05:43:16  peter0        create