classification
Title: subprocess: support undecodable current working directory on POSIX OS
Type: Stage:
Components: Library (Lib), Windows Versions: Python 3.1, Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, loewis, vstinner
Priority: normal Keywords: patch

Created on 2010-04-14 00:55 by vstinner, last changed 2010-04-20 20:19 by loewis. This issue is now closed.

Files
File name Uploaded Description Edit
posixsubprocess_cwd-2.patch vstinner, 2010-04-15 23:17
Messages (11)
msg103105 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-14 00:55
In py3k, subprocess uses _posixsubprocess.fork_exec() function. This function uses surrogateescape error handler for most arguments, but not for the current working directory (cwd).

Attached patch uses PyUnicode_FSConverter() as done for other arguments. I don't know if PyUnicode_FSConverter() result is always a PyBytes, so I added an assertion. It should be fixed.
msg103106 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-14 00:55
See also #8391.
msg103274 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-15 23:17
New patch supporting bytearray type (remove the buggy assertion).
msg103380 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-16 23:57
Commited in py3k (r80135), blocked in 3.1 (r80136).

Python 3.1 has no _posixsubprocess module, it uses os.chdir() which accepts bytes, bytearray and str with surrogates.
msg103559 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-04-19 08:15
It does not work on Windows:

>>> subprocess.Popen("c:/windows/notepad.exe", cwd=b'c:/temp')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\afa\python\py3k-1\lib\subprocess.py", line 681, in __init__
    restore_signals, start_new_session)
  File "D:\afa\python\py3k-1\lib\subprocess.py", line 892, in _execute_child
    startupinfo)
TypeError: must be string or None, not bytes

The function sp_CreateProcess() in PC/_subprocess.c should probably be fixed.
And please add unit tests...
msg103563 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-19 09:03
> And please add unit tests...

I'm thinking on this. I plan to write tests for all my last changes about surrogates.
msg103565 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-19 09:09
> It does not work on Windows

I always consider Windows as a special case because Windows uses unicode internally. Byte string are converted quickly to unicode using the current locale.

My patch was for UNIX/BSD which uses byte string internally.

sp_CreateProcess() in PC/_subprocess.c uses CreateProcessW. To support byte string, we should use the byte string version of CreateProcess ("CreateProcessA" ?) or convert current directory to unicode (using the current locale). The second solution minimize the changes.
msg103568 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-04-19 09:22
PEP 277 explicitly states that unicode strings should be passed to wide-character functions, whereas byte strings use "standard" functions.
This is done in posixmodule.c, for example.

The "current locale" is a moving thing.
msg103572 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-19 10:02
> PEP 277 explicitly states that unicode strings should be passed to
>  wide-character functions, whereas byte strings use "standard"
> functions. This is done in posixmodule.c, for example.

CreateProcessW takes a lot of arguments. Should we CreateProcessA or CreateProcessW if some arguments are byte string and other are unicode string?

> The "current locale" is a moving thing.

Don't CreateProcessA do the same thing? Convert byte string to unicode using the current locale. To use the right words, by "current locale" I mean the "mbcs" Windows codec.
msg103611 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-04-19 15:59
"mbcs" is not a fixed encoding and may change between Windows sessions, see the Rationale in PEP277 http://www.python.org/dev/peps/pep-0277/

The mixed case is interesting. We could use CreateProcessW when at least one string is Unicode, and CreateProcessA when all strings are bytes.
OTOH this is unlikely, because for example the environment will be passed as unicode; so I'm OK to use CreateProcessW in all cases, and use mbcs to convert bytes.
msg103745 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-04-20 20:19
Amaury, I fail to see how the error you get on Windows is related to this issue. AFAICT, neither the issue nor the patch affects Windows at all.

Closing the issue as fixed.

If you think there is also an issue on Windows, please submit a new bug report. However, I don't think the error you get is a bug: what you see is correct behavior.
History
Date User Action Args
2010-04-20 20:19:48loewissetstatus: open -> closed

nosy: + loewis
messages: + msg103745

resolution: fixed
2010-04-19 15:59:11amaury.forgeotdarcsetmessages: + msg103611
2010-04-19 10:02:54vstinnersetmessages: + msg103572
2010-04-19 09:23:00amaury.forgeotdarcsetmessages: + msg103568
2010-04-19 09:09:33vstinnersetmessages: + msg103565
2010-04-19 09:03:29vstinnersetmessages: + msg103563
2010-04-19 08:15:53amaury.forgeotdarcsetstatus: closed -> open

nosy: + amaury.forgeotdarc
messages: + msg103559

resolution: fixed -> (no value)
2010-04-16 23:57:14vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg103380
2010-04-15 23:17:28vstinnersetfiles: - posixsubprocess_cwd.patch
2010-04-15 23:17:22vstinnersetfiles: + posixsubprocess_cwd-2.patch

messages: + msg103274
2010-04-14 01:09:07vstinnerlinkissue8242 dependencies
2010-04-14 00:55:50vstinnersetmessages: + msg103106
2010-04-14 00:55:22vstinnercreate