classification
Title: subprocess: support bytes program name (POSIX)
Type: Stage:
Components: Library (Lib), Unicode Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: 8514 8603 Superseder:
Assigned To: Nosy List: Arfrever, gregory.p.smith, haypo
Priority: normal Keywords: needs review, patch

Created on 2010-04-23 22:42 by haypo, last changed 2011-03-03 12:55 by haypo. This issue is now closed.

Files
File name Uploaded Description Edit
subprocess_bytes_program-5.patch haypo, 2010-05-17 00:21
Messages (21)
msg104059 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-04-23 22:42
While fixing #8391, I realized that subprocess doesn't support bytes program name if it's not an absolute path:
-------
$ ./python
>>> import subprocess
>>> subprocess.call([b'echo'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/SHARE/SVN/py3k/Lib/subprocess.py", line 449, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/home/SHARE/SVN/py3k/Lib/subprocess.py", line 681, in __init__
    restore_signals, start_new_session)
  File "/home/SHARE/SVN/py3k/Lib/subprocess.py", line 1116, in _execute_child
    for exe in executable_list)
  File "/home/SHARE/SVN/py3k/Lib/subprocess.py", line 1115, in <genexpr>
    executable_list = tuple(fs_encode(exe)
  File "/home/SHARE/SVN/py3k/Lib/subprocess.py", line 1114, in <genexpr>
    for dir in path_list)
  File "/home/SHARE/SVN/py3k/Lib/posixpath.py", line 75, in join
    if b.startswith(sep):
TypeError: expected an object with the buffer interface
[62826 refs]
-------

I'm working on a patch.
msg104061 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-04-23 23:16
Attached patch (draft version) fixes this issue.
msg104065 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-04-23 23:48
I proposed the creation of fs_encode() and fs_decode() functions in os.path: see #8514. There functions would simplify my patch, but also make it correct on all OS (Windows, Mac, Linux).
msg104208 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-04-26 10:37
My patch changes:
 * os._execvpe(): support bytes type for the file argument (program name)
 * os.get_exec_path(): support bytes type for the PATH environment variable
 * Popen._execute_child(): decode the executable name before encoding the arguments (if the program name is not an absolute path)
msg104922 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-04 11:10
Bytes program name problem should be splitted in two parts:

 (a) subprocess.call([b'env']) and subprocess.call([b'env'], env={'PATH': '/usr/bin'}): bytes program and unicode environ
 (b) bytes program and bytes environ

Part (a)
--------

(a) would benefit from the creation of os(.path).fsencode() function (issue #8514).

Part (b)
--------

If the creation of os.environb is accepted (#8603), I think that subprocess should also be modified to support pure bytes environ. Mix bytes and unicode variables in env would lead to mojibake and headaches.

So I propose the creation of an "envb" (environment bytes) argument to Popen constructor. (b) would be written as:

   subprocess.call([b'env'], envb={b'PATH': b'/usr/bin'})

or

   envb = os.environb.copy()
   envb[b'PATH'] = b'/usr/bin'
   subprocess.call([b'env'], envb=envb)

(and it will not be possible to define env and envb at the same time)

In this case, we will need a pure bytes version of os.get_exec_path(), and os._execvpe() should have a "bytes environment" mode (add an optional argument).

--

I have a patch fixing both problems, parts (a) and (b), but the patch depends on os.environb. I prefer to wait until os.environb is accepted to submit my patch.
msg105169 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-06 22:49
New patch (issue8513_partA.patch):
 - don't *decode*, only encode (str->bytes)
 - only patch os._execvpe() for POSIX
msg105170 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-06 22:50
> If the creation of os.environb is accepted (#8603), I think that
> subprocess should also be modified to support pure bytes environ. 

I fixed #8603 and opened #8640 for subprocess and envb.
msg105262 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2010-05-08 04:53
I think your partA patch makes sense.

It would benefit from fsencode/fsdecode functions rather than manually doing the 'surrogateescape' thing everywhere.

Also, could you add a unittest for os._execvpe to test its behavior?
msg105276 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-08 11:01
> I think your partA patch makes sense.

I can fix part A and B in two commits.

> It would benefit from fsencode/fsdecode functions rather 
> than manually doing the 'surrogateescape' thing everywhere.

I choosed to drop the idea of fsdecode() (patch for part A doesn't decode bytes anymore, it only encodes str). #8514 has now a short and simple patch. I'm waiting for the final decision on #8514 to commit the part A.

> Also, could you add a unittest for os._execvpe to test its behavior?

os._execvpe() is a protected function. issue8513_partA.patch includes a test for subprocess. test_subprocess in two twices: with _posixsubprocess (C module) and with the pure Python implementation. The pure Python implementation calls os._execvpe(), that's why I patched also this function in my patch ;-)
msg105282 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-08 11:21
Update the patch to use os.fsencode().
msg105322 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2010-05-08 18:06
Build on the os._execvpe unittest I added in py3k r81001.  Protected functions are perfectly fine things to unittest.
msg105365 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-09 03:02
> Build on the os._execvpe unittest I added in py3k r81001.

The test fails on Windows.

======================================================================
FAIL: test_internal_execvpe (test.test_os.ExecTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\test_os.py", line 683, in test_internal_execvpe
    exec_stubbed.calls)
AssertionError: Lists differ: [('execve', '/p/f', (['-a'], {... != [('execve', '/p\\f', (['-a'], ...

First differing element 0:
('execve', '/p/f', (['-a'], {'spam': 'beans'}))
('execve', '/p\\f', (['-a'], {'spam': 'beans'}))

- [('execve', '/p/f', (['-a'], {'spam': 'beans'})),
?                ^

+ [('execve', '/p\\f', (['-a'], {'spam': 'beans'})),
?                ^^

-  ('execve', '/pp/f', (['-a'], {'spam': 'beans'}))]
?                 ^

+  ('execve', '/pp\\f', (['-a'], {'spam': 'beans'}))]
?                 ^^
msg105370 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2010-05-09 03:37
my bad.  hopefully r81019 fixes that.
msg105843 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-16 02:15
New patch fixing this issue:
 - os.get_exec_path() type now depends on the OS: str on Windows, bytes on Unix
 - os.get_exec_path(None) uses os.environ on Windows, os.environb on Unix
 - os.get_exec_path(env) uses 'PATH' or b'PATH' key, but raise a ValueError if both keys exist
 - add os.supports_bytes_environ flag (boolean)
 - os._execvpe() and subprocess._execute_child() canonicalize the program to bytes
 - test "not path.supports_unicode_filenames" to check if fsencode() should be defined and used (instead of testing name != "nt")

I'm not proud of the change on os.get_exec_path() result type, I'm not sure that it's the right thing to do. But at least, the patch works :-)
msg105882 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-16 21:46
Oops, I forgot to add the new patch: subprocess_bytes_program-3.patch.
msg105888 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-16 23:14
I asked on #python-dev about os.get_exec_path() result type. As expected, the answer was "It's a really bad idea".

So here is a new version of my patch. Summary of the patch version 4:
 - subprocess.Popen() and os._execvpe() support bytes program name
 - os.get_exec_path() now supports b'PATH' key and bytes value
 - add os.supports_bytes_environ flag: "True if the native OS type of the environment is bytes (eg. False on Windows)"

Changes since the version 3:
 - document the new os.supports_bytes_environ flag
 - os.get_exec_path() result type is always str (decode bytes to str using sys.getfilesystemencoding() + surrogateescape)
 - path.supports_unicode_filenames is False on Windows 9x but Windows 9x native type is unicode and so fsencode() should not be defined on this OS: revert the fsencode() test (if not path.supports_unicode_filenames => if name != 'nt')
msg105890 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-17 00:19
I forgot to update test_os: patch version 5.
msg105891 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-17 00:31
Ok, everything is ready for this issue: os.environb and os.fsencode() are commited, and test_os is prepared to test os._execvpe() change. I'm just waiting for a review.

Execpt the change on os.get_exec_path(), the patch is now simple.
msg105962 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-18 10:36
Hum, os.get_exec_path() has no test for the new features (support b'PATH' key and bytes value).
msg105991 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-05-18 17:25
Fixed by r81291 + r81292 (py3k).

The final commit contains much more tests ;-) I will watch the buildbot next hours and block the commit in 3.1.
msg129964 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-03-03 12:55
Oh, I forget subprocess.call(b'ls'): command as a byte string => fixed in Python 3.3 (r88720).
History
Date User Action Args
2011-03-03 12:55:08hayposetnosy: gregory.p.smith, haypo, Arfrever
messages: + msg129964
2010-05-18 17:25:07hayposetstatus: open -> closed
resolution: fixed
messages: + msg105991
2010-05-18 10:36:29hayposetmessages: + msg105962
2010-05-17 00:31:40hayposetkeywords: + needs review

messages: + msg105891
2010-05-17 00:21:12hayposetfiles: + subprocess_bytes_program-5.patch
2010-05-17 00:21:06hayposetfiles: - subprocess_bytes_program-4.patch
2010-05-17 00:21:03hayposetfiles: - issue8513_partA-fsencode.patch
2010-05-17 00:21:00hayposetfiles: - issue8513_partA.patch
2010-05-17 00:20:02hayposetfiles: - subprocess_bytes_program-4.patch
2010-05-17 00:19:57hayposetfiles: + subprocess_bytes_program-4.patch
2010-05-17 00:19:40hayposetmessages: + msg105890
2010-05-16 23:15:13hayposetfiles: - subprocess_bytes_program-3.patch
2010-05-16 23:14:54hayposetfiles: + subprocess_bytes_program-4.patch

messages: + msg105888
2010-05-16 21:46:36hayposetfiles: + subprocess_bytes_program-3.patch

messages: + msg105882
2010-05-16 02:22:48haypolinkissue8640 dependencies
2010-05-16 02:15:06hayposetmessages: + msg105843
2010-05-09 03:37:15gregory.p.smithsetmessages: + msg105370
2010-05-09 03:02:36hayposetmessages: + msg105365
2010-05-08 18:07:00gregory.p.smithsetmessages: + msg105322
2010-05-08 11:21:57hayposetfiles: + issue8513_partA-fsencode.patch

messages: + msg105282
2010-05-08 11:01:36hayposetmessages: + msg105276
2010-05-08 04:53:15gregory.p.smithsetmessages: + msg105262
2010-05-07 00:17:00hayposetdependencies: + Add fsencode() functions to os module
2010-05-06 22:50:46hayposetmessages: + msg105170
2010-05-06 22:49:46hayposetmessages: + msg105169
2010-05-06 22:48:41hayposetfiles: - subprocess_bytes_program.patch
2010-05-06 22:48:24hayposetfiles: + issue8513_partA.patch
2010-05-04 11:10:40hayposetdependencies: + Create a bytes version of os.environ and getenvb(), - Add fsencode() functions to os module
messages: + msg104922
2010-04-30 16:06:35hayposettitle: subprocess: support bytes program name -> subprocess: support bytes program name (POSIX)
2010-04-26 10:37:02hayposetmessages: + msg104208
2010-04-24 14:52:07Arfreversetnosy: + Arfrever
2010-04-23 23:48:01hayposetdependencies: + Add fsencode() functions to os module
messages: + msg104065
2010-04-23 23:16:01hayposetfiles: + subprocess_bytes_program.patch
keywords: + patch
messages: + msg104061
2010-04-23 22:42:56gregory.p.smithsetnosy: + gregory.p.smith
2010-04-23 22:42:05haypocreate