classification
Title: ctypes.dlopen() doesn't support surrogates
Type: behavior Stage: test needed
Components: ctypes, Library (Lib), Unicode Versions: Python 3.1, Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: theller Nosy List: amaury.forgeotdarc, loewis, theller, vstinner
Priority: normal Keywords: patch

Created on 2010-04-14 01:16 by vstinner, last changed 2010-04-20 20:20 by loewis. This issue is now closed.

Files
File name Uploaded Description Edit
ctypes_dlopen_surrogates-2.patch vstinner, 2010-04-15 23:11
Messages (10)
msg103108 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-14 01:16
PEP 383 introduces filenames that contain surrogates, but ctypes.dlopen() doesn't support them: ctypes.cdll.LoadLibrary('libc\uDCff.so.6') fails with:

   UnicodeEncodeError: 'utf-8' codec can't encode character '\udcff' 
   in position 4: surrogates not allowed

Attached patch fixes this issue.

TODO: Remove the assert(PyBytes_Check(name2)). I don't know whether PyUnicode_FSConverter() always returns a PyBytes object.
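
The encoding failure reported above can be reproduced without ctypes. The snippet below is a minimal sketch, assuming the file system encoding is UTF-8 (the encoding named in the error message). It shows that a strict encode rejects the lone surrogate, while the PEP 383 "surrogateescape" error handler turns it back into the original byte 0xFF:

```python
# Filename containing the lone surrogate U+DCFF, which PEP 383 uses to
# stand in for the undecodable byte 0xFF.
name = 'libc\udcff.so.6'

# A strict UTF-8 encode refuses the surrogate; keep the reason for inspection.
try:
    name.encode('utf-8')
    strict_error = None
except UnicodeEncodeError as exc:
    strict_error = exc.reason

# The surrogateescape handler maps U+DCFF back to the raw byte 0xFF.
raw = name.encode('utf-8', 'surrogateescape')

# Decoding with the same handler restores the original string.
round_trip = raw.decode('utf-8', 'surrogateescape')

print(strict_error)
print(raw)
print(round_trip == name)
```

On Python 3.2 and later, os.fsencode() applies the same handler with the actual file system encoding on POSIX systems.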
msg103115 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-04-14 09:08
PyUnicode_FSConverter returns bytes and bytearray objects unchanged; otherwise it always returns bytes.

Your patch should handle the case when name2 is a bytearray.
msg103272 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-15 23:10
amaury> Your patch should handle the case when name2 is a bytearray.

OK, fixed. I also tested None: Python segfaults :-) The new patch rejects a None value.
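
The argument handling discussed in these messages (accept str, bytes and bytearray, reject anything else including None) can be sketched in Python. This is only an illustration, not the C code from the patch: to_fs_bytes is a hypothetical helper name, and the default UTF-8 encoding is an assumption.

```python
def to_fs_bytes(name, encoding='utf-8'):
    """Hypothetical helper: normalise a library name to a bytes object."""
    if isinstance(name, bytes):
        # Already in the form the C-level dlopen() wrapper needs.
        return name
    if isinstance(name, bytearray):
        # Copy into an immutable bytes object.
        return bytes(name)
    if isinstance(name, str):
        # PEP 383: lone surrogates become the raw bytes they stand for.
        return name.encode(encoding, 'surrogateescape')
    # None and every other type are rejected instead of crashing later.
    raise TypeError('expected str, bytes or bytearray, not %s'
                    % type(name).__name__)
```

Raising TypeError for None mirrors the behaviour described for this revision of the patch; a caller that wants to allow None would have to handle it before calling the helper.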
msg103273 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-15 23:11
(oops, my patch included unrelated changes about trailing spaces)
msg103458 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-18 00:03
Fixed: r80159 (py3k), r80160 (3.1). I committed a different version of my patch to support None.
msg103564 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-04-19 09:05
It does not work on Windows:

>>> ctypes.CDLL(b'kernel32')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\afa\python\py3k-1\lib\ctypes\__init__.py", line 350, in __init__
    self._handle = _dlopen(self._name, mode)
TypeError: bad argument type for built-in operation
msg103566 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-19 09:13
load_library() uses LoadLibraryW(), which takes a WCHAR*. To support bytes, we can use LoadLibraryA() and TCHAR*.
msg103569 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-19 09:23
I only fixed the UNIX/BSD versions of subprocess/ctypes.dlopen() because, on those platforms, it was not possible to open some files with an undecodable filename. On Windows, the file system and Python 3 both use Unicode, so there is no such corner case.

On Windows, should we encourage people to migrate from byte strings to character strings? Or should we support byte strings, either for backward compatibility or because there really is a corner case where it's not possible to open a file with a Unicode string?

See also my msg103565 (Issue #8393, subprocess).
msg103570 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-04-19 09:25
yes, except that TCHAR* depends on compilation settings (it resolves to wchar_t when UNICODE is #defined); simply use char*.
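
The dispatch being discussed here (wide-character API for str, narrow-character API for byte strings) can be sketched in Python. This is a minimal illustration only: win32_loader_for is a hypothetical helper that just names the Win32 entry point matching the argument type, it does not call into Windows, and it does not describe what was eventually committed.

```python
def win32_loader_for(name):
    """Hypothetical helper: pick the Win32 function matching the name's type."""
    if isinstance(name, str):
        # LoadLibraryW takes a wide-character (WCHAR*) string.
        return 'LoadLibraryW'
    if isinstance(name, (bytes, bytearray)):
        # LoadLibraryA takes a plain char* string.
        return 'LoadLibraryA'
    raise TypeError('expected str, bytes or bytearray, not %s'
                    % type(name).__name__)


print(win32_loader_for('kernel32'))
print(win32_loader_for(b'kernel32'))
```

Choosing the function from the argument type at run time avoids the TCHAR* problem raised above, since TCHAR is fixed at compile time by the UNICODE setting.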
msg103746 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-04-20 20:20
Amaury, I'm closing this for the same reason I explained in msg103745.
History
Date User Action Args
2010-04-20 20:20:24 loewis set status: open -> closed; nosy: + loewis; messages: + msg103746; resolution: fixed
2010-04-19 09:25:28 amaury.forgeotdarc set messages: + msg103570
2010-04-19 09:23:33 vstinner set messages: + msg103569
2010-04-19 09:13:17 vstinner set messages: + msg103566
2010-04-19 09:05:28 amaury.forgeotdarc set status: closed -> open; resolution: fixed -> (no value); messages: + msg103564
2010-04-18 00:03:55 vstinner set status: open -> closed; resolution: fixed; messages: + msg103458
2010-04-15 23:11:49 vstinner set files: + ctypes_dlopen_surrogates-2.patch; messages: + msg103273
2010-04-15 23:11:21 vstinner set files: - ctypes_dlopen_surrogates-2.patch
2010-04-15 23:10:55 vstinner set files: - ctypes_dlopen_surrogates.patch
2010-04-15 23:10:44 vstinner set files: + ctypes_dlopen_surrogates-2.patch; messages: + msg103272
2010-04-14 09:08:22 amaury.forgeotdarc set nosy: + amaury.forgeotdarc; messages: + msg103115
2010-04-14 03:20:42 brian.curtin set nosy: + theller; assignee: theller; components: + ctypes; type: behavior; stage: test needed
2010-04-14 01:16:36 vstinner link issue8242 dependencies
2010-04-14 01:16:19 vstinner create