classification
Title: ctypes string pointer fields should accept embedded null characters
Type: behavior Stage: patch review
Components: ctypes Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: ZackerySpytz, amaury.forgeotdarc, belopolsky, eryksun, meador.inge, ned.deily, serhiy.storchaka, theller
Priority: critical Keywords: 3.6regression, patch

Created on 2018-02-01 19:54 by theller, last changed 2018-08-10 05:52 by ZackerySpytz.

Files
File name Uploaded Description
nullchars.py theller, 2018-02-01 19:54
Pull Requests
URL Status Linked
PR 8721 open ZackerySpytz, 2018-08-10 05:50
Messages (4)
msg311462 - Author: Thomas Heller (theller) * (Python committer) Date: 2018-02-01 19:54
ctypes Structure fields of type c_char_p or c_wchar_p used to accept strings with embedded null characters. I noticed that Python 3.6.4 refuses them, so this seems to have changed in a recent version.

There ARE use cases for this. The Windows API OPENFILENAME structure is one example. The Microsoft docs for the lpstrFilter field say:

"""
lpstrFilter

    Type: LPCTSTR

    A buffer containing pairs of null-terminated filter strings. The last string in the buffer must be terminated by two NULL characters.
"""
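
For illustration, the buffer layout the quoted docs describe can be built with ctypes along these lines. This is only a sketch: the filter pairs are made-up examples, and it uses create_unicode_buffer rather than a c_wchar_p field, so it does not exercise the assignment that fails.

```python
import ctypes

# Hypothetical filter pairs (description, pattern); not taken from the issue.
pairs = [("Text Files", "*.txt"), ("All Files", "*.*")]

# Each string is followed by one NUL, as the lpstrFilter docs require.
filter_text = "".join(f"{desc}\0{pattern}\0" for desc, pattern in pairs)

# create_unicode_buffer allocates one extra element that stays NUL,
# so the buffer ends with the two NUL characters the docs ask for.
buf = ctypes.create_unicode_buffer(filter_text)

whole = buf[:]       # slicing returns every element, embedded NULs included
visible = buf.value  # .value stops at the first NUL

print(repr(whole))
print(repr(visible))
```

Note that .value only shows the first string, so code that needs the whole buffer has to slice it or track the length itself.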

I have attached a simple script which demonstrates this new behaviour; the output with Python 3.6.4 is this:

Traceback (most recent call last):
  File "nullchars.py", line 8, in <module>
    t.unicode = u"foo\0bar"
ValueError: embedded null character
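
The attached nullchars.py is not reproduced on this page. A rough equivalent, assuming a structure with one c_char_p and one c_wchar_p field (the class, field, and helper names here are guesses modelled on the traceback), might look like this:

```python
import ctypes

class T(ctypes.Structure):
    # Field names are assumptions; only "unicode" appears in the traceback.
    _fields_ = [("char", ctypes.c_char_p),
                ("unicode", ctypes.c_wchar_p)]

def try_assign(field, value):
    """Assign value to the named field and report whether ctypes accepted it."""
    t = T()
    try:
        setattr(t, field, value)
    except ValueError as exc:
        return f"rejected: {exc}"
    return "accepted"

results = {
    "char": try_assign("char", b"foo\0bar"),
    "unicode": try_assign("unicode", "foo\0bar"),
}
print(results)
```

The result depends on the interpreter version: on affected versions the c_wchar_p assignment should be reported as rejected with the ValueError shown above, and on unaffected versions it should be accepted.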
msg311468 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2018-02-01 20:51
PyUnicode_AsWideCharString was updated to raise ValueError for embedded nulls if the `size` output parameter is NULL. Z_set in cfield.c should be updated to get the size, which can be ignored here. For example:

    Py_ssize_t size;  /* passing a non-NULL size avoids the ValueError; the value is unused */
    buffer = PyUnicode_AsWideCharString(value, &size);
msg314567 - Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-03-28 07:14
The change mentioned was made in GH-2462 for Issue13617 and was released in 3.6.3 (and in 3.5.4, which is now in security-fix-only mode).
msg314579 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-03-28 09:07
This is a regression. Eryk's solution LGTM. Would you mind creating a PR?

But u"foo\0bar" is not terminated by two NULL characters. If this is used in real code, that code contains a bug. Also, the getter of this field will return the string only up to the first null character. More work is needed to make this more reliable.
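
The getter behaviour can be seen with a c_char_p field, which is used here only because it is the simpler case to demonstrate. This is a sketch, not part of any proposed patch; the structure and variable names are made up, and the backing buffer is kept alive by the caller.

```python
import ctypes

class T(ctypes.Structure):
    # Hypothetical structure with a single string pointer field.
    _fields_ = [("char", ctypes.c_char_p)]

# create_string_buffer adds one trailing NUL, so the data ends with two.
backing = ctypes.create_string_buffer(b"foo\0bar\0")

t = T()
t.char = ctypes.addressof(backing)  # c_char_p fields also accept an integer address

# The getter stops at the first NUL.
truncated = t.char

# Reading everything needs the length from elsewhere, here the backing buffer.
address = ctypes.cast(ctypes.pointer(t), ctypes.POINTER(ctypes.c_void_p))[0]
full = ctypes.string_at(address, len(backing))

print(truncated)
print(full)
```

So even when the setter accepts embedded nulls, the field alone does not carry enough information to read the data back; the length has to be tracked separately.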
History
Date User Action Args
2018-08-10 05:52:17 ZackerySpytz set nosy: + ZackerySpytz
2018-08-10 05:50:54 ZackerySpytz set keywords: + patch; stage: test needed -> patch review; pull_requests: + pull_request8206
2018-03-28 09:07:54 serhiy.storchaka set messages: + msg314579
2018-03-28 07:14:34 ned.deily set priority: normal -> critical; nosy: + belopolsky, amaury.forgeotdarc, meador.inge, serhiy.storchaka, ned.deily; messages: + msg314567
2018-02-01 20:51:07 eryksun set versions: + Python 3.7, Python 3.8; nosy: + eryksun; messages: + msg311468; type: behavior; stage: test needed
2018-02-01 19:54:50 theller create