This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: ctypes.create_string_buffer does not add NUL if len(init) == size
Type: behavior
Stage: patch review
Components: ctypes, Library (Lib)
Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: amaury.forgeotdarc, belopolsky, docs@python, eryksun, ezio.melotti, krista, meador.inge, terry.reedy, tom.pohl, willingc
Priority: normal
Keywords: easy, patch

Created on 2015-08-07 10:18 by tom.pohl, last changed 2022-04-11 14:58 by admin.

Files
File name Uploaded Description
create_string_buffer.patch krista, 2016-01-09 21:28
Messages (8)
msg248183 - Author: Tom Pohl (tom.pohl) Date: 2015-08-07 10:18
From the ctypes.create_string_buffer docs:
"""If a bytes object is specified as first argument, the buffer is made one item larger than its length so that the last element in the array is a NUL termination character. An integer can be passed as second argument which allows to specify the size of the array if the length of the bytes should not be used."""

Based on this documentation I would expect a NUL-terminated byte array in any case. However, when I do this

>>> for size in range(5, 2, -1): print(size, ctypes.create_string_buffer(b'123', size).raw)
5 b'123\x00\x00'
4 b'123\x00'
3 b'123'

I get b'123' for size=3 without a NUL. My expectation would be the same exception as I get for create_string_buffer(b'123', 2).
msg248211 - Author: Eryk Sun (eryksun) (Python triager) Date: 2015-08-07 17:54
Not every buffer is null-terminated. That's just the assumption used if the size isn't specified. The documentation can possibly be reworded to make this clearer, but the function itself shouldn't be changed.
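
The distinction can be checked directly. The following is a minimal sketch, based on the behavior reported in this issue, that compares the buffer created without a size against one created with size == len(init). The helper name `contains_nul` is hypothetical and not part of ctypes.

```python
import ctypes

# No explicit size: the buffer gets len(init) + 1 bytes, so a NUL follows the data.
implicit = ctypes.create_string_buffer(b"123")
print(ctypes.sizeof(implicit), implicit.raw)   # 4 b'123\x00'

# Explicit size equal to len(init): exactly 3 bytes, no room left for a NUL.
exact = ctypes.create_string_buffer(b"123", 3)
print(ctypes.sizeof(exact), exact.raw)         # 3 b'123'


def contains_nul(buf):
    """Hypothetical helper: True if a NUL occurs inside the buffer's own bytes."""
    return b"\x00" in buf.raw


print(contains_nul(implicit), contains_nul(exact))  # True False
```

The check only inspects `.raw`, which is bounded by the array size, so it never reads past the end of the buffer.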
msg248219 - Author: Tom Pohl (tom.pohl) Date: 2015-08-07 19:41
I agree: not every buffer is null-terminated.

But the function name suggests that it creates a _string_ buffer, which will most likely be used as an input to a C function. There, a buffer without null termination can easily trigger a buffer overflow, which can be considered a severe security risk.
msg248222 - Author: Tom Pohl (tom.pohl) Date: 2015-08-07 19:54
If one needs to set a general buffer (i.e. not a null-terminated string buffer) one could always use:

>>> string = (ctypes.c_char*4)()
>>> string.raw = b'abcd'
msg257862 - Author: Krista Paasonen (krista) Date: 2016-01-09 21:28
The patch adds a buffer-size check so that a NUL is always the last byte, as the C standard specifies for strings. It raises ValueError if the initial value does not fit into the buffer together with the terminating NUL character.

This should decrease the possibility of creating security issues.
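
The kind of check described here could look roughly like the sketch below. It is not the attached patch. It is a hypothetical wrapper, under the assumed name `create_terminated_string_buffer`, that layers the size check on top of the existing function.

```python
import ctypes


def create_terminated_string_buffer(init, size=None):
    """Hypothetical strict variant: refuse sizes that leave no room for a NUL.

    Illustration only; this is neither the attached patch nor a ctypes API.
    """
    if isinstance(init, bytes) and size is not None and size <= len(init):
        raise ValueError(
            f"size {size} leaves no room for a NUL after {len(init)} bytes"
        )
    # Everything else is delegated to the existing function unchanged.
    return ctypes.create_string_buffer(init, size)


print(create_terminated_string_buffer(b"123", 4).raw)  # b'123\x00'
```

With this wrapper, `create_terminated_string_buffer(b"123", 3)` raises ValueError instead of returning an unterminated array.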
msg257876 - Author: Eryk Sun (eryksun) (Python triager) Date: 2016-01-09 23:55
I didn't want to change the function for fear of breaking someone's code. If this change is accepted, then it at least needs a documentation note to indicate the new behavior.
msg258561 - Author: Terry J. Reedy (terry.reedy) (Python committer) Date: 2016-01-18 23:48
(Tracker notes:

I added as nosy the people listed as active 'experts' for ctypes on https://docs.python.org/devguide/experts.html#experts.  This was easily done by going to the end of the nosy list, typing a comma ',', typing 'ctypes', and then clicking the box that appeared.  This can be done for any module and the other topics listed on the page.

The Documentation component is for issues that only change the docs, and not the code.  That is why Documentation issues are auto-assigned to docs@python.  Adding 'Documentation' amounts to rejecting this patch or anything else that changes the code.

asyncio, ctypes, IDLE (idlelib), IO, and (T)tkinter are all parts of the stdlib and AFAIK, issues marked for them do not have to also be marked 'Library'.)
---

I looked at ctypes.py with hg annotate.  create_string_buffer is part of Thomas Heller's original 2006-03-08 patch that moved ctypes from an external source into the stdlib.  The only changes are in the isinstance class checks and the raise statement; the conditional bodies, including the one in question, are unchanged.

Tom, we disagree on our reading of the current docs.  The default number of NUL bytes added is 1.  Is the second argument required to be large enough to keep the number positive?  You think yes, I think no, though I agree with Eryk that the second quoted sentence could and should be clearer.  I will not assume that T. Heller meant 'yes' when he wrote 'no' in the code.  What do the listed experts think?

If the doc matches the code, there is no implementation bug and this is not a behavior issue. It is still possible to request a design change as an enhancement.  I think this would require agreement of at least two core developers.  A deprecation notice would normally be needed.  A third possibility is to decide that this is a security issue severe enough to possibly break code in 3.6 and possibly sooner.  I think this would require pydev discussion.

One problem with changing ctypes is that it is not used in the stdlib, so we have no local examples to draw on.  In this case, the question would be how often 'size' is used to suppress the default NUL byte and how legitimate such uses are.
msg389050 - Author: Eryk Sun (eryksun) (Python triager) Date: 2021-03-19 01:56
> The Documentation component is for issues that only change the docs

That's not clear in the triaging guide for the multi-select component field. (However, it is clearly stated as such for the GitHub PR label "type-documentation".) If that's really the case, then I'll manually nosy the docs team in cases such as this. When wording is disputed and needs to be clarified, I want an expert at documentation to propose or review the change.

> Adding 'Documentation' amounts to rejecting this patch or anything 
> else that changes the code.

That was not my intent. I accepted Tom's position that "string" means a C string, which must be null-terminated. So I added the "Lib" tag and left it as a "behavior" issue instead of changing it to "enhancement". 

I'm concerned that the old c_buffer() function is defined to call create_string_buffer(), and it's not officially deprecated in the docs or the source code. (There's a commented-out deprecation warning.) A related concern is that the documentation says that the length of a byte-string initializer should not be used if the size is specified, which allows creating a character-array that's not null-terminated. If it raises a ValueError in this case, the wording should be clear that the value of `size` must be large enough to set the initial value as a null-terminated string. I also would want c_buffer() to get a separate implementation in this case. 

If accepted, create_unicode_buffer(init, size) should also be changed to require that init is set as a null-terminated string.
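
For comparison, the wide-character function appears to behave the same way. The sketch below assumes the unpatched behavior discussed in this issue, where a size equal to len(init) is accepted and yields an array with no terminating null.

```python
import ctypes

# Default size: one extra element holds the terminating null character.
wide_default = ctypes.create_unicode_buffer("abc")
print(len(wide_default), repr(wide_default[:]))

# size == len(init): accepted, but the array has no terminating null.
wide_exact = ctypes.create_unicode_buffer("abc", 3)
print(len(wide_exact), repr(wide_exact[:]))
```

Slicing with `[:]` returns every element of the array, so a trailing null shows up in the first result and is absent from the second.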
History
Date User Action Args
2022-04-11 14:58:19 admin set github: 69011
2021-03-19 01:57:45 eryksun set components: + ctypes
2021-03-19 01:56:09 eryksun set priority: low -> normal
    versions: + Python 3.8, Python 3.9, Python 3.10, - Python 3.6
    messages: + msg389050
    components: + Library (Lib)
    stage: needs patch -> patch review
2016-02-29 15:35:00 berker.peksag set components: - Devguide
2016-01-18 23:48:29 terry.reedy set nosy: + terry.reedy, belopolsky, amaury.forgeotdarc, meador.inge, willingc, ezio.melotti
    messages: + msg258561
    components: + Devguide, - Documentation, Library (Lib), ctypes
2016-01-09 23:56:47 eryksun set components: + Documentation, Library (Lib)
2016-01-09 23:55:18 eryksun set messages: + msg257876
    versions: + Python 3.6, - Python 3.4
2016-01-09 21:28:30 krista set files: + create_string_buffer.patch
    nosy: + krista
    messages: + msg257862
    keywords: + patch
2015-08-07 19:54:02 tom.pohl set messages: + msg248222
2015-08-07 19:41:54 tom.pohl set messages: + msg248219
    components: - Documentation
    versions: - Python 2.7, Python 3.5, Python 3.6
2015-08-07 17:54:04 eryksun set priority: normal -> low
    assignee: docs@python
    components: + Documentation
    versions: + Python 2.7, Python 3.5, Python 3.6
    keywords: + easy
    nosy: + docs@python, eryksun
    messages: + msg248211
    stage: needs patch
2015-08-07 10:18:08 tom.pohl create