Title: Documentation for struct module is out of date in 3.0
Type: Stage:
Components: Documentation Versions: Python 3.0
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: georg.brandl Nosy List: benjamin.peterson, georg.brandl, mgiuca
Priority: normal Keywords: patch

Created on 2008-07-31 14:26 by mgiuca, last changed 2008-07-31 15:09 by mgiuca. This issue is now closed.

File name         Uploaded                  Description
struct-doc.patch  mgiuca, 2008-07-31 14:26  Patch with updated documentation
Messages (3)
msg70506 - Author: Matt Giuca (mgiuca) Date: 2008-07-31 14:26
The documentation for the "struct" module still uses the term "string"
even though the struct module itself deals entirely in bytes objects in
Python 3.0.

I propose updating the documentation to reflect the 3.0 terminology.

I've attached a patch for the Doc/library/struct.rst file. It mostly
renames "string" to "bytes". It also notes that pack for 'c', 's' and
'p' accepts either str or bytes, but unpack always returns bytes.
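
As a quick illustration of the bytes-in, bytes-out behaviour the patch documents, here is a minimal sketch using the 'c', 's' and 'p' format codes. The sample values are made up for illustration. The arguments are passed as bytes, which should work whether or not a given release also accepts str.

```python
import struct

# Pack one value with each of the three codes discussed above.
packed_c = struct.pack("c", b"x")        # single byte
packed_s = struct.pack("5s", b"hello")   # fixed-length byte field
packed_p = struct.pack("6p", b"hello")   # Pascal string: length byte + data

# pack produces a bytes object...
print(type(packed_s))

# ...and unpack hands bytes back for these codes.
(value_c,) = struct.unpack("c", packed_c)
(value_s,) = struct.unpack("5s", packed_s)
(value_p,) = struct.unpack("6p", packed_p)
print(value_c, value_s, value_p)
```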

One important point: If you pass a str to 'c', 's' or 'p', it will get
encoded with UTF-8 before being packed. I've described this behaviour in
the documentation. I'm not sure if this should be described as the
"official" behaviour, or just informatively.

I've traced this behaviour to Modules/_struct.c lines 607, 1650 and 1676
(for 'c', 's' and 'p' respectively), which call
_PyUnicode_AsDefaultEncodedString. This is found in
Objects/unicodeobject.c:1410, which directly calls PyUnicode_EncodeUTF8.

Hence the UTF-8 encoding is not system- or locale-specific - it will
always happen. However, perhaps we should loosen the documentation to
say "which are encoded using a default encoding scheme".
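
To make the encoding step concrete, the sketch below does by hand what 3.0 is described above as doing implicitly: encode the str as UTF-8, then pack the resulting bytes. The sample text is an assumption chosen for illustration. The explicit encode() call is used because later releases (3.2 onwards) no longer accept str for these codes, so this form should run on either.

```python
import struct

text = "h\u00e9llo"              # 5 characters, one of them non-ASCII
encoded = text.encode("utf-8")   # 6 bytes once encoded

# The count in the format string is a number of bytes, not characters.
fmt = "%ds" % len(encoded)
packed = struct.pack(fmt, encoded)

# Unpacking returns bytes; decoding back to str is the caller's job.
(unpacked,) = struct.unpack(fmt, packed)
decoded = unpacked.decode("utf-8")
print(len(text), len(encoded), decoded == text)
```

Because the field width is measured in bytes, a format sized from len(text) rather than len(encoded) would truncate the data whenever the text contains non-ASCII characters.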

It would be good if the authors of the struct module read over these
changes first, to make sure I am describing it correctly.

I have also updated Modules/_struct.c's doc strings and exception
messages to reflect this new terminology. (I've changed nothing besides
the contents of these strings, and the test case still passes, just to
be safe.)

Patch is for /python/branches/py3k/, revision 65324.

Commit Log:

Doc/library/struct.rst: Updated documentation to Python 3.0 terminology
(bytes instead of strings). Added note that packing 'c', 's' or 'p'
accepts either str or bytes.

Modules/_struct.c: Updated doc strings and exception messages to the same.
msg70514 - Author: Benjamin Peterson (benjamin.peterson) (Python committer) Date: 2008-07-31 15:03
Thanks for the patch! Done in r65327.
msg70516 - Author: Matt Giuca (mgiuca) Date: 2008-07-31 15:09
Thanks for the props!
Date                 User               Action  Args
2008-07-31 15:09:41  mgiuca             set     messages: + msg70516
2008-07-31 15:03:58  benjamin.peterson  set     status: open -> closed
                                                nosy: + benjamin.peterson
                                                resolution: fixed
                                                messages: + msg70514
2008-07-31 14:26:31  mgiuca             create