This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: document what codec PyMemberDef T_STRING decodes the char * as
Type: behavior
Stage: commit review
Components: Documentation
Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: Windson Yang, docs@python, gregory.p.smith, josh.r, lys.nikolaou, miss-islington
Priority: normal
Keywords: easy, patch, patch, patch

Created on 2015-10-19 06:42 by gregory.p.smith, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 10580 merged Windson Yang, 2018-11-17 03:34
PR 10586 merged miss-islington, 2018-11-17 19:17
PR 10587 merged miss-islington, 2018-11-17 19:17
Messages (8)
msg253172 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2015-10-19 06:42
https://docs.python.org/3/c-api/structures.html#c.PyMemberDef

T_STRING members are turned into str objects in Python.  The documentation needs updating to mention which codec the char * bytes are treated as.

Solving this issue involves code inspection and leaving pointers to that code here in the issue, then updating the docs to mention the requirements for the char * member data as well as what happens upon assignment for non-READONLY T_STRING data (a different restriction?  or encoding to the same codec?)

My _guess_ would be UTF-8 or ASCII but I'll let someone else dive in and find out.  This is a Python 3 specific documentation clarification.
msg253275 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2015-10-21 03:22
Checking the source ( https://hg.python.org/cpython/file/tip/Python/structmember.c#l51 ), it calls PyUnicode_FromString ( https://docs.python.org/3/c-api/unicode.html?highlight=pyunicode_fromstring#c.PyUnicode_FromString ), so the char * is always interpreted as UTF-8.
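
For anyone who wants to see the decoding behaviour without writing an extension module, here is a minimal sketch that calls PyUnicode_FromString directly through ctypes. It assumes CPython (ctypes.pythonapi is CPython-specific), and the helper name decode_like_t_string is made up for illustration. It only exercises the function that the T_STRING getter is reported to call above, not a real PyMemberDef.

```python
import ctypes

# PyUnicode_FromString is the C-API function named in the message above.
# Binding it via ctypes.pythonapi is just a convenience for this demo.
from_string = ctypes.pythonapi.PyUnicode_FromString
from_string.argtypes = [ctypes.c_char_p]   # NUL-terminated char *
from_string.restype = ctypes.py_object     # returns a new str object


def decode_like_t_string(raw: bytes) -> str:
    """Hypothetical helper: decode a char * buffer the way the T_STRING getter is said to."""
    return from_string(raw)


# Valid UTF-8 bytes round-trip to the expected str.
utf8_text = decode_like_t_string("caf\u00e9".encode("utf-8"))

# The same text encoded as Latin-1 is not valid UTF-8, so decoding fails.
try:
    decode_like_t_string("caf\u00e9".encode("latin-1"))
    latin1_rejected = False
except UnicodeDecodeError:
    latin1_rejected = True
```

If this reading is right, extension authors should make sure any char * exposed through T_STRING holds valid UTF-8, since other encodings can raise UnicodeDecodeError on attribute access.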
msg329773 - (view) Author: Lysandros Nikolaou (lys.nikolaou) * (Python committer) Date: 2018-11-12 23:34
It's been more than 3 years since this was opened, but I will ask nevertheless: should a PR be made for this issue?
msg329774 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2018-11-12 23:44
It still seems relevant; having better docs is always good.
msg329794 - (view) Author: Windson Yang (Windson Yang) * Date: 2018-11-13 01:56
I will work on it today :D
msg330032 - (view) Author: miss-islington (miss-islington) Date: 2018-11-17 19:16
New changeset 689d555ec135d4115574addd063c358ac4897cc4 by Miss Islington (bot) (Windson yang) in branch 'master':
bpo-25438: document what codec PyMemberDef T_STRING decodes the char * as (GH-10580)
https://github.com/python/cpython/commit/689d555ec135d4115574addd063c358ac4897cc4
msg330033 - (view) Author: miss-islington (miss-islington) Date: 2018-11-17 19:50
New changeset d1a97b36595726074a83452e5c476806936becba by Miss Islington (bot) in branch '3.7':
[3.7] bpo-25438: document what codec PyMemberDef T_STRING decodes the char * as (GH-10580) (GH-10586)
https://github.com/python/cpython/commit/d1a97b36595726074a83452e5c476806936becba
msg330034 - (view) Author: miss-islington (miss-islington) Date: 2018-11-17 19:50
New changeset 8945017be4cc9527767bb66673e73e28e4b0b4d3 by Miss Islington (bot) in branch '3.6':
[3.6] bpo-25438: document what codec PyMemberDef T_STRING decodes the char * as (GH-10580) (GH-10587)
https://github.com/python/cpython/commit/8945017be4cc9527767bb66673e73e28e4b0b4d3
History
Date | User | Action | Args
2022-04-11 14:58:22 | admin | set | github: 69624
2018-11-17 19:51:01 | gregory.p.smith | set | keywords: patch, patch, patch, easy
    status: open -> closed
    stage: patch review -> commit review
    resolution: fixed
    versions: + Python 3.7, Python 3.8, - Python 3.4, Python 3.5
2018-11-17 19:50:28 | miss-islington | set | messages: + msg330034
2018-11-17 19:50:01 | miss-islington | set | messages: + msg330033
2018-11-17 19:17:19 | miss-islington | set | pull_requests: + pull_request9831
2018-11-17 19:17:09 | miss-islington | set | pull_requests: + pull_request9830
2018-11-17 19:16:53 | miss-islington | set | nosy: + miss-islington
    messages: + msg330032
2018-11-17 03:34:53 | Windson Yang | set | keywords: + patch
    stage: needs patch -> patch review
    pull_requests: + pull_request9825
2018-11-17 03:34:48 | Windson Yang | set | keywords: + patch
    stage: needs patch -> needs patch
    pull_requests: + pull_request9826
2018-11-17 03:34:44 | Windson Yang | set | keywords: + patch
    stage: needs patch -> needs patch
    pull_requests: + pull_request9824
2018-11-13 01:56:09 | Windson Yang | set | nosy: + Windson Yang
    messages: + msg329794
2018-11-12 23:44:13 | gregory.p.smith | set | messages: + msg329774
2018-11-12 23:34:30 | lys.nikolaou | set | nosy: + lys.nikolaou
    messages: + msg329773
2015-10-21 03:22:07 | josh.r | set | nosy: + josh.r
    messages: + msg253275
2015-10-19 06:42:04 | gregory.p.smith | create |