
classification
Title: Inconsistent capitalization of proper noun - Unicode.
Type: enhancement
Stage: resolved
Components: Documentation, Unicode
Versions: Python 3.7, Python 3.6
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: benjamin.peterson, docs@python, ezio.melotti, lemburg, martin.panter, mdk, miss-islington, r.david.murray, serhiy.storchaka, toonarmycaptain, vstinner
Priority: normal
Keywords: patch

Created on 2017-10-26 13:15 by toonarmycaptain, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 4125 merged toonarmycaptain, 2017-10-26 13:15
PR 13194 merged miss-islington, 2019-05-08 16:05
Messages (10)
msg305050 - (view) Author: David Antonini (toonarmycaptain) * Date: 2017-10-26 13:15
Make 'unicode'/'Unicode' capitalization consistent.
'Unicode' is a proper noun, and as such should be capitalized. 

I submitted a pull request correcting the inconsistent capitalization in the Unicode page of the Documentation - Doc/c-api/unicode.rst - capitalizing 12 errant instances to reflect the correct capitalization in most of the document. I was then requested to open an issue here for discussion.
msg305051 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-26 13:41
I agree that the inconsistency should be fixed. But I'm not sure that we should use the words "an Unicode object" in Python 3.

In many similar cases ("a bytes object", "a type object", "a module object") the name of the Python type is used. "unicode" was the name of a Python type in Python 2. In Python 3 it is "str". The term "Unicode string" is also widely used. Shouldn't we use "a str object", "a string object", "a Unicode string" or "a Unicode string object" in the C API documentation?
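The naming pattern described above can be checked from Python 3 itself. The following is only an illustrative sketch; the sample objects are arbitrary choices and are not taken from the documentation under discussion.

```python
import sys

# Arbitrary sample objects: a bytes literal, a text string,
# a type object, and a module object.
examples = [b"data", "text", int, sys]

# type(obj).__name__ is the Python-level type name that phrases such as
# "a bytes object" or "a module object" are built from.
names = [type(obj).__name__ for obj in examples]

for name in names:
    print(f"a {name} object")
```

In Python 3 the text string reports "str", not "unicode", which is the mismatch the comment points out.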
msg305053 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-10-26 13:50
It may be a proper noun, but it is conventionally spelled with a lowercase letter when referring to the type/object. It would be spelled with an uppercase letter when referring to the *standard*.
msg305062 - (view) Author: David Antonini (toonarmycaptain) * Date: 2017-10-26 14:59
Does the Unicode documentation currently conform to that convention, or does it require editing? 

It appears to me that a lot of cases where reference to "Unicode object" is currently capitalised (most of them, in fact) may need to be modified. 
However, there seems to be a grey area in distinguishing between a reference to the unicode type as implemented in Python and a reference to the standard as a descriptor of an object's format. The way I read it, a lot of the cases are in essence a reference to both.
msg305068 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-10-26 15:55
In this case I think the cost of editing for consistency may be higher than the value, especially since as you say there are ambiguous cases.
msg329264 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2018-11-04 21:46
Currently in our documentation there are 89 occurrences of "Unicode obj" vs 8 of "unicode obj", so I'll go for it.
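A count like this could be reproduced with a short script. The sketch below is an assumption about how one might do it, not the method actually used for the figures above; the helper names `count_spellings` and `count_in_tree` and the `doc_root` argument are hypothetical.

```python
import re
from collections import Counter
from pathlib import Path

# Matches "Unicode obj..." or "unicode obj...", allowing the two words
# to be separated by any whitespace (including a wrapped line).
PATTERN = re.compile(r"\b([Uu]nicode)\s+obj")


def count_spellings(text):
    """Count capitalized vs lowercase spellings in a single string."""
    return Counter(match.group(1) for match in PATTERN.finditer(text))


def count_in_tree(doc_root):
    """Sum the counts over every .rst file under doc_root (hypothetical path)."""
    total = Counter()
    for path in Path(doc_root).rglob("*.rst"):
        total += count_spellings(path.read_text(encoding="utf-8"))
    return total


# Small in-memory sample so the sketch runs without a CPython checkout.
sample = "Return a Unicode object. The unicode object is immutable. A Unicode object."
sample_counts = count_spellings(sample)
print(sample_counts)
```

Calling `count_in_tree("Doc")` from the root of a checkout would give totals for the whole documentation tree, assuming the sources live under `Doc/` as the path in the first message suggests.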
msg329267 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2018-11-04 22:02
If you want to do this correctly, you have to check each case:

* if "unicode object" refers to a C PyUnicode object, it's probably better to use "PyUnicode object"
* if "unicode object" refers to a C PyObject object, with type "unicode", it's probably better to leave it as is
* if "unicode object" refers to a Python unicode object, it's probably better to call it "Unicode string object" or just "string object" in Python 3
* if "unicode object" does not indicate whether Python or C is meant, "Unicode object" is probably better
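The checklist above calls for a manual decision per occurrence, so a blanket search-and-replace is not enough. A minimal sketch of a review helper follows; the case keys, `find_candidates`, and `apply_decision` are hypothetical names introduced only for illustration, and the suggested wordings paraphrase the four bullets.

```python
import re

# Suggested wording per case, paraphrasing the checklist above.
# The dictionary keys are hypothetical labels, not established terms.
SUGGESTIONS = {
    "c_pyunicode": "PyUnicode object",
    "c_pyobject_of_type_unicode": "unicode object",  # leave as is
    "python_level": "Unicode string object",  # or just "string object"
    "unspecified": "Unicode object",
}

LOWERCASE = re.compile(r"\bunicode object\b")


def find_candidates(text):
    """Yield (line_number, line) for each line a reviewer must classify."""
    for number, line in enumerate(text.splitlines(), start=1):
        if LOWERCASE.search(line):
            yield number, line.strip()


def apply_decision(line, case):
    """Rewrite one line after a reviewer has picked a case key."""
    return LOWERCASE.sub(SUGGESTIONS[case], line)


sample = "Create a unicode object.\nReturn a Unicode object.\n"
candidates = list(find_candidates(sample))
print(candidates)
print(apply_decision(candidates[0][1], "unspecified"))
```

The script only surfaces candidates and applies a choice; deciding which of the four cases a given sentence falls under still requires reading the surrounding text.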
msg341896 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2019-05-08 16:02
New changeset 85225b6a58a516c50c055d5114668ed2fcdcda8c by Julien Palard (toonarmycaptain) in branch 'master':
bpo-31873: Update unicode.rst - 'unicode' capitalization (GH-4125)
https://github.com/python/cpython/commit/85225b6a58a516c50c055d5114668ed2fcdcda8c
msg341899 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2019-05-08 16:07
I merged it: it's a small change that is consistent enough for me with the rest of the file and the doc. Marc-Andre is right though, and if anyone has the courage, the whole doc should be proofread and updated accordingly, but let's make that a separate PR.

Thanks David for your contribution and sorry for the delay.
msg341907 - (view) Author: miss-islington (miss-islington) Date: 2019-05-08 16:34
New changeset ed8860c5af87d78d312ae30dd2d6bedc60bd86e5 by Miss Islington (bot) in branch '3.7':
bpo-31873: Update unicode.rst - 'unicode' capitalization (GH-4125)
https://github.com/python/cpython/commit/ed8860c5af87d78d312ae30dd2d6bedc60bd86e5
History
Date User Action Args
2022-04-11 14:58:53  admin  set  github: 76054
2019-05-08 16:34:15  miss-islington  set  nosy: + miss-islington
    messages: + msg341907
2019-05-08 16:07:03  mdk  set  status: open -> closed
    resolution: fixed
    messages: + msg341899
    stage: patch review -> resolved
2019-05-08 16:05:07  miss-islington  set  keywords: + patch
    pull_requests: + pull_request13105
2019-05-08 16:02:41  mdk  set  messages: + msg341896
2018-11-04 22:02:49  lemburg  set  messages: + msg329267
2018-11-04 21:46:48  mdk  set  nosy: + mdk
    messages: + msg329264
2017-10-26 15:55:51  r.david.murray  set  messages: + msg305068
2017-10-26 14:59:10  toonarmycaptain  set  messages: + msg305062
2017-10-26 13:50:27  r.david.murray  set  messages: + msg305053
2017-10-26 13:41:52  serhiy.storchaka  set  versions: + Python 3.6, Python 3.7
    nosy: + vstinner, serhiy.storchaka, martin.panter, ezio.melotti, r.david.murray, lemburg, benjamin.peterson
    messages: + msg305051
    components: + Unicode
    stage: patch review
2017-10-26 13:15:32  toonarmycaptain  create