classification
Title: Builtins documentation refers to old version of UCD.
Type: enhancement Stage: resolved
Components: Documentation, Unicode Versions: Python 3.3, Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: belopolsky, docs@python, eric.araujo, ezio.melotti, georg.brandl, loewis, python-dev, r.david.murray
Priority: normal Keywords: needs review, patch

Created on 2013-06-10 00:15 by belopolsky, last changed 2014-10-10 01:13 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description Edit
686836ad3042.diff belopolsky, 2013-06-10 18:46 review
bd092995907c.diff belopolsky, 2013-06-16 17:31 review
Repositories containing patches
https://bitbucket.org/alexander_belopolsky/cpython#issue-18176
Messages (10)
msg190878 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2013-06-10 00:15
Reference to http://www.unicode.org/Public/6.0.0/ucd/extracted/DerivedNumericType.txt in http://docs.python.org/3.4/library/stdtypes.html#numeric-types-int-float-complex should be changed to http://www.unicode.org/Public/6.1.0/ucd/extracted/DerivedNumericType.txt for 3.3 and to http://www.unicode.org/Public/6.2.0/ucd/extracted/DerivedNumericType.txt for 3.4.

Note that the change from 6.1 to 6.2 is immaterial because it did not involve the Nd category, but a change from 6.0 to 6.1 introduced several new ranges:

+110F0..110F9  ; Decimal # Nd  [10] SORA SOMPENG DIGIT ZERO..SORA SOMPENG DIGIT NINE
+11136..1113F  ; Decimal # Nd  [10] CHAKMA DIGIT ZERO..CHAKMA DIGIT NINE
+111D0..111D9  ; Decimal # Nd  [10] SHARADA DIGIT ZERO..SHARADA DIGIT NINE
+116C0..116C9  ; Decimal # Nd  [10] TAKRI DIGIT ZERO..TAKRI DIGIT NINE
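
A minimal sketch, not part of the original report, of how the effect of these additions can be observed from Python. It assumes an interpreter whose unicodedata module is built against UCD 6.1 or later; the dictionary and helper names are illustrative only.

```python
import unicodedata

# First code point of each decimal-digit range listed above (added in Unicode 6.1).
NEW_DIGIT_ZEROS = {
    "SORA SOMPENG": 0x110F0,
    "CHAKMA": 0x11136,
    "SHARADA": 0x111D0,
    "TAKRI": 0x116C0,
}


def digits_recognized(zero_cp):
    """Return True if the ten code points starting at zero_cp are Nd digits 0..9."""
    for value in range(10):
        ch = chr(zero_cp + value)
        # The category check comes first so int() is only called on real digits.
        if unicodedata.category(ch) != "Nd" or int(ch) != value:
            return False
    return True


for script, zero_cp in NEW_DIGIT_ZEROS.items():
    print(script, digits_recognized(zero_cp), unicodedata.unidata_version)
```

On an interpreter built against UCD 6.0 the helper would be expected to return False for each range, which is why the documented link should match the bundled UCD version.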
msg190928 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2013-06-10 18:51
This is a trivial change, but I would like someone to review this in case there is a better solution to keep this in sync with unicodedata.unidata_version.
msg191236 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-06-15 21:19
If all the versions are up to date it shouldn't be difficult to grep for '6.2.0' at the next update, and I would expect people to do it when it happens.  Maybe adding a comment where unidata_version is defined as a reminder to update the rest will suffice.

That said, there might be some rst trick to define constants and reuse them, but I don't know it off the top of my head, and if there is one, I'm not sure it can be used in links.
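
One possible approach, offered as a sketch and not as what the Python docs do: Sphinx ships the sphinx.ext.extlinks extension, which defines a role from a URL template so that the version appears in one place. The UCD_VERSION constant and the "ucd" role name below are assumptions made for illustration, and the caption form shown ("%s") is the one recent Sphinx versions expect.

```python
# Hypothetical excerpt from a Sphinx conf.py.
UCD_VERSION = "6.2.0"  # assumed to be updated by hand together with UNIDATA_VERSION

extensions = ["sphinx.ext.extlinks"]

# Role name -> (URL template, caption template); "%s" receives the role's argument.
extlinks = {
    "ucd": (f"http://www.unicode.org/Public/{UCD_VERSION}/ucd/%s", "%s"),
}
```

With such a configuration, a document could write :ucd:`NameAliases.txt` instead of spelling out the versioned URL.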
msg191275 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2013-06-16 17:38
Here is what grep revealed:

$ find Doc -name \*.rst | xargs grep -n '6\.2\.0'
Doc/library/stdtypes.rst:357:   See http://www.unicode.org/Public/6.2.0/ucd/extracted/DerivedNumericType.txt
Doc/library/unicodedata.rst:18:this database is compiled from the `UCD version 6.2.0
Doc/library/unicodedata.rst:19:<http://www.unicode.org/Public/6.2.0/ucd>`_.
Doc/library/unicodedata.rst:169:.. [#] http://www.unicode.org/Public/6.2.0/ucd/NameAliases.txt
Doc/library/unicodedata.rst:171:.. [#] http://www.unicode.org/Public/6.2.0/ucd/NamedSequences.txt

I added a note next to UNIDATA_VERSION = "6.2.0" in the makeunicodedata.py script. makeunicodedata.py would be the place to put code that updates the docs automatically, but with only two affected files I don't think this is worth the effort.  Chances are at least unicodedata.rst will benefit from a manual review to reflect any substantive changes in the UCD.
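
For illustration only, the grep above could be turned into a small check that reports versioned UCD links that disagree with the expected version. This is a sketch under assumptions: the function names are hypothetical, nothing like it is claimed to exist in makeunicodedata.py, and the regular expression only covers URLs of the form shown in the grep output.

```python
import re
from pathlib import Path

# Matches the version component of URLs such as
# http://www.unicode.org/Public/6.2.0/ucd/NameAliases.txt
UCD_URL_VERSION = re.compile(r"unicode\.org/Public/(\d+\.\d+\.\d+)/")


def stale_ucd_references(text, expected_version):
    """Yield (line_number, version) for each UCD URL whose version differs."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in UCD_URL_VERSION.finditer(line):
            if match.group(1) != expected_version:
                yield lineno, match.group(1)


def check_docs(doc_root, expected_version):
    """Scan every .rst file under doc_root and collect stale references."""
    report = []
    for path in sorted(Path(doc_root).rglob("*.rst")):
        text = path.read_text(encoding="utf-8")
        for lineno, version in stale_ucd_references(text, expected_version):
            report.append((str(path), lineno, version))
    return report
```

Calling check_docs("Doc", "6.2.0") from a checkout would list the files and lines that still point at another version; unversioned links such as the UNIDATA one are deliberately not matched.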
msg191295 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2013-06-16 23:04
I have found another place where an explicit UCD version is used in the docs:

Doc/reference/lexical_analysis.rst:729:.. [#] http://www.unicode.org/Public/6.1.0/ucd/NameAliases.txt

I am not sure how this case should be handled.  The language reference was deliberately written so that it avoids mentioning a specific version of Unicode.  For example, PropList.txt in the "Identifiers and keywords" section is linked to the location of the latest published PropList.txt: <http://unicode.org/Public/UNIDATA/PropList.txt>.  This means that as of today, all versions of the documentation (3.0 through 3.4) refer to the Unicode 6.2.0 version of this file.  This may be misleading for users of older Python versions.

In the same section, there is a reference to a "non-normative HTML file listing all valid identifier characters for Unicode 4.1."

I would suggest that instead of linking to an external resource we generate a similar table (possibly in ReST format) in makeunicodedata.py and include it with documentation.
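
A rough sketch of what generating such a table might look like. It is an assumption-laden illustration: it uses str.isidentifier() of the running interpreter as a stand-in for the UCD-derived properties, covers only characters that can start an identifier, and its function names are hypothetical and not taken from makeunicodedata.py.

```python
def identifier_start_ranges(start=0, stop=0x110000):
    """Return inclusive (first, last) code point ranges that may start an identifier."""
    ranges = []
    run_start = None
    for cp in range(start, stop):
        if chr(cp).isidentifier():
            if run_start is None:
                run_start = cp  # a new run of valid characters begins here
        elif run_start is not None:
            ranges.append((run_start, cp - 1))
            run_start = None
    if run_start is not None:
        # The scan ended in the middle of a run.
        ranges.append((run_start, stop - 1))
    return ranges


def format_ranges(ranges):
    """Format ranges in the XXXX..YYYY style used by the UCD data files."""
    return [f"{lo:04X}" if lo == hi else f"{lo:04X}..{hi:04X}" for lo, hi in ranges]


# Restricting the scan to ASCII keeps the example quick to run.
for line in format_ranges(identifier_start_ranges(0, 0x80)):
    print(line)
```

Because the result depends on the interpreter's bundled UCD, a table produced this way would describe the Python version that generated it, which is the property the suggestion above is after.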
msg228918 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-10-09 21:33
New changeset fd7994909c2d by R David Murray in branch '3.4':
#18176: updated stdtypes UCD link, added reminder to makeunicodedata.
https://hg.python.org/cpython/rev/fd7994909c2d

New changeset 2551bdfff335 by R David Murray in branch 'default':
Merge: #18176: updated stdtypes UCD link, added reminder to makeunicodedata.
https://hg.python.org/cpython/rev/2551bdfff335
msg228919 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-10-09 21:44
New changeset 303861ce9ead by R David Murray in branch '3.4':
#18176: fix another reference and add it to the makeunicodedata comment.
https://hg.python.org/cpython/rev/303861ce9ead

New changeset e9ec8d622a30 by R David Murray in branch 'default':
Merge: #18176: fix another reference and add it to the makeunicodedata comment.
https://hg.python.org/cpython/rev/e9ec8d622a30
msg228921 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-10-09 21:46
I committed the stdtypes, lexical_analysis, and Tools changes, updating the version number appropriately in the doc fixes.

I'm closing this, and will open a new issue for the PropList.txt problem.
msg228934 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-10-10 00:48
New changeset 73a6f121e51a by R David Murray in branch '3.4':
#18176: Change generic UCD PropList link to version specific link.
https://hg.python.org/cpython/rev/73a6f121e51a

New changeset b04b7af14910 by R David Murray in branch 'default':
Merge: #18176: Change generic UCD PropList link to version specific link.
https://hg.python.org/cpython/rev/b04b7af14910
msg228936 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-10-10 01:13
I changed my mind and decided to "fix" the PropList reference to also be version specific, since that solves the immediate problem.  I opened a new issue for automating the changes, since there are three locations and four URLs, now.
History
Date User Action Args
2014-10-10 01:13:37 r.david.murray set messages: + msg228936
2014-10-10 00:48:00 python-dev set messages: + msg228934
2014-10-09 21:46:45 r.david.murray set status: open -> closed; nosy: + r.david.murray; messages: + msg228921; resolution: fixed; stage: commit review -> resolved
2014-10-09 21:44:19 python-dev set messages: + msg228919
2014-10-09 21:33:47 python-dev set nosy: + python-dev; messages: + msg228918
2013-06-16 23:04:00 belopolsky set nosy: + loewis; messages: + msg191295
2013-06-16 17:38:42 belopolsky set messages: + msg191275
2013-06-16 17:31:50 belopolsky set files: + bd092995907c.diff
2013-06-15 21:19:38 ezio.melotti set nosy: + georg.brandl, eric.araujo; type: enhancement; messages: + msg191236
2013-06-10 21:22:51 serhiy.storchaka set nosy: + ezio.melotti; components: + Documentation, Unicode
2013-06-10 18:51:28 belopolsky set nosy: + docs@python; messages: + msg190928; assignee: docs@python; keywords: + needs review; stage: commit review
2013-06-10 18:46:33 belopolsky set files: + 686836ad3042.diff; keywords: + patch
2013-06-10 18:44:32 belopolsky set hgrepos: + hgrepo197
2013-06-10 00:15:03 belopolsky create