classification
Title: Merge Doc/ACKS.txt names into Misc/ACKS
Type: enhancement Stage: resolved
Components: Documentation Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ezio.melotti Nosy List: asvetlov, chris.jerdonek, docs@python, eric.araujo, ezio.melotti, georg.brandl, jcea, loewis, pitrou, python-dev, r.david.murray
Priority: normal Keywords: patch

Created on 2012-07-23 20:07 by chris.jerdonek, last changed 2012-09-13 23:16 by chris.jerdonek. This issue is now closed.

Files
File name Uploaded Description
merge-acks.py chris.jerdonek, 2012-07-24 02:41
issue-15437-sample-output.patch chris.jerdonek, 2012-07-24 02:44
issue-15437-script-output-2.patch chris.jerdonek, 2012-08-06 22:31
merge-acks-2.py chris.jerdonek, 2012-08-06 22:36
merge-acks-3.py chris.jerdonek, 2012-09-09 20:49
Messages (23)
msg166247 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-23 20:07
This issue is to merge the Doc/ACKS and Misc/ACKS files as discussed here:

http://mail.python.org/pipermail/python-dev/2012-July/121096.html
msg166248 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-23 20:32
I would be happy to prepare a patch.  I can upload a script to this issue that the committer can then run on the latest Misc/ACKS and Doc/ACKS.txt.

The script would preserve the ordering of Misc/ACKS.  It would iterate through the names in Doc/ACKS.txt and insert them in Misc/ACKS at the appropriate location.  Duplicates would not be inserted.
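
A minimal sketch of the insertion step described above, not the attached script itself. The names `merge_names` and `last_word` are hypothetical, and treating the last word of a name as the last name is an assumption made for illustration. The existing list is never reordered, so the result differs from the input only by insertions.

```python
def last_word(name):
    """Hypothetical sort key: the last word of the name, lower-cased."""
    return name.split()[-1].lower()


def merge_names(existing, new_names, key=last_word):
    """Return a copy of `existing` with `new_names` inserted in rough order.

    Existing entries keep their relative order; exact duplicates are skipped.
    """
    merged = list(existing)
    for name in new_names:
        if name in merged:
            continue  # exact duplicate: do not insert
        for index, other in enumerate(merged):
            # Insert before the first entry that sorts after the new name.
            if key(other) > key(name):
                merged.insert(index, name)
                break
        else:
            merged.append(name)  # sorts after everything already present
    return merged


misc_acks = ["John Redford", "Terry Reedy", "Gareth Rees"]
doc_acks = ["Terry Reedy", "Terry J. Reedy"]
print(merge_names(misc_acks, doc_acks))
```

Because the scan is linear rather than a binary search, it does not require the existing list to be strictly sorted, which matches the "rough" ordering of Misc/ACKS.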
msg166249 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-07-23 20:41
Georg, do you think this is ok for all 3 branches?
msg166251 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-07-23 22:29
This was indeed proposed once or twice before; I can’t search my archive right now but I think I remember Georg saying that he was OK as long as the docs displayed Misc/ACKS.  This means checking the rst syntax of Misc/ACKS and using the right include directive.
msg166260 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-24 02:41
Attached is a script that seems to do the job, except for the rst formatting, which can be added later.  I left the formatting out so that the diff shows only what has changed.

In the process of doing this, I found that Jeff McNeil is far out of order in Misc/ACKS, and possibly also Hugo Lopes Tavares and Xavier de Gaye, depending on what alphabetization rules should be used.

The script contains logic to collect the non-ascii characters that appear in people's names, so that non-ascii characters can be approximated by ascii characters for ordering purposes (which seems to be how it is done now in some cases).

In a subsequent comment, I will attach a diff that results from running the script, so you can see what effect it has on Misc/ACKS.
msg166261 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-24 02:44
Attaching sample output of running the script.
msg166281 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-24 12:01
I created a new issue 15439 for including the combined Misc/ACKS into the documentation (as Éric mentioned) because the nature of that discussion is different, and because the changes will be easier to observe and understand if committed separately.
msg166291 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-07-24 13:45
I'm not clear if your script is trying to do this, but there is no way to automatically alphabetize the file.  That's why it says "rough" alphabetic order.  The issue is that different languages alphabetize different letters in different places.  We try to respect the alphabetization of the source language as much as practical...which means there is no algorithm that can do the sorting, since the names in the file do not carry explicit language information.
msg166294 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-07-24 14:02
Well, the script output looks good (apart from a few duplicates which can be resolved by hand, e.g. "Terry Reedy" vs. "Terry J. Reedy").
msg166295 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-24 14:15
I did think through those issues and made a special effort to address them in the script.

For starters, the script does not change the order of any names in Misc/ACKS.  This is to preserve the existing rough alphabetical ordering, and to ensure that the diff consists only of insertions (for easier manual checking, if desired).

As for inserting new names in rough alphabetical order, I dealt with different language characters as follows.  The script has a translation table to map non-ascii characters to ascii characters for sorting purposes.  Currently, that table is as follows (I'm not sure if all of these characters will render on the page):

NON_ASCII = "ÅÉØáäåæçéëíñóôöùúüćęŁńŽКМСабгекнорш“”"
ASCII_SUB = 'AEOaaaaceeinooouuuceLnZKMCabrekhopw""'

This mapping can easily be modified if my initial choices are not the best.  As an early step, the script collects all non-ascii characters that appear in all names to make sure the translation table is up to date (exiting with a message otherwise).
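
A sketch of how such a table could be applied, assuming Python 3 and the built-in `str.maketrans` and `str.translate`. The two strings are the ones quoted above; the helper names `ascii_sort_form` and `unmapped_chars` are hypothetical and are not taken from the attached script.

```python
NON_ASCII = "ÅÉØáäåæçéëíñóôöùúüćęŁńŽКМСабгекнорш“”"
ASCII_SUB = 'AEOaaaaceeinooouuuceLnZKMCabrekhopw""'

# str.maketrans requires both strings to have the same length.
TRANSLATION = str.maketrans(NON_ASCII, ASCII_SUB)


def ascii_sort_form(name):
    """Approximate a name with ASCII characters, for ordering purposes only."""
    return name.translate(TRANSLATION)


def unmapped_chars(names):
    """Return non-ASCII characters in `names` that the table does not cover."""
    return sorted({ch for name in names for ch in name
                   if ord(ch) > 127 and ch not in NON_ASCII})


names = ["Éric Araujo", "Martin v. Löwis"]
missing = unmapped_chars(names)
if missing:
    # Mirrors the "exit with a message" step described above.
    raise SystemExit("translation table is missing: " + "".join(missing))
print([ascii_sort_form(name) for name in names])
```

The translated form would be used only as a sort key; the names written to the file keep their original spelling.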

When I said "Jeff McNeil" is out of order, that was because the name appears after "Jeff Epler" but before "Tom Epperly".  The script maintains a list of "out of order" names like this to skip when inserting, to prevent insertions from being out of rough alphabetical order.

If different languages use a different ordering on the word level, the script will not handle that, however.  It only orders lexicographically by last name, and then first name(s).
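
The ordering rule and the "out of order" list can be illustrated as follows. This is a sketch under stated assumptions: `name_key` and `find_out_of_order` are hypothetical names, the last whitespace-separated word is taken as the last name (so "Xavier de Gaye" sorts under "gaye"), and a name is flagged when it sorts after the entry that follows it, which is only one possible heuristic.

```python
def name_key(name):
    """Sort key: last name first, then the first name(s), all lower-cased."""
    parts = name.lower().split()
    return (parts[-1], parts[:-1])


def find_out_of_order(names):
    """Return entries that sort after their immediate successor."""
    return [current for current, following in zip(names, names[1:])
            if name_key(current) > name_key(following)]


acks = ["Jeff Epler", "Jeff McNeil", "Tom Epperly"]
print(find_out_of_order(acks))
```

Entries flagged this way would be skipped when looking for an insertion point, so that one misplaced name does not pull new names out of rough alphabetical order.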

Much of this information is spelled out in the script's docstring.
msg166296 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-24 14:20
That is correct, Antoine.  Duplicates need to be removed by hand.

To assist in this process, the script currently prints "possible duplicates" to stdout after running.  However, the script could easily be modified to display an in-line indicator before possible duplicates to make this manual step easier, e.g.:

 John Redford
 Terry Reedy
+>>> Terry J. Reedy
 Gareth Rees

Currently, possible duplicates are determined based on whether the last name matches an already existing last name.
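
A sketch of that last-name check, assuming the last word of each name is its last name. The function name `possible_duplicates` and the name "Jane Doe" are hypothetical, used only for illustration.

```python
def possible_duplicates(existing, new_names):
    """Return new names whose last name already appears in `existing`.

    Exact matches are excluded because they are never inserted at all.
    """
    known_last_names = {name.split()[-1].lower() for name in existing}
    return [name for name in new_names
            if name not in existing
            and name.split()[-1].lower() in known_last_names]


misc_acks = ["John Redford", "Terry Reedy", "Gareth Rees"]
doc_acks = ["Terry J. Reedy", "Terry Reedy", "Jane Doe"]
for name in possible_duplicates(misc_acks, doc_acks):
    print(">>> " + name)
```

A match on last name alone will also flag unrelated people who share a surname, which is why the final decision is left to manual review.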
msg166298 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-07-24 14:22
> To assist in this process, the script currently prints "possible
> duplicates" to stdout after running.  However, the script could easily
> be modified to display an in-line indicator before possible duplicates
> to make this manual step easier, e.g.:
> 
>  John Redford
>  Terry Reedy
> +>>> Terry J. Reedy
>  Gareth Rees

Well, no need to be perfectionist IMO. The merging will only be done
once (thrice if we count all branches :-)).
msg166321 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-07-24 19:09
Also, if you want to do phonetic translation of non-ASCII, then абгекнор really matches abgeknor, and ш is transliterated to "sh" in English (IIUC) (to "sch" in German).
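
For illustration, a one-to-one character table cannot express a mapping such as ш to "sh", but the dictionary form of `str.maketrans` can. This is a sketch covering only the lower-case letters mentioned above; it is not a complete transliteration scheme, and `TRANSLIT` and `transliterate` are hypothetical names.

```python
# Single-letter pairs from the comment above: абгекнор -> abgeknor.
TRANSLIT = dict(zip("абгекнор", "abgeknor"))
# A dict-based table may map one character to several.
TRANSLIT["ш"] = "sh"

TABLE = str.maketrans(TRANSLIT)


def transliterate(text):
    """Replace the covered Cyrillic letters; leave everything else as-is."""
    return text.translate(TABLE)


print(transliterate("банк"))
print(transliterate("ш"))
```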

But I agree that this is best done manually. What matters is what the script produces; the script certainly won't make it into Python's source code. I'm sure Chris had fun writing it.
msg166328 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-24 19:48
Yes, I did. Even though it is throw-away.

By the way, I'm taking Antoine's advice to avoid perfectionism on this. Otherwise I'd include your suggestion re: the special characters. :)
msg166411 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-07-25 16:41
I don't think the docs should display Misc/ACKS. Instead, I propose the following wording

"Many people have contributed to the Python language, the Python standard library, and the Python documentation. See Misc/ACKS in the Python source distribution for a partial list of contributors"

It might be useful to link "Misc/ACKS" to http://hg.python.org/cpython/file/default/Misc/ACKS
(http://hg.python.org/cpython/raw-file/default/Misc/ACKS would be better if hgweb didn't serve it as application/octet-stream)
msg166420 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-07-25 18:55
We can just use :source:`Misc/ACKS` and it will create a link to hgweb (the colored HTML page, not the raw file).
msg167588 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-08-06 22:31
Is this issue awaiting feedback from anyone else before it can proceed further?  (I mean just this issue, not issue 15439, which covers the adjustments to the docs.)

I am attaching an updated diff, generated by running the script again against the tip.  The script is now modified to prefix names whose last name matches an existing one with '>>> '.
msg167589 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-08-06 22:36
For completeness, I am attaching the modified version of the script that was used to generate the latest output.
msg170134 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-09-09 20:49
I was reminded of this issue by the following e-mail today:

http://mail.python.org/pipermail/python-dev/2012-September/121639.html

I updated the script I attached earlier to ensure that it can also be run against the names in 2.7 (attaching now as script #3).  I also checked that this latest script can still be run against 3.2 and default with the names that have been added since the last time I checked.

Let me know if you would like any assistance in how to run the script and what to check for, etc.
msg170425 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-09-13 00:12
Just an FYI that Ezio asked Georg about this issue on IRC yesterday or the day before, and Georg said +1.
msg170462 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-09-13 22:59
New changeset 48185b0f7b8a by Ezio Melotti in branch '3.2':
#15437, #15439: merge Doc/ACKS.txt with Misc/ACKS and modify Doc/about.rst accordingly.
http://hg.python.org/cpython/rev/48185b0f7b8a

New changeset 2b4a89f82485 by Ezio Melotti in branch 'default':
#15437, #15439: merge with 3.2.
http://hg.python.org/cpython/rev/2b4a89f82485

New changeset 76dd082d332e by Ezio Melotti in branch '2.7':
#15437, #15439: merge Doc/ACKS.txt with Misc/ACKS and modify Doc/about.rst accordingly.
http://hg.python.org/cpython/rev/76dd082d332e
msg170465 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-09-13 23:11
Fixed, thanks for the script!
msg170466 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-09-13 23:16
Thanks for committing, Ezio!
History
Date User Action Args
2012-09-13 23:16:39 chris.jerdonek set messages: + msg170466
2012-09-13 23:11:20 ezio.melotti set status: open -> closed; messages: + msg170465; assignee: docs@python -> ezio.melotti; resolution: fixed; stage: needs patch -> resolved
2012-09-13 22:59:31 python-dev set nosy: + python-dev; messages: + msg170462
2012-09-13 00:12:35 chris.jerdonek set messages: + msg170425
2012-09-09 20:49:48 chris.jerdonek set files: + merge-acks-3.py; messages: + msg170134
2012-08-13 19:32:56 chris.jerdonek set nosy: + asvetlov
2012-08-06 22:36:42 chris.jerdonek set files: + merge-acks-2.py; messages: + msg167589
2012-08-06 22:31:29 chris.jerdonek set files: + issue-15437-script-output-2.patch; messages: + msg167588
2012-07-25 18:55:23 eric.araujo set messages: + msg166420
2012-07-25 16:41:59 loewis set messages: + msg166411
2012-07-24 19:48:47 chris.jerdonek set messages: + msg166328
2012-07-24 19:09:41 loewis set keywords: - easy; nosy: + loewis; messages: + msg166321
2012-07-24 14:22:35 pitrou set messages: + msg166298
2012-07-24 14:20:12 chris.jerdonek set messages: + msg166296
2012-07-24 14:15:03 chris.jerdonek set messages: + msg166295
2012-07-24 14:02:59 pitrou set messages: + msg166294
2012-07-24 13:45:35 r.david.murray set nosy: + r.david.murray; messages: + msg166291
2012-07-24 12:01:34 chris.jerdonek set messages: + msg166281; title: Merge Doc/ACKS and Misc/ACKS -> Merge Doc/ACKS.txt names into Misc/ACKS
2012-07-24 11:53:26 chris.jerdonek link issue15439 dependencies
2012-07-24 02:44:26 chris.jerdonek set files: + issue-15437-sample-output.patch; keywords: + patch; messages: + msg166261
2012-07-24 02:41:19 chris.jerdonek set files: + merge-acks.py; messages: + msg166260
2012-07-23 23:15:00 ezio.melotti set nosy: + ezio.melotti; type: enhancement; stage: needs patch
2012-07-23 23:12:15 jcea set nosy: + jcea
2012-07-23 22:29:29 eric.araujo set nosy: + eric.araujo; messages: + msg166251
2012-07-23 20:41:36 pitrou set nosy: + georg.brandl, pitrou; messages: + msg166249; versions: + Python 2.7, Python 3.2, Python 3.3
2012-07-23 20:32:39 chris.jerdonek set messages: + msg166248
2012-07-23 20:07:33 chris.jerdonek create