Issue 23921: Standardize documentation whitespace, formatting

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/68109

classification

Title:	Standardize documentation whitespace, formatting
Type:	enhancement	Stage:	resolved
Components:	Documentation	Versions:	Python 3.6, Python 3.5, Python 2.7

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	serhiy.storchaka	Nosy List:	berker.peksag, docs@python, georg.brandl, jedwards, martin.panter, python-dev, serhiy.storchaka
Priority:	normal	Keywords:	patch

Created on 2015-04-12 04:58 by jedwards, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
docsdiff.diff	jedwards, 2015-04-12 04:58		review
docsdiff.r2.diff	jedwards, 2015-04-14 19:59		review

Messages (14)
msg240540 - (view)	Author: James Edwards (jedwards) *	Date: 2015-04-12 04:58
I realize this is a huge patch, I'd be happy to split it to multiple little patches (one per file, one per documentation directory, etc.) to make things easier. Just let me know. The patch attempts to do a few things (with exceptions, as noted below): 1. Standardize leading whitespace in block bodies (4 spaces per level) 2. Enforce >= 2 spaces between code and inline comment 3. In interactive interpreter snippets, ensures lines meant to be entered are preceded by '>>> ' or '... ' 4. Inline whitespace standardization (following a liberal / lenient (not strict) PEP-8 testing) Scanning the documentation, extracting the code snippets, and testing them for PEP-8 compliance was done with a script, but all adjustments were done by hand. This means that there remain some code that fails the lenient PEP-8 checks that I used. For example, code at [1] ("How to write obfuscated one liners") obviously threw errors, but because of the context of the snippet and the use it serves in the documentation, it was left it unchanged. Similarly, there are (intentionally) poorly formatted snippets in [2] ("Lexical Analysis") that were also left unchanged. Since these changes were applied by hand, I was able to ignore situations where things that would normally raise warnings. I erred on the side of leaving the documentation examples unchanged, and strived to make only innocuous changes. I made no attempt to change the functionality or semantics of any of the snippets. The only changes I made were "harmless" formatting. None of the changes will affect the function or output of the snippets. What the changes do, however, is give a consistency to the documentation that will allow readers to become more comfortable with the structure of the language and improve readability. [1] https://docs.python.org/3/faq/programming.html#is-it-possible-to-write-obfuscated-one-liners-in-python [2] https://docs.python.org/3/reference/lexical_analysis.html#indentation [3] http://pep8.readthedocs.org/en/latest/intro.html In addition to the checks that are ignored by default by the pep8[3] module for not being unanimously accepted: E121 # continuation line under-indented for hanging indent E123 # closing bracket does not match indentation of opening bracket’s line E126 # continuation line over-indented for hanging indent E133 # closing bracket is missing indentation E226 # missing whitespace around arithmetic operator E241 # multiple spaces after ‘,’ E242 # tab after ‘,’ E704 # multiple statements on one line (def) The following checks were "globally" ignored in my script, though many others were conditionally ignored by the script or by myself. ignore.append('W292') # no newline at end of file ignore.append('E302') # expected 2 blank lines, found 0 ignore.append('E401') # multiple imports on one line ignore.append('W391') # blank line at end of file ignore.append('E231') # missing whitespace after ',' ignore.append('E114') # indentation is not multiple of four (comment) ignore.append('E116') # unexpected indentation (comment) While the patch diffstat is: 67 files changed, 450 insertions(+), 412 deletions(-) Ignoring all whitespace changes, the diffstat is only: 10 files changed, 118 insertions(+), 114 deletions(-) The majority of these remaining changes fix the inconsistency of interactive interpreter snippets, where, within the same snippet, some lines have '>>> ' while others are missing it, or the '... ' continuation prompt. Let me know if you need anything from me (such as splitting this patch up) to help get it merged. Thanks.
msg240547 - (view)	Author: Berker Peksag (berker.peksag) *	Date: 2015-04-12 09:23
Patch looks good to me. Thanks! :) I left a couple of comments on Rietveld: http://bugs.python.org/review/23921/ We can probably ignore the following type of changes too: - if pid == 0: # In a child process + if pid == 0: # In a child process - status = '200 OK' # HTTP Status - headers = [('Content-type', 'text/plain')] # HTTP Headers + status = '200 OK' # HTTP Status + headers = [('Content-type', 'text/plain')] # HTTP Headers - dom1 = parse('c:\\temp\\mydata.xml') # parse an XML file by name + dom1 = parse('c:\\temp\\mydata.xml') # parse an XML file by name Also, perhaps we can add your script to Tools/scripts and integrate it with the "make patchcheck" command.
msg240557 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2015-04-12 14:17
I will have comments on this, but it may be a bit before I get to it. In general I'm not sure of the value, though in the little bit I scanned so far there are at least a few changes that look worthwhile.
msg240826 - (view)	Author: James Edwards (jedwards) *	Date: 2015-04-14 01:40
Thanks Berker, I responded to most of your comments in rietveld. A few of your comments suggested "we should get rid of X", and while I can't say I disagree, I really tried to limit the scope of the changes to whitespace and formatting. As far as the script goes, it still requires significant human interaction. Of the 1983 code snippets it extracts[1], 346 still raise warnings and require human verification. The majority of these snippets aren't actually intended to be Python code. Some of the snippets are C code, others are optparse/argparse command line examples, configparser config files, or just sections where the author wanted to use a literal block in the documentation.[2] Of course, when the script triggers on one of these blocks, I ignore it. Eventually, my goal was to have the script also verify interactive code blocks, much like doctest, but I thought a good intermediary step was to unify the whitespace. As for the value, I understand the concern one might have before embarking on the process of standardizing the whitespace and formatting -- but given that it's mostly done and comes neatly packaged in a patch, I'm not sure I understand any remaining concerns. In other words, if it's not something you would spend time on personally, because you have better ways to contribute/ spend your time, I absolutely understand that. But given that the majority of the leg work is done by the patch, I hope you'll consider merging the changes. In general the goal was to make the snippets, especially in the tutorial-type sections consistent, to allow the reader to become comfortable with the language instead of being distracted by varied formatting and to facilitate readability. I hope this explains the motivation behind the effort. [1] Some files were ignored from future runs after the snippets were validated, because they contained too many false positives, e.g. ctypes.rst. So this count doesn't reflect the blocks in those ignored files. [2] I wonder if explicitly indicating which rst literal blocks are intended to be python code would be of some benefit. For example, adding such a tag could trigger the automatic formatting checks and, for interactive blocks, run the automatic doctest-type verification I was envisioning. I may think more about the feasibility of this and bring it to python-ideas/python-dev.
msg240885 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2015-04-14 13:45
The patch mostly LGTM. Most changes are trivial, but the patch also fixes few bugs. It is nice that the patch adds empty lines after blocks, this makes copying the code to interactive interpreter easier. I left comments on Rietveld. The most questionable changes are adding spaces before comments. But these changes are simple and can slightly increase readability.
msg240984 - (view)	Author: James Edwards (jedwards) *	Date: 2015-04-14 18:39
Thanks for the review Serhiy. I'll prepare a revised patchset, given the comments from you and Berker and have it uploaded today. Thanks again.
msg241003 - (view)	Author: James Edwards (jedwards) *	Date: 2015-04-14 19:59
Attaching revised patch per reviews. Notable changes: * Reverted howto/curses.rst multiple inline statements -> multi-target assignment (curses.rst is now unchanged) * Reverted library/subprocess.rst "==>" to "# ==>" changes since there are other outstanding issues in the file. (subprocess.rst is now unchanged)
msg257666 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-01-07 01:44
I do prefer extra space before a comment, otherwise it looks too much like some kind of “x # y” binary operator. So I think those changes are worthwhile. I left some review comments, mainly minor suggestions. Most of the other changes look worthwhile. Others I don’t mind either way, but I can understand others may prefer them, such as lining up hanging indents for fixed-width fonts. David: did you end up looking this? Are there any particular bits you think are not worth it?
msg257668 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2016-01-07 04:13
Sorry, I'm not going to have time to look at it.
msg261044 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-03-01 08:34
James, could you please update your patch to address Martin's comments? In general the patch LGTM and I hope to commit it in short time.
msg261098 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2016-03-02 07:04
I had no idea we had so many cases of wonky indentation :) This looks good to me, apart from a few comments I left on rietveld.
msg265231 - (view)	Author: Roundup Robot (python-dev)	Date: 2016-05-10 09:02
New changeset 2b492ea961c1 by Serhiy Storchaka in branch '3.5': Issue #23921: Standardized documentation whitespace formatting. https://hg.python.org/cpython/rev/2b492ea961c1 New changeset 909099686e6e by Serhiy Storchaka in branch 'default': Issue #23921: Standardized documentation whitespace formatting. https://hg.python.org/cpython/rev/909099686e6e
msg265233 - (view)	Author: Roundup Robot (python-dev)	Date: 2016-05-10 10:46
New changeset be85897e6d58 by Serhiy Storchaka in branch '2.7': Issue #23921: Standardized documentation whitespace formatting. https://hg.python.org/cpython/rev/be85897e6d58
msg265234 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-05-10 10:48
Addressed Martin's Georg's and my comments, resolved conflicts, stripped trailing spaces, backported to 2.7.

History
Date	User	Action	Args
2022-04-11 14:58:15	admin	set	github: 68109
2016-05-10 10:48:40	serhiy.storchaka	set	status: open -> closed versions: + Python 2.7 messages: + msg265234 resolution: fixed stage: patch review -> resolved
2016-05-10 10:46:11	python-dev	set	messages: + msg265233
2016-05-10 09:02:53	python-dev	set	nosy: + python-dev messages: + msg265231
2016-05-10 08:37:27	serhiy.storchaka	set	assignee: docs@python -> serhiy.storchaka
2016-03-02 07:04:32	georg.brandl	set	messages: + msg261098
2016-03-01 08:34:19	serhiy.storchaka	set	nosy: + georg.brandl messages: + msg261044 versions: + Python 3.6, - Python 3.4
2016-01-07 04:14:17	r.david.murray	set	nosy: - r.david.murray
2016-01-07 04:13:58	r.david.murray	set	nosy: + r.david.murray messages: + msg257668
2016-01-07 04:12:49	r.david.murray	set	nosy: - r.david.murray
2016-01-07 01:44:45	martin.panter	set	nosy: + martin.panter messages: + msg257666
2015-04-14 19:59:59	jedwards	set	files: + docsdiff.r2.diff messages: + msg241003
2015-04-14 18:39:54	jedwards	set	messages: + msg240984
2015-04-14 13:45:41	serhiy.storchaka	set	messages: + msg240885
2015-04-14 01:40:25	jedwards	set	messages: + msg240826
2015-04-12 14:17:06	r.david.murray	set	messages: + msg240557
2015-04-12 09:23:18	berker.peksag	set	nosy: + berker.peksag messages: + msg240547 versions: + Python 3.4, Python 3.5, - Python 3.6
2015-04-12 05:11:31	serhiy.storchaka	set	nosy: + r.david.murray, serhiy.storchaka stage: patch review
2015-04-12 04:59:28	jedwards	set	versions: + Python 3.6
2015-04-12 04:58:58	jedwards	create