This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Standardize documentation whitespace, formatting
Type: enhancement Stage: resolved
Components: Documentation Versions: Python 3.6, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: berker.peksag, docs@python, georg.brandl, jedwards, martin.panter, python-dev, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2015-04-12 04:58 by jedwards, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
docsdiff.diff jedwards, 2015-04-12 04:58 review
docsdiff.r2.diff jedwards, 2015-04-14 19:59 review
Messages (14)
msg240540 - (view) Author: James Edwards (jedwards) * Date: 2015-04-12 04:58
I realize this is a huge patch, I'd be happy to split it to multiple little 
patches (one per file, one per documentation directory, etc.) to make things 
easier.  Just let me know.

The patch attempts to do a few things (with exceptions, as noted below):

    1. Standardize leading whitespace in block bodies (4 spaces per level)
    2. Enforce >= 2 spaces between code and inline comment
    3. In interactive interpreter snippets, ensures lines meant to be entered
        are preceded by '>>> ' or '... '
    4. Inline whitespace standardization (following a liberal / lenient 
        (not strict) PEP-8 testing)

Scanning the documentation, extracting the code snippets, and testing them for
PEP-8 compliance was done with a script, but all adjustments were done by hand.

This means that there remain some code that fails the lenient PEP-8 checks that
I used.  For example, code at [1] ("How to write obfuscated one liners") 
obviously threw errors, but because of the context of the snippet and the use it 
serves in the documentation, it was left it unchanged.

Similarly, there are (intentionally) poorly formatted snippets in [2] ("Lexical
Analysis") that were also left unchanged.

Since these changes were applied by hand, I was able to ignore situations where
things that would normally raise warnings.  I erred on the side of leaving the
documentation examples unchanged, and strived to make only innocuous changes.

I made no attempt to change the functionality or semantics of any of the 
snippets.  The only changes I made were "harmless" formatting.  None of the
changes will affect the function or output of the snippets.

What the changes do, however, is give a consistency to the documentation that
will allow readers to become more comfortable with the structure of the language
and improve readability.

[1] https://docs.python.org/3/faq/programming.html#is-it-possible-to-write-obfuscated-one-liners-in-python
[2] https://docs.python.org/3/reference/lexical_analysis.html#indentation
[3] http://pep8.readthedocs.org/en/latest/intro.html

In addition to the checks that are ignored by default by the pep8[3] module for 
not being unanimously accepted:

    E121    # continuation line under-indented for hanging indent
    E123    # closing bracket does not match indentation of opening bracket’s line
    E126    # continuation line over-indented for hanging indent
    E133    # closing bracket is missing indentation
    E226    # missing whitespace around arithmetic operator
    E241    # multiple spaces after ‘,’ 
    E242    # tab after ‘,’
    E704    # multiple statements on one line (def)

The following checks were "globally" ignored in my script, though many others 
were conditionally ignored by the script or by myself.

    ignore.append('W292')       # no newline at end of file
    ignore.append('E302')       # expected 2 blank lines, found 0
    ignore.append('E401')       # multiple imports on one line
    ignore.append('W391')       # blank line at end of file
    ignore.append('E231')       # missing whitespace after ','
    ignore.append('E114')       # indentation is not multiple of four (comment)
    ignore.append('E116')       # unexpected indentation (comment)
    
While the patch diffstat is:

    67 files changed, 450 insertions(+), 412 deletions(-)

Ignoring all whitespace changes, the diffstat is only:

    10 files changed, 118 insertions(+), 114 deletions(-)

The majority of these remaining changes fix the inconsistency of interactive interpreter snippets, where, within the same snippet, some lines have '>>> ' while others are missing it, or the '... ' continuation prompt.

Let me know if you need anything from me (such as splitting this patch up) to
help get it merged.  Thanks.
msg240547 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2015-04-12 09:23
Patch looks good to me. Thanks! :)

I left a couple of comments on Rietveld: http://bugs.python.org/review/23921/

We can probably ignore the following type of changes too:

-      if pid == 0: # In a child process
+      if pid == 0:  # In a child process

-          status = '200 OK' # HTTP Status
-          headers = [('Content-type', 'text/plain')] # HTTP Headers
+          status = '200 OK'  # HTTP Status
+          headers = [('Content-type', 'text/plain')]  # HTTP Headers

-   dom1 = parse('c:\\temp\\mydata.xml') # parse an XML file by name
+   dom1 = parse('c:\\temp\\mydata.xml')  # parse an XML file by name

Also, perhaps we can add your script to Tools/scripts and integrate it with the "make patchcheck" command.
msg240557 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-04-12 14:17
I will have comments on this, but it may be a bit before I get to it.  In general I'm not sure of the value, though in the little bit I scanned so far there are at least a few changes that look worthwhile.
msg240826 - (view) Author: James Edwards (jedwards) * Date: 2015-04-14 01:40
Thanks Berker, I responded to most of your comments in rietveld.

A few of your comments suggested "we should get rid of X", and while I can't say I disagree, I really tried to limit the scope of the changes to whitespace and formatting.

As far as the script goes, it still requires significant human interaction.

Of the 1983 code snippets it extracts[1], 346 still raise warnings and require human verification.  The majority of these snippets aren't actually intended to be Python code.  Some of the snippets are C code, others are optparse/argparse command line examples, configparser config files, or just sections where the author wanted to use a literal block in the documentation.[2]

Of course, when the script triggers on one of these blocks, I ignore it.

Eventually, my goal was to have the script also verify interactive code blocks, much like doctest, but I thought a good intermediary step was to unify the whitespace.

As for the value, I understand the concern one might have before embarking on the process of standardizing the whitespace and formatting -- but given that it's mostly done and comes neatly packaged in a patch, I'm not sure I understand any remaining concerns.  In other words, if it's not something you would spend time on personally, because you have better ways to contribute/ spend your time, I absolutely understand that.  But given that the majority of the leg work is done by the patch, I hope you'll consider merging the changes.

In general the goal was to make the snippets, especially in the tutorial-type sections consistent, to allow the reader to become comfortable with the language instead of being distracted by varied formatting and to facilitate readability.  I hope this explains the motivation behind the effort.

[1] Some files were ignored from future runs after the snippets were validated, because they contained too many false positives, e.g. ctypes.rst.  So this count doesn't reflect the blocks in those ignored files.

[2] I wonder if explicitly indicating which rst literal blocks are intended to be python code would be of some benefit.  For example, adding such a tag could trigger the automatic formatting checks and, for interactive blocks, run the automatic doctest-type verification I was envisioning.  I may think more about the feasibility of this and bring it to python-ideas/python-dev.
msg240885 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-04-14 13:45
The patch mostly LGTM. Most changes are trivial, but the patch also fixes few bugs. It is nice that the patch adds empty lines after blocks, this makes copying the code to interactive interpreter easier. I left comments on Rietveld.

The most questionable changes are adding spaces before comments. But these changes are simple and can slightly increase readability.
msg240984 - (view) Author: James Edwards (jedwards) * Date: 2015-04-14 18:39
Thanks for the review Serhiy.

I'll prepare a revised patchset, given the comments from you and Berker and have it uploaded today.

Thanks again.
msg241003 - (view) Author: James Edwards (jedwards) * Date: 2015-04-14 19:59
Attaching revised patch per reviews.

Notable changes:

  * Reverted howto/curses.rst multiple inline statements -> multi-target 
    assignment (curses.rst is now unchanged)

  * Reverted library/subprocess.rst "==>" to "# ==>" changes since there are 
    other outstanding issues in the file.  (subprocess.rst is now unchanged)
msg257666 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-01-07 01:44
I do prefer extra space before a comment, otherwise it looks too much like some kind of “x # y” binary operator. So I think those changes are worthwhile.

I left some review comments, mainly minor suggestions. Most of the other changes look worthwhile. Others I don’t mind either way, but I can understand others may prefer them, such as lining up hanging indents for fixed-width fonts.

David: did you end up looking this? Are there any particular bits you think are not worth it?
msg257668 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-01-07 04:13
Sorry, I'm not going to have time to look at it.
msg261044 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-03-01 08:34
James, could you please update your patch to address Martin's comments? In general the patch LGTM and I hope to commit it in short time.
msg261098 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2016-03-02 07:04
I had no idea we had so many cases of wonky indentation :)

This looks good to me, apart from a few comments I left on rietveld.
msg265231 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-05-10 09:02
New changeset 2b492ea961c1 by Serhiy Storchaka in branch '3.5':
Issue #23921: Standardized documentation whitespace formatting.
https://hg.python.org/cpython/rev/2b492ea961c1

New changeset 909099686e6e by Serhiy Storchaka in branch 'default':
Issue #23921: Standardized documentation whitespace formatting.
https://hg.python.org/cpython/rev/909099686e6e
msg265233 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-05-10 10:46
New changeset be85897e6d58 by Serhiy Storchaka in branch '2.7':
Issue #23921: Standardized documentation whitespace formatting.
https://hg.python.org/cpython/rev/be85897e6d58
msg265234 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-10 10:48
Addressed Martin's Georg's and my comments, resolved conflicts, stripped trailing spaces, backported to 2.7.
History
Date User Action Args
2022-04-11 14:58:15adminsetgithub: 68109
2016-05-10 10:48:40serhiy.storchakasetstatus: open -> closed
versions: + Python 2.7
messages: + msg265234

resolution: fixed
stage: patch review -> resolved
2016-05-10 10:46:11python-devsetmessages: + msg265233
2016-05-10 09:02:53python-devsetnosy: + python-dev
messages: + msg265231
2016-05-10 08:37:27serhiy.storchakasetassignee: docs@python -> serhiy.storchaka
2016-03-02 07:04:32georg.brandlsetmessages: + msg261098
2016-03-01 08:34:19serhiy.storchakasetnosy: + georg.brandl

messages: + msg261044
versions: + Python 3.6, - Python 3.4
2016-01-07 04:14:17r.david.murraysetnosy: - r.david.murray
2016-01-07 04:13:58r.david.murraysetnosy: + r.david.murray
messages: + msg257668
2016-01-07 04:12:49r.david.murraysetnosy: - r.david.murray
2016-01-07 01:44:45martin.pantersetnosy: + martin.panter
messages: + msg257666
2015-04-14 19:59:59jedwardssetfiles: + docsdiff.r2.diff

messages: + msg241003
2015-04-14 18:39:54jedwardssetmessages: + msg240984
2015-04-14 13:45:41serhiy.storchakasetmessages: + msg240885
2015-04-14 01:40:25jedwardssetmessages: + msg240826
2015-04-12 14:17:06r.david.murraysetmessages: + msg240557
2015-04-12 09:23:18berker.peksagsetnosy: + berker.peksag

messages: + msg240547
versions: + Python 3.4, Python 3.5, - Python 3.6
2015-04-12 05:11:31serhiy.storchakasetnosy: + r.david.murray, serhiy.storchaka

stage: patch review
2015-04-12 04:59:28jedwardssetversions: + Python 3.6
2015-04-12 04:58:58jedwardscreate