Title: Sphinx incompatible markup in the standard library docstrings
Type: Stage: resolved
Components: Documentation Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Patrick Lehmann, docs@python, georg.brandl, lukasz.langa, r.david.murray, rhettinger, serhiy.storchaka, terry.reedy
Priority: normal Keywords: easy, patch

Created on 2016-11-16 00:59 by Patrick Lehmann, last changed 2018-06-18 21:04 by terry.reedy. This issue is now closed.

File name | Uploaded | Description
docstring_markup.patch | Patrick Lehmann, 2016-11-16 23:23 | markup changes in docstrings
Messages (28)
msg280907 - (view) Author: Patrick Lehmann (Patrick Lehmann) * Date: 2016-11-16 00:59
Why does e.g. configparser.ConfigParser contain doc strings with Sphinx incompatible markup?

The markup starts with back-tick, but ends with a single quote.


Sphinx writes these messages:
D:\git\PoC\py\lib\ExtendedConfigParser\ of lib.ExtendedConfigParser.ExtendedConfigParser.read_file:3: WARNING: Inline interpreted text or phrase reference start-string without end-string.

Note: ExtendedConfigParser is a class derived from configparser.ConfigParser. Inherited methods get documented too.

Btw. I have some improvements for this class, how can I contribute them? Who is the maintainer for this class? Please contact me:

The improved version is available at GitHub:
msg281005 - (view) Author: Patrick Lehmann (Patrick Lehmann) * Date: 2016-11-16 22:50
How can I supply a fix?

I have a branch with lots of fixes.

Why don't you accept pull requests via GitHub?

Kind regards
msg281007 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-11-16 23:00
We will accept github pull requests in the future (the transition is underway).

For now, you can create a diff file (using hg diff by preference, but git diff will work) and upload it to the issue.  For this issue, please only upload the docstring changes.  For other enhancement suggestions, open separate issues.  (This would be true even if we were accepting pull requests).

The existing docstring markup is probably a remnant of the days when the documentation was written in LaTeX.

Lukasz Langa is the current maintainer of this module.
msg281010 - (view) Author: Patrick Lehmann (Patrick Lehmann) * Date: 2016-11-16 23:23
Here is the patch file created with:
PS> git diff > docstring_markup.patch

This patch file affects all files with this markup in the CPython repository.

Kind regards
msg281162 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2016-11-18 20:16
As far as I looked, the patch changes `xyz' in docstrings and quotes to ``xyz``.  A rst expert should verify that this is correct.  In printed strings, `xyz' is changed to 'xyz', which I consider to be correct.

Before applying this, I would want to review in Rietveld, with side by side diff and changes color marked.  However, Rietveld does not like the patch and there is no 'review' button.  I thought it might be the git format, but I found another git patch that did have a review button.  When I downloaded and tried to apply (to default), hg says that there is no diff.  I don't see the problem, but we cannot currently use a patch that does not apply in hg.

Patrick, in order to use patches, we require Contributor Agreements.
There is an electronic form that makes submission easy.
msg281165 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-11-18 20:36
I think that we do not generally use ReST markup in our docstrings.  So replacing `x' with 'x' would be more correct, I think.  In many cases the quotes could just be omitted entirely.

The patch command says:

  patch unexpectedly ends in middle of line
  patch: **** Only garbage was found in the patch input.
msg281197 - (view) Author: Patrick Lehmann (Patrick Lehmann) * Date: 2016-11-19 03:30

I used this regexp on all files:
match pattern: `([A-Za-z0-9_]+)'
replace pattern ``\1``
I assumed that only identifiers were quoted in such a way. I think my editor found around 139 matches in the whole CPython repository.
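For illustration, the search-and-replace described above can be sketched in Python. This is only a sketch, not the script that was actually run (the message says an editor was used); the function name and the sample string are assumptions.

```python
import re

# Pattern from the message: a backtick, an identifier, then a single quote.
TEX_QUOTED_IDENTIFIER = re.compile(r"`([A-Za-z0-9_]+)'")


def to_rst_literal(text: str) -> str:
    """Rewrite `name' as ``name`` (hypothetical helper name)."""
    return TEX_QUOTED_IDENTIFIER.sub(r"``\1``", text)


# Illustrative sample text, not a real stdlib docstring.
sample = "Read and parse `filenames' using `encoding'."
print(to_rst_literal(sample))
```

Because the pattern only accepts identifier characters, a quoted phrase containing spaces is left untouched. Swapping the replacement for r"'\1'" would instead give the plain-quote form suggested in msg281165.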

I found some of this markup in non-docstring strings, which I reverted, as far as I found them, by manually reviewing all changed files.

For a colored diff, see my Git branch:
There is also a PR-, commit-, and line-based comment feature on GitHub.

How you solve it is up to you, but I would like to get rid of hundreds of warnings in my Sphinx runs when modules are inheriting code (and docstrings) from Python.

Kind regards
msg281198 - (view) Author: Patrick Lehmann (Patrick Lehmann) * Date: 2016-11-19 03:39
.... I signed the CLA.
msg281261 - (view) Author: Patrick Lehmann (Patrick Lehmann) * Date: 2016-11-20 11:10
I also found some docstrings using double back-tick plus double single quotes.

For example: ``x.y = v'' in the function setattr(...).
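A rough way to locate both quoting styles in a piece of text is sketched below. The patterns, the function name, and the sample string are assumptions made for illustration; they were not part of the uploaded patch.

```python
import re

# `text' : one backtick, no quote characters inside, one closing single quote.
SINGLE = re.compile(r"(?<!`)`([^`'\n]+)'(?!')")
# ``text'' : two backticks, then two closing single quotes.
DOUBLE = re.compile(r"``([^`'\n]+)''")


def find_tex_quotes(text):
    """Return (style, quoted_text) pairs found in text (hypothetical helper)."""
    hits = [("single", m.group(1)) for m in SINGLE.finditer(text)]
    hits += [("double", m.group(1)) for m in DOUBLE.finditer(text)]
    return hits


# Illustrative sample text, not copied from a real docstring.
sample = "see `name' and ``x.y = v''"
for style, quoted in find_tex_quotes(sample):
    print(style, repr(quoted))
```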
msg284206 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2016-12-28 23:19
Marking as a stdlib-wide issue since it's not configparser-specific.

FYI, as you probably noticed, the `term' syntax predates Sphinx and is used in lots of places. While I think it would be nice to fix it, it's a big diff.
msg319833 - (view) Author: Patrick Lehmann (Patrick Lehmann) * Date: 2018-06-17 22:23
Any progress on that issue?

1.5 years passed by and it should be possible to fix the Python documentation in that time, right?
msg319837 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2018-06-17 23:53
Thanks for coming back to this.

We're accepting PRs via github now, so the next step now would be to make it into a PR.  

Sometimes things just get forgotten and you have to nudge them to get them moving (see the devguide for guidelines about when it is appropriate to nudge an issue).
msg319838 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-06-18 00:33
I agree with David about replacing `x' with 'x' instead of ``x``, so please make this change.  Do you know what ``x.y = v'' is supposed to mean?

David, do you want one PR with 139+ changes, or should we start with fewer?  Patrick, are these usages scattered over numerous files or concentrated in a few?
msg319839 - (view) Author: Patrick Lehmann (Patrick Lehmann) * Date: 2018-06-18 00:41
Against what branch should I create the PR?

It was a huge number of changes.
I think I'll create multiple PRs to ease the review.
msg319845 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-06-18 01:08
master, and maybe backport from there, unless the PR never touches master.

I wonder if we should bring this up on pydev.  Without discussion and agreement, someone might object just because there are so many changes.  Or someone might say that they want the 'illegal' markup kept.

I see that the one docstring you linked to,
had 9 occurrences, 1 for each parameter.

I never see any warnings.  It must be that our build system somewhere discards them.
msg319847 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-06-18 01:39
This should probably be discussed on python-dev.  In particular, changing from single backticks to double backticks is something that makes Sphinx happy but will uglify the docstrings for all other ways of viewing the docs.

Also, note that most of our docs aren't autogenerated from the docstrings so we've never had any particular need to impose Sphinx markup requirements for docstrings.
msg319848 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-06-18 01:45
In my last comment above, I forgot that this issue is about *docstrings*.  We do not officially process docstrings with Sphinx, so there are no warnings to be suppressed.

PEP 8, which covers style for the stdlib, refers to
Neither says anything about markup and, as far as I remember, there should not be any in the stdlib. PEP 257 gives this example:

def complex(real=0.0, imag=0.0):
    """Form a complex number.

    Keyword arguments:
    real -- the real part (default 0.0)
    imag -- the imaginary part (default 0.0)
    """
Here, parameter names are indicated by the formatting, not by markup.  If `' is used in such lists, it should just be deleted.  I believe 'name' is sometimes used in running text.

The help() function prints a docstring as is.
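As a small sketch of that last point: pydoc, which help() is built on, passes the docstring text through without interpreting any markup. The function name make_complex is a stand-in chosen here so that the built-in complex is not shadowed.

```python
import inspect
import pydoc


def make_complex(real=0.0, imag=0.0):
    """Form a complex number.

    Keyword arguments:
    real -- the real part (default 0.0)
    imag -- the imaginary part (default 0.0)
    """
    return complex(real, imag)


# inspect.getdoc() strips the common indentation but leaves the text alone.
doc = inspect.getdoc(make_complex)

# render_doc() returns the text that help() would display; the plaintext
# renderer avoids the overstrike sequences used for bold in a pager.
rendered = pydoc.render_doc(make_complex, renderer=pydoc.plaintext)
print(rendered)
```

Any `name' or ``name`` markup in a docstring would appear in this output exactly as typed.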
msg319892 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2018-06-18 15:39
Right, my opinion is that we shouldn't be putting markup in docstrings.  They are (from our point of view) pure text.

I don't know if discussion on python-dev is warranted, it seems like a fairly uncontroversial change, since it brings the docstrings in question into compliance with our general practice in the majority of the stdlib.  Unless my impression about that is wrong :)

I don't have an opinion on multiple versus single PR for this.
msg319893 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2018-06-18 15:55
By the way, in case anyone is curious, I'm pretty sure that markup is left over from the days when tex/latex was what docs were *written* in.
msg319897 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-06-18 16:29
What is the problem? Docstrings are not written in the reStructuredText format in general. If Sphinx complains about docstrings in the stdlib, don't run Sphinx for the stdlib files or report a Sphinx bug. Even if you remove all `x', docstrings still will not be valid reStructuredText. For example, the last three lines will be joined in a single paragraph for the docstring in msg319848.

`name' does not have special meaning in (La)TeX. AFAIK `name' is common writing of quotes in English texts. It predates reStructuredText and was widely used in Usenet and mailing lists (as were two spaces after sentence-ending punctuation, a double hyphen for a dash, _n_a_m_e_ to simulate underscoring, etc). It is just a part of old computer-typing culture (or maybe even pre-computer).
msg319898 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2018-06-18 16:38
No, it is (somewhat) unique to tex.  If you write `word' tex would turn that into a pair of opposing quotes in the typeset document, as opposed to 'word' (the convention in regular text/emails/posts/etc) where you'd just see ascii quotes.  tex would render 'word' as a closing quote both before and after word, which looks weird in typeset text.

There's no bug here; as you say we aren't interested in making the docstrings parseable as restructured text (at least, I'm not).  For me, this is about getting rid of the now-odd-looking tex leftovers and making the ascii styling consistent with the bulk of our docstrings.

It's not a big deal, though.
msg319900 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-06-18 17:28
> AFAIK `name' is common writing of quotes in English texts

I don't remember ever seeing it before.  It looks like a typo to me, and I am sure it will to most readers.  I think it should be corrected as if it were a typo.
msg319901 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-06-18 18:05
Oh, you are right. I didn't write much English TeX, and used other types of quotes.

I'm sure that I have seen `such quotation' in non-TeX files, but maybe my memory fools me. In any case, using this writing here is likely an artifact of copying from TeX documentation.
msg319903 - (view) Author: Patrick Lehmann (Patrick Lehmann) * Date: 2018-06-18 18:18
Having single quotes in docstrings is also ok for Sphinx documentation.

Btw. ReStructured text (docutils) was invented to document Python sources, why is it not used by Python?
msg319907 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-06-18 19:37
RestructuredText, DocUtils, and Sphinx were developed independently, by people other than the pydev/cpython group.  (The proposal to include DocUtils in the stdlib was rejected.)  We converted to .rst for the Python documentation source files about a decade ago.  Sphinx turns them into the .html files you can see online.  RestructuredText markup can now also be used in PEP, news/changelog, and What's New sources.  These are things that most people only view online or in other processed forms, not in source form.

On the other hand, stdlib docstrings are mostly viewed unprocessed in .py sources or as output from help() in text consoles or text widgets.  So markup in stdlib docstrings would impact everyone, not just core developers.
msg319908 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-06-18 20:31
Marking this as closed.  Though well-intentioned, the suggestion is predicated on a misperception of the role of Sphinx for CPython and existing PEP recommendations on docstring practices.
msg319910 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2018-06-18 20:37
I would still like to see the legacy tex markup removed from the docstrings, so I think closing this issue is not appropriate, but it's not a big enough deal that I'll push for it if Raymond wants to keep it closed.
msg319914 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-06-18 21:04
I agree with David.  I would like the ugly markup changed independent of how Sphinx treats it.  I was thinking of changing the title to "Change obsolete tex markup in stdlib docstrings".  The intent of `word' was for people to see balanced quotes, which in ascii text means 'word'.
Date | User | Action | Args
2018-06-18 21:04:13 | terry.reedy | set | messages: + msg319914
2018-06-18 20:37:45 | r.david.murray | set | messages: + msg319910
2018-06-18 20:31:52 | rhettinger | set | status: open -> closed; nosy: + rhettinger; messages: + msg319908; resolution: not a bug; stage: patch review -> resolved
2018-06-18 19:37:03 | terry.reedy | set | messages: + msg319907
2018-06-18 18:18:39 | Patrick Lehmann | set | messages: + msg319903
2018-06-18 18:05:56 | serhiy.storchaka | set | messages: + msg319901
2018-06-18 17:28:07 | terry.reedy | set | messages: + msg319900
2018-06-18 16:38:34 | r.david.murray | set | messages: + msg319898
2018-06-18 16:29:59 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg319897
2018-06-18 15:55:05 | r.david.murray | set | messages: + msg319893
2018-06-18 15:39:04 | r.david.murray | set | messages: + msg319892
2018-06-18 01:52:45 | rhettinger | set | title: Sphinx incompatible markup in the standard library -> Sphinx incompatible markup in the standard library docstrings
2018-06-18 01:45:52 | terry.reedy | set | nosy: - rhettinger; messages: + msg319848
2018-06-18 01:39:28 | rhettinger | set | nosy: + rhettinger; messages: + msg319847
2018-06-18 01:08:30 | terry.reedy | set | messages: + msg319845
2018-06-18 00:41:34 | Patrick Lehmann | set | messages: + msg319839
2018-06-18 00:34:00 | terry.reedy | set | messages: + msg319838
2018-06-17 23:53:09 | r.david.murray | set | messages: + msg319837
2018-06-17 22:23:28 | Patrick Lehmann | set | messages: + msg319833
2016-12-28 23:19:12 | lukasz.langa | set | messages: + msg284206; title: Sphinx incompatible markup in configparser.ConfigParser. -> Sphinx incompatible markup in the standard library
2016-11-20 12:24:18 | serhiy.storchaka | set | nosy: + georg.brandl
2016-11-20 11:10:24 | Patrick Lehmann | set | messages: + msg281261
2016-11-19 03:39:44 | Patrick Lehmann | set | messages: + msg281198
2016-11-19 03:30:48 | Patrick Lehmann | set | messages: + msg281197
2016-11-18 20:36:58 | r.david.murray | set | messages: + msg281165
2016-11-18 20:16:48 | terry.reedy | set | nosy: + terry.reedy; messages: + msg281162; stage: needs patch -> patch review
2016-11-16 23:23:34 | Patrick Lehmann | set | files: + docstring_markup.patch; keywords: + patch; messages: + msg281010
2016-11-16 23:00:50 | r.david.murray | set | nosy: + r.david.murray, lukasz.langa; messages: + msg281007
2016-11-16 22:50:02 | Patrick Lehmann | set | messages: + msg281005
2016-11-16 06:53:54 | serhiy.storchaka | set | keywords: + easy; stage: needs patch; versions: + Python 2.7, Python 3.6, Python 3.7, - Python 3.4
2016-11-16 00:59:42 | Patrick Lehmann | create |