classification
Title: Fix and update string/byte literals in help()
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.8, Python 3.7, Python 3.6, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: adelfino, eric.smith, miss-islington, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2018-05-03 23:37 by adelfino, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 6701 merged adelfino, 2018-05-03 23:37
PR 6709 merged miss-islington, 2018-05-05 16:07
PR 6710 merged miss-islington, 2018-05-05 16:08
PR 6712 merged adelfino, 2018-05-05 18:39
Messages (16)
msg316148 - (view) Author: Andrés Delfino (adelfino) * (Python triager) Date: 2018-05-03 23:37
Right now, for string/byte literals help() shows, for example:

No Python documentation found for 'r'.
Use help() to get the interactive help utility.
Use help(str) for help on the str class.

The PR fixes the quotation mark removal and updates the list with f and u literals, while also adding uppercase versions of all literals. While the list is still incomplete (e.g., fR and the others could be listed), I believe listing every combination would be too much.
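
To illustrate the quotation mark removal described above, here is a minimal sketch. The function names clean_all_quotes and clean_outer_quotes are hypothetical and are not taken from pydoc; they only contrast removing every quote character with removing a single matching outer pair.

```python
def clean_all_quotes(request):
    """Hypothetical cleaner that drops every quote character."""
    return request.replace('"', '').replace("'", '').strip()


def clean_outer_quotes(request):
    """Hypothetical cleaner that drops only one matching outer pair of quotes."""
    request = request.strip()
    if (len(request) > 2
            and request[0] == request[-1]
            and request[0] in ("'", '"')
            and request[0] not in request[1:-1]):
        request = request[1:-1]
    return request


print(repr(clean_all_quotes("r'")))       # the trailing quote is lost
print(repr(clean_outer_quotes("r'")))     # the literal prefix survives intact
print(repr(clean_outer_quotes("'for'")))  # a quoted keyword is still unquoted
```

With the first cleaner, a request such as r' collapses to r, which would explain the "No Python documentation found for 'r'" output quoted above.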
msg316149 - (view) Author: Andrés Delfino (adelfino) * (Python triager) Date: 2018-05-03 23:40
*While the list is still incomplete
msg316162 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-04 10:04
Interesting, I didn't know that pydoc supports this.

Specifying all possible prefixes is cumbersome and error-prone. The number of combinations grows exponentially as new letters are added. I suggest either specifying only the lower-case variants and generating all variants with upper-case letters (as is done in the tokenize module), or always calling the lower() method when looking up in the symbols dictionary.

It may be worth adding a special topic for f-strings.
msg316164 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-05-04 10:43
It's not clear to me what you're typing to get the output in the first message. Can you clarify? Is this at the interactive prompt?
msg316175 - (view) Author: Andrés Delfino (adelfino) * (Python triager) Date: 2018-05-04 13:33
Eric, I entered "r'" in the interactive prompt.

Serhiy, using the code in tokenize, I got a total of 144 combinations. For comparison, the list of symbols help() shows, after the proposed change, has 67 items.

IMHO, we should compromise. Maybe just mentioning the letters? With no quoting character. I do know that this can be confusing, since J is shown as a letter but it's used as a suffix unlike b/f/r/u...

What do you think?
msg316176 - (view) Author: Andrés Delfino (adelfino) * (Python triager) Date: 2018-05-04 13:37
To get the 144 combinations I used the logic in tokenize.py:

def _combinations(*l):
    # Pair each letter with every other letter (or with nothing),
    # skipping pairs that repeat the same letter in a different case.
    return set(
        x + y for x in l for y in l + ("",) if x.casefold() != y.casefold()
    )

# Prefix set used for this count (it includes the 'ur' variants).
_strprefixes = (
    _combinations('r', 'R', 'f', 'F')
    | _combinations('r', 'R', 'b', 'B')
    | {'u', 'U', 'ur', 'uR', 'Ur', 'UR'}
)

triple_quoted = (
    {"'''", '"""'}
    | {f"{prefix}'''" for prefix in _strprefixes}
    | {f'{prefix}"""' for prefix in _strprefixes}
)
single_quoted = (
    {"'", '"'}
    | {f"{prefix}'" for prefix in _strprefixes}
    | {f'{prefix}"' for prefix in _strprefixes}
)

# Bare prefixes plus their single- and triple-quoted forms.
all_combinations = _strprefixes | single_quoted | triple_quoted

print(' '.join(all_combinations))

print(len(all_combinations))
msg316177 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-04 15:25
I don't think we need to support prefixes without quotes or with triple quotes. 'ur' is not a valid prefix. Using simplified code from tokenize:

        import itertools

        _strprefixes = [''.join(u) + q
                        for t in ('b', 'r', 'u', 'f', 'br', 'rb', 'fr', 'rf')
                        for u in itertools.product(*[(c, c.upper()) for c in t])
                        for q in ("'", '"')]

Or you can use tokenize._all_string_prefixes() directly:

        import tokenize

        _strprefixes = [p + q
                        for p in tokenize._all_string_prefixes()
                        for q in ("'", '"')]

But it may be simpler to just convert the string to lower case before looking it up in the symbols dict. Then

        _strprefixes = [p + q
                        for p in ('b', 'r', 'u', 'f', 'br', 'rb', 'fr', 'rf')
                        for q in ("'", '"')]
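
As a rough sketch of the lower-case lookup idea above: the names SYMBOLS and lookup, and the 'STRINGS' topic value, are placeholders chosen for illustration and are not pydoc's actual data structures.

```python
# Hypothetical miniature of a symbols table keyed by lower-case prefix plus quote.
STRING_PREFIXES = [p + q
                   for p in ('b', 'r', 'u', 'f', 'br', 'rb', 'fr', 'rf')
                   for q in ("'", '"')]
SYMBOLS = {key: 'STRINGS' for key in STRING_PREFIXES}


def lookup(request):
    """Return the topic for a request, ignoring the case of the prefix."""
    return SYMBOLS.get(request.lower())


print(len(STRING_PREFIXES))  # 16 keys are enough to cover every case variant
print(lookup("Rb'"))         # mixed-case prefix resolves to the same topic
print(lookup('F"'))
print(lookup("ur'"))         # 'ur' is not in the table, so this prints None
```

The table stays at 16 entries because case is normalized at lookup time rather than enumerated up front.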
msg316178 - (view) Author: Andrés Delfino (adelfino) * (Python triager) Date: 2018-05-04 15:41
And what should symbols show in pydoc?

Should symbols show:

1. All legal combinations with ("'", '"') (48 possible combinations)
2. Only b/f/r/u with ("'", '"') (IMHO, this is the most reasonable option) 
3. Only b/f/r/u with ' or "

Depending on that, we can choose one of the options you mentioned.
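
For reference, the sizes of options 1 and 2 can be checked with a short sketch. The variable names are illustrative only, and the prefix list is the one given in the previous message.

```python
import itertools

QUOTES = ("'", '"')
BASE_PREFIXES = ('b', 'r', 'u', 'f', 'br', 'rb', 'fr', 'rf')

# Option 1: every upper/lower-case variant of every legal prefix, with each quote.
option_1 = {''.join(variant) + quote
            for prefix in BASE_PREFIXES
            for variant in itertools.product(*[(c, c.upper()) for c in prefix])
            for quote in QUOTES}

# Option 2: only the single-letter prefixes b/f/r/u, with each quote.
option_2 = {prefix + quote for prefix in 'bfru' for quote in QUOTES}

print(len(option_1))  # 48
print(len(option_2))  # 8
print(sorted(option_2))
```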
msg316179 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-04 16:06
Option 2 LGTM.
msg316182 - (view) Author: Andrés Delfino (adelfino) * (Python triager) Date: 2018-05-04 17:57
I have updated the PR. Now symbols show:

Here is a list of the punctuation symbols which Python assigns special meaning
to. Enter any symbol to get more help.

!=                  +                   <=                  __
"                   +=                  <>                  `
"""                 ,                   ==                  b"
%                   -                   >                   b'
%=                  -=                  >=                  f"
&                   .                   >>                  f'
&=                  ...                 >>=                 j
'                   /                   @                   r"
'''                 //                  J                   r'
(                   //=                 [                   u"
)                   /=                  \                   u'
*                   :                   ]                   |
**                  <                   ^                   |=
**=                 <<                  ^=                  ~
*=                  <<=                 _                   

I don't understand how topics.py gets autogenerated by Sphinx (by the way, it says it was generated in January 2018), so I'm having trouble using a more specific topic for f'.
msg316217 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-05 16:07
New changeset b2043bbe6034b53f5ad337887f4741b74b70b00d by Serhiy Storchaka (Andrés Delfino) in branch 'master':
bpo-33422: Fix quotation marks getting deleted when looking up byte/string literals on pydoc. (GH-6701)
https://github.com/python/cpython/commit/b2043bbe6034b53f5ad337887f4741b74b70b00d
msg316218 - (view) Author: miss-islington (miss-islington) Date: 2018-05-05 16:42
New changeset 351782b9927c610ff531100dbdcbbd19d91940a3 by Miss Islington (bot) in branch '3.7':
bpo-33422: Fix quotation marks getting deleted when looking up byte/string literals on pydoc. (GH-6701)
https://github.com/python/cpython/commit/351782b9927c610ff531100dbdcbbd19d91940a3
msg316219 - (view) Author: miss-islington (miss-islington) Date: 2018-05-05 17:12
New changeset 0ba812b1bee65a6cad16f153a7f5074bc271e0e5 by Miss Islington (bot) in branch '3.6':
bpo-33422: Fix quotation marks getting deleted when looking up byte/string literals on pydoc. (GH-6701)
https://github.com/python/cpython/commit/0ba812b1bee65a6cad16f153a7f5074bc271e0e5
msg316253 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-07 05:44
New changeset c40eeeb5e69df12a5f46edc7ba82ec75c7d1b820 by Serhiy Storchaka (Andrés Delfino) in branch '2.7':
[2.7] bpo-33422: Fix quotation marks getting deleted when looking up byte/string literals on pydoc. (GH-6701) (GH-6712)
https://github.com/python/cpython/commit/c40eeeb5e69df12a5f46edc7ba82ec75c7d1b820
msg316254 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-07 05:47
Thank you for your contribution Andrés!

I'm just wondering, how did you discover this bug?
msg316262 - (view) Author: Andrés Delfino (adelfino) * (Python triager) Date: 2018-05-07 12:13
I was exploring pydoc, saw the literals listed as available help "terms" in the symbols section, and tried reading the help for one of them.

I learnt about IDLE debugging to trace this bug :)
History
Date User Action Args
2022-04-11 14:59:00 admin set github: 77603
2018-05-07 12:13:53 adelfino set messages: + msg316262
2018-05-07 05:47:57 serhiy.storchaka set status: open -> closed
versions: + Python 2.7, Python 3.6, Python 3.7
type: behavior
messages: + msg316254
resolution: fixed
stage: patch review -> resolved
2018-05-07 05:44:08 serhiy.storchaka set messages: + msg316253
2018-05-05 18:39:47 adelfino set pull_requests: + pull_request6405
2018-05-05 17:12:21 miss-islington set messages: + msg316219
2018-05-05 16:42:59 miss-islington set nosy: + miss-islington
messages: + msg316218
2018-05-05 16:08:42 miss-islington set pull_requests: + pull_request6403
2018-05-05 16:07:45 miss-islington set pull_requests: + pull_request6402
2018-05-05 16:07:34 serhiy.storchaka set messages: + msg316217
2018-05-04 17:57:02 adelfino set messages: + msg316182
2018-05-04 16:06:50 serhiy.storchaka set messages: + msg316179
2018-05-04 15:41:06 adelfino set messages: + msg316178
2018-05-04 15:25:19 serhiy.storchaka set messages: + msg316177
2018-05-04 13:37:35 adelfino set messages: + msg316176
2018-05-04 13:33:24 adelfino set messages: + msg316175
2018-05-04 10:43:14 eric.smith set messages: + msg316164
2018-05-04 10:04:28 serhiy.storchaka set nosy: + eric.smith, serhiy.storchaka
messages: + msg316162
2018-05-03 23:40:26 adelfino set messages: + msg316149
2018-05-03 23:37:49 adelfino set keywords: + patch
stage: patch review
pull_requests: + pull_request6393
2018-05-03 23:37:01 adelfino create