Fix and update string/byte literals in help() #77603

andresdelfino · 2018-05-03T23:37:02Z

BPO	33422
Nosy	@ericvsmith, @serhiy-storchaka, @andresdelfino, @miss-islington
PRs
bpo-33422: Consider the special case of string/byte literals, and update the lis… #6701
[3.7] bpo-33422: Fix quotation marks getting deleted when looking up byte/string literals on pydoc. (GH-6701) #6709
[3.6] bpo-33422: Fix quotation marks getting deleted when looking up byte/string literals on pydoc. (GH-6701) #6710
[2.7] bpo-33422: Fix quotation marks getting deleted when looking up … #6712

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2018-05-07.05:47:57.267>
created_at = <Date 2018-05-03.23:37:01.734>
labels = ['3.7', '3.8', 'type-bug', 'library']
title = 'Fix and update string/byte literals in help()'
updated_at = <Date 2018-05-07.12:13:53.324>
user = 'https://github.com/andresdelfino'

bugs.python.org fields:

activity = <Date 2018-05-07.12:13:53.324>
actor = 'adelfino'
assignee = 'none'
closed = True
closed_date = <Date 2018-05-07.05:47:57.267>
closer = 'serhiy.storchaka'
components = ['Library (Lib)']
creation = <Date 2018-05-03.23:37:01.734>
creator = 'adelfino'
dependencies = []
files = []
hgrepos = []
issue_num = 33422
keywords = ['patch']
message_count = 16.0
messages = ['316148', '316149', '316162', '316164', '316175', '316176', '316177', '316178', '316179', '316182', '316217', '316218', '316219', '316253', '316254', '316262']
nosy_count = 4.0
nosy_names = ['eric.smith', 'serhiy.storchaka', 'adelfino', 'miss-islington']
pr_nums = ['6701', '6709', '6710', '6712']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue33422'
versions = ['Python 2.7', 'Python 3.6', 'Python 3.7', 'Python 3.8']

andresdelfino · 2018-05-03T23:37:02Z

Right now, for string/byte literals help() shows, for example:

No Python documentation found for 'r'.
Use help() to get the interactive help utility.
Use help(str) for help on the str class.

The PR fixes the quotation mark removal and updates the list with f and u literals, while also adding uppercase versions of all literals. While the list is still incomplete (e.g., fR and the others could be listed), I believe listing every combination would be too much.
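As an illustration of the symptom described above, here is a minimal sketch of how removing every quotation mark from a help request turns `r'` into `r`, next to a more careful cleanup that only strips a matching pair of quotes wrapping the whole request. Both function names are hypothetical and the code is an assumption for illustration only; it is not claimed to be pydoc's actual implementation.

```python
def clean_request_naive(request):
    # Hypothetical cleanup: delete every quotation mark.
    # "'keywords'" becomes "keywords", but "r'" collapses to "r".
    return request.replace('"', '').replace("'", '').strip()


def clean_request_careful(request):
    # Hypothetical alternative: strip quotes only when the same quote
    # character wraps the whole request and does not occur inside it.
    request = request.strip()
    if (len(request) > 2
            and request[0] == request[-1]
            and request[0] in ("'", '"')
            and request[0] not in request[1:-1]):
        request = request[1:-1]
    return request


print(clean_request_naive("r'"))            # r
print(clean_request_careful("r'"))          # r'
print(clean_request_careful("'keywords'"))  # keywords
```

With the careful variant, a trailing quote that is part of a literal prefix survives, while a request typed inside quotes is still unwrapped.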

andresdelfino · 2018-05-03T23:40:26Z

*While the list is still incomplete

serhiy-storchaka · 2018-05-04T10:04:28Z

Interesting, I didn't know that pydoc supports this.

Specifying all possible prefixes is cumbersome and error-prone. The number of combinations grows exponentially as new letters are added. I suggest either specifying only the lower-case variants and generating all variants with upper-case letters (as is done in the tokenize module), or always calling the lower() method when looking up in the symbols dictionary.

It may be worth adding a special topic for f-strings.

ericvsmith · 2018-05-04T10:43:14Z

It's not clear to me what you're typing to get the output in the first message. Can you clarify? Is this at the interactive prompt?

andresdelfino · 2018-05-04T13:33:24Z

Eric, I entered "r'" in the interactive prompt.

Serhiy, using the code in tokenize, I got a total of 144 combinations. For comparison, the list of symbols help() shows, after the proposed change, has 67 items.

IMHO, we should compromise. Maybe just mention the letters, with no quoting character? I do know that this can be confusing, since J is shown as a letter but it's used as a suffix, unlike b/f/r/u...

What do you think?

andresdelfino · 2018-05-04T13:37:35Z

To get the 144 combinations I used the logic in tokenize.py:

import re

def _combinations(*l):
    return set(
        x + y for x in l for y in l + ("",) if x.casefold() != y.casefold()
    )

_strprefixes = (
    _combinations('r', 'R', 'f', 'F') | _combinations('r', 'R', 'b', 'B') | {'u', 'U', 'ur', 'uR', 'Ur', 'UR'}
)

triple_quoted = (
    {"'''", '"""'} | {f"{prefix}'''" for prefix in _strprefixes} | {f'{prefix}"""' for prefix in _strprefixes}
)
single_quoted = (
    {"'", '"'} | {f"{prefix}'" for prefix in _strprefixes} | {f'{prefix}"' for prefix in _strprefixes}
)

all_combinations = _strprefixes | single_quoted | triple_quoted

print(' '.join(list(all_combinations)))

print(len(all_combinations))

serhiy-storchaka · 2018-05-04T15:25:20Z

I don't think we need to support prefixes without quotes or with triple quotes. 'ur' is not a valid prefix. Using simplified code from tokenize:

        import itertools

        # Every upper/lower-case spelling of each prefix, followed by a quote
        _strprefixes = [''.join(u) + q
                        for t in ('b', 'r', 'u', 'f', 'br', 'rb', 'fr', 'rf')
                        for u in itertools.product(*[(c, c.upper()) for c in t])
                        for q in ("'", '"')]

Or you can use tokenize._all_string_prefixes() directly:

        import tokenize

        _strprefixes = [p + q
                        for p in tokenize._all_string_prefixes()
                        for q in ("'", '"')]

But it may be simpler to just convert the string to lower case before looking it up in the symbols dict. Then

        _strprefixes = [p + q
                        for p in ('b', 'r', 'u', 'f', 'br', 'rb', 'fr', 'rf')
                        for q in ("'", '"')]
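To make the trade-off concrete, the sketch below builds every case variant with the itertools approach above and checks that folding to lower case reduces them to the short lower-case list. The names `BASE_PREFIXES`, `QUOTES`, and `normalize` are hypothetical, chosen only for this illustration; the lookup helper is an assumption and not pydoc's code.

```python
import itertools

BASE_PREFIXES = ('b', 'r', 'u', 'f', 'br', 'rb', 'fr', 'rf')
QUOTES = ("'", '"')

# Every upper/lower-case spelling of each prefix, followed by a quote.
all_case_variants = {
    ''.join(u) + q
    for t in BASE_PREFIXES
    for u in itertools.product(*[(c, c.upper()) for c in t])
    for q in QUOTES
}

# Only the lower-case spellings, as in the last snippet above.
lowercase_keys = {p + q for p in BASE_PREFIXES for q in QUOTES}


def normalize(symbol):
    """Hypothetical helper: fold case before a dictionary lookup."""
    return symbol.lower()


print(len(all_case_variants))  # 48
print(len(lowercase_keys))     # 16
print(all(normalize(s) in lowercase_keys for s in all_case_variants))  # True
```

Folding case at lookup time keeps the table at 16 entries while still accepting all 48 spellings.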

andresdelfino · 2018-05-04T15:41:06Z

And what should symbols show in pydoc?

Should symbols show:

1. All legal combinations with ("'", '"') (48 possible combinations)
2. Only b/f/r/u with ("'", '"') (IMHO, this is the most reasonable option)
3. Only b/f/r/u with ' or "

Depending on that, we can choose one of the options you mentioned.

serhiy-storchaka · 2018-05-04T16:06:50Z

Option 2 LGTM.

andresdelfino · 2018-05-04T17:57:02Z

I have updated the PR. Now symbols show:

Here is a list of the punctuation symbols which Python assigns special meaning
to. Enter any symbol to get more help.

!=      +       <=      __
"       +=      <>      `
"""     ,       ==      b"
%       -       >       b'
%=      -=      >=      f"
&       .       >>      f'
&=      ...     >>=     j
'       /       @       r"
'''     //      J       r'
(       //=     [       u"
)       /=      \       u'
*       :       ]       |
**      <       ^       |=
**=     <<      ^=      ~
*=      <<=     _

I don't understand how topics.py gets autogenerated by Sphinx (by the way, it says it was generated in January 2018), so I'm having trouble using a more specific topic for f'.

serhiy-storchaka · 2018-05-05T16:07:35Z

New changeset b2043bb by Serhiy Storchaka (Andrés Delfino) in branch 'master':
bpo-33422: Fix quotation marks getting deleted when looking up byte/string literals on pydoc. (GH-6701)
b2043bb

miss-islington · 2018-05-05T16:42:59Z

New changeset 351782b by Miss Islington (bot) in branch '3.7':
bpo-33422: Fix quotation marks getting deleted when looking up byte/string literals on pydoc. (GH-6701)
351782b

miss-islington · 2018-05-05T17:12:21Z

New changeset 0ba812b by Miss Islington (bot) in branch '3.6':
bpo-33422: Fix quotation marks getting deleted when looking up byte/string literals on pydoc. (GH-6701)
0ba812b

serhiy-storchaka · 2018-05-07T05:44:08Z

New changeset c40eeeb by Serhiy Storchaka (Andrés Delfino) in branch '2.7':
[2.7] bpo-33422: Fix quotation marks getting deleted when looking up byte/string literals on pydoc. (GH-6701) (GH-6712)
c40eeeb

serhiy-storchaka · 2018-05-07T05:47:56Z

Thank you for your contribution Andrés!

I'm just wondering, how did you discover this bug?

andresdelfino · 2018-05-07T12:13:53Z

I was exploring pydoc, saw the literals listed as available help "terms" in the symbols section, and tried reading the help for one of them.

I learnt about IDLE debugging to trace this bug :)

andresdelfino added the 3.8 (only security fixes) and stdlib (Python modules in the Lib dir) labels May 3, 2018

serhiy-storchaka added the 3.7 (EOL) end of life label May 7, 2018

serhiy-storchaka closed this as completed May 7, 2018

serhiy-storchaka added the type-bug (An unexpected behavior, bug, or error) label May 7, 2018

ezio-melotti transferred this issue from another repository Apr 10, 2022
