
classification
Title: IDLE: highlight soft keywords
Type: behavior Stage: resolved
Components: IDLE Versions: Python 3.11, Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: terry.reedy Nosy List: epaine, kj, miss-islington, rffontenelle, taleinat, terry.reedy
Priority: normal Keywords: patch

Created on 2021-05-02 15:18 by epaine, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 25851 merged taleinat, 2021-05-03 13:32
PR 26237 merged miss-islington, 2021-05-19 09:18
PR 29454 merged rffontenelle, 2021-11-06 22:58
Messages (14)
msg392705 - (view) Author: E. Paine (epaine) * Date: 2021-05-02 15:18
As per PEP 634, structural pattern matching is now in Python. This introduces the `match` and `case` keywords. IDLE does not highlight these.

The problem is that these are listed in `keyword.softkwlist` rather than `keyword.kwlist` (which is what IDLE uses). This confuses me, as this is not a __future__ feature and there is no discussion of it becoming one in #42128. There is also no discussion (that I could find) about which list it should be put in. The addition to softkwlist was done in PR-22917.

Do we change IDLE to use softkwlist, or move those keywords into kwlist?
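
For reference, the split between the two lists can be checked directly. This is only a minimal sketch: `classify` is a hypothetical helper, not part of the `keyword` module, and `keyword.softkwlist` exists only on Python 3.9 and later, hence the `getattr` fallback.

```python
import keyword


def classify(word):
    """Return 'hard', 'soft', or 'identifier' for a word (hypothetical helper)."""
    if word in keyword.kwlist:
        return "hard"
    # softkwlist was added in 3.9; fall back to an empty list on older versions.
    if word in getattr(keyword, "softkwlist", []):
        return "soft"
    return "identifier"


for word in ("for", "match", "case", "spam"):
    print(word, classify(word))
```

A highlighter that consults only `keyword.kwlist` never sees `match` or `case`, which would explain the missing highlight.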
msg392707 - (view) Author: Ken Jin (kj) * (Python committer) Date: 2021-05-02 15:53
Hi, I'm no IDLE expert, but I think moving the new soft keywords into kwlist seems wrong:

Soft keywords were added in Python 3.9 when the PEG parser became the default. The keyword module was also updated accordingly: https://docs.python.org/3/library/keyword.html#keyword.softkwlist

This link provides an explanation of how soft keywords differ from normal keywords:  https://docs.python.org/3.10/reference/lexical_analysis.html#soft-keywords
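
A quick way to see the difference, as a rough sketch: a soft keyword can still be bound as an ordinary name, while a hard keyword cannot. `is_assignable` is a hypothetical helper written for this illustration only.

```python
def is_assignable(name):
    """Return True if `name = 1` compiles, i.e. the name is a legal identifier."""
    try:
        compile(f"{name} = 1", "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


for name in ("match", "case", "_", "for"):
    print(name, is_assignable(name))
```

This is why moving `match` and `case` into `kwlist` would misdescribe them: code that uses them as plain variable names remains valid.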

Thanks
msg392708 - (view) Author: E. Paine (epaine) * Date: 2021-05-02 16:08
Thanks for linking to the Lexical Analysis docs. Not quite sure how I missed this given it is directly below the normal keywords section. Given the distinction described there, it may instead be best for IDLE to highlight this as its own category (i.e. not grouping it with the standard keywords).
msg392713 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-05-02 18:05
Soft keywords are a huge nuisance for syntax highlighting as they need special case regexes and tests.

Hard keywords are matched against complete words, regardless of whether the context is syntactically valid or not.  If 'for' and 'else' were the only keywords, the keyword part of the IDLE colorizer regex would be as follows.

>>> from idlelib import colorizer
>>> kw = r"\b" + colorizer.any("KEYWORD", ['for', 'else']) + r"\b"
>>> kw
'\\b(?P<KEYWORD>for|else)\\b'

Both words in 'for.else' are highlighted as the tokenizer will see them as keywords.  The parser will later see the combination as an error.
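
The whole-word behaviour can be tried without importing idlelib. In this sketch `any_group` is a stand-in written to mirror the output shown above (an assumption, not the idlelib function itself), and `keyword_hits` is a hypothetical helper.

```python
import re


def any_group(name, alternates):
    """Build a named group of alternatives (stand-in for colorizer.any)."""
    return "(?P<%s>" % name + "|".join(alternates) + ")"


kw = r"\b" + any_group("KEYWORD", ["for", "else"]) + r"\b"


def keyword_hits(text):
    """Return every whole-word keyword match, ignoring syntactic context."""
    return [m.group("KEYWORD") for m in re.finditer(kw, text)]


print(keyword_hits("for.else"))           # both words match
print(keyword_hits("format, elsewhere"))  # no whole-word match
```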

The tag name in a "(?P<name>...)" construct can only be used once in a regex.  Since the word-boundary context is the same for all hard keywords, the alternation can be done within one such context and all (hard) keywords get the same match tag (dict key "KEYWORD"), making it easy to give all the same highlight.
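
The one-use restriction on group names is easy to confirm. A small sketch, with `compiles` as a hypothetical helper:

```python
import re


def compiles(pattern):
    """Return True if the pattern is accepted by re.compile."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


# Reusing a group name in two alternatives is rejected.
print(compiles(r"(?P<KEYWORD>for)|(?P<KEYWORD>else)"))
# One group holding the whole alternation is fine.
print(compiles(r"(?P<KEYWORD>for|else)"))
```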

Soft keywords need different contexts to avoid false positives.  'match' and 'case' must be the first non-blank on a line and followed by ':'.  '_' must follow 'case' and space. I believe each context will have to have its own tag name, so multiple keyword tags must be mapped to 'keyword'.  

skw1 = r"^[ \t]*(?P<SKEY1>match|case)[ \t]+:"
skw2 = r"case[ \t]+(?P<SKEY2>_)\b"

Add skw1 and skw2 to the prog definition, which should use "|".join(...).
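
To make the idea concrete, here is one way the combined pattern might look. It is a sketch under assumptions, not the code that was eventually merged: the soft-keyword patterns below use a lookahead for "something, then a colon, on the same line" and a lookbehind for `case`, so that `case` and `_` can be reported as separate matches. `tagged_spans` is a hypothetical helper, and false positives such as `match = {1: 2}` remain.

```python
import re

kw = r"\b(?P<KEYWORD>for|else)\b"
# 'match' or 'case' first on the line, with a ':' later on the same line.
skw1 = r"^[ \t]*(?P<SKEY1>match|case)\b(?=[ \t]+[^\n]*:)"
# '_' directly after 'case' and whitespace.
skw2 = r"(?<=\bcase)[ \t]+(?P<SKEY2>_)\b"

prog = re.compile("|".join([kw, skw1, skw2]), re.MULTILINE)


def tagged_spans(text):
    """Return (tag, matched text) pairs in the order they occur."""
    spans = []
    for m in prog.finditer(text):
        for tag, value in m.groupdict().items():
            if value is not None:
                spans.append((tag, value))
    return spans


sample = "match x:\n    case _:\n        pass\n    case 1:\n        for i in y: pass\n"
for tag, value in tagged_spans(sample):
    print(tag, value)
```

Each context gets its own tag name, and the caller decides later which tags share a highlight.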

In ColorDelegator.LoadTagDefs (which should be renamed), replace

            "KEYWORD": idleConf.GetHighlight(theme, "keyword"),

with

            "KEYWORD": keydef,
            "SKEY1": keydef,
            "SKEY2": keydef,

after first defining keydef with

        keydef = idleConf.GetHighlight(theme, "keyword")
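
The mapping can be sketched without idlelib. Here `get_highlight` is a hypothetical stand-in for `idleConf.GetHighlight(theme, ...)`, the colour values are illustrative only, and `load_tag_defs` is a free function rather than the real method.

```python
def get_highlight(element):
    """Hypothetical stand-in for idleConf.GetHighlight(theme, element)."""
    # Illustrative colours only; real values come from the active IDLE theme.
    colours = {"keyword": "#ff7700", "comment": "#dd0000"}
    return {"foreground": colours[element], "background": "#ffffff"}


def load_tag_defs():
    """Map every keyword-like tag to one shared highlight definition."""
    keydef = get_highlight("keyword")
    return {
        "COMMENT": get_highlight("comment"),
        "KEYWORD": keydef,
        "SKEY1": keydef,
        "SKEY2": keydef,
    }


tagdefs = load_tag_defs()
print(tagdefs["SKEY1"] is tagdefs["KEYWORD"])
```

Looking the definition up once and reusing it keeps the three tags in step if the theme changes.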

Some new tests will be needed.
msg392788 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-03 10:14
Terry, Elisha, does one of you intend to work on this? If not, I'd be willing to.
msg392791 - (view) Author: E. Paine (epaine) * Date: 2021-05-03 10:25
I don't mind; would you like to, Tal? (I probably won't be able to dedicate any serious time to it until mid-June.) One thing I've been thinking about is whether it's worth highlighting soft keywords regardless of context. For example, you can assign a variable to a builtin name (not that it's recommended), so we could just give soft keywords their own colour and (unofficially) recommend that people don't use such words for variables.

I think this would be more future-proof as we wouldn't need to update the regexes for each new soft keyword added. However, we might not want to highlight every time the user has an '_' variable (as is fairly common).
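
A sketch of what that context-free variant might look like. `context_free_pattern` and `soft_hits` are hypothetical helpers; the word list is hard-coded so the result does not depend on the running Python version, though `keyword.softkwlist` could be passed instead.

```python
import re


def context_free_pattern(soft_keywords, skip=("_",)):
    """Match any listed soft keyword as a whole word, except skipped ones."""
    words = sorted(w for w in soft_keywords if w not in skip)
    return re.compile(r"\b(?P<SOFTKW>" + "|".join(map(re.escape, words)) + r")\b")


prog = context_free_pattern(["_", "case", "match"])


def soft_hits(text):
    return [m.group("SOFTKW") for m in prog.finditer(text)]


print(soft_hits("match x:"))                # the intended hit
print(soft_hits("match = re.match(p, s)"))  # both uses are highlighted too
print(soft_hits("_ = 1"))                   # '_' is skipped entirely
```

The second call shows the trade-off: no regex maintenance for new soft keywords, at the cost of highlighting ordinary names and attributes.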
msg392792 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-03 10:41
I think it is rather crucial to have this with the 3.10 release. I'll try to get this working ASAP.

I agree that a simple "good enough" solution could be a good start, but "_" will likely need special handling.
msg392793 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-05-03 10:42
My plan for the next day or two is to submit a follow-up issue for Shell and formally code what I wrote.

The only way to handle soft keywords correctly is with a custom re.  I don't expect them to become common.  They are different from builtins because they only have special meaning in (so far) definable situations.  When a builtin is redefined, it may or may not be appropriate to keep the highlight.  Examples when it is:

oldprint = print
def print(*args, **kwds):
    # Log the print call here.
    oldprint(*args, **kwds)

def intsum(nums, int=int):  # Localize int for speed.
    ...  # Code that calls int multiple times.
msg392794 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-05-03 10:49
I agree with getting this in soon.

A related request is to syntax highlight field expressions in f-strings.  I don't think there is an existing issue.  Apparently, at least some alternatives to IDLE do this.  I am not sure I would really want it, but we need at least some mockups.  Tal, what do you think, and are you interested in trying to write a PR?
msg392796 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-03 10:51
> A related request is to syntax highlight field expressions in f-strings.

Related, but separate, and IMO not quite as urgent.

I can commit to working on this issue (soft keywords), but I'll have to see where things stand once this is finished before moving on to f-strings.
msg392807 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-03 13:33
I've created a PR (GH-25851) with a rather quick, working implementation.

This includes some tests but I haven't thoroughly tested it yet.

If anyone can take a look and give feedback on the approach, that would be great.
msg393940 - (view) Author: miss-islington (miss-islington) Date: 2021-05-19 09:44
New changeset 3357604db966693b752cbd9ffc3ad0f40b970d31 by Miss Islington (bot) in branch '3.10':
bpo-44010: IDLE: colorize pattern-matching soft keywords (GH-25851)
https://github.com/python/cpython/commit/3357604db966693b752cbd9ffc3ad0f40b970d31
msg394366 - (view) Author: E. Paine (epaine) * Date: 2021-05-25 15:00
Can we close this, or are we leaving it open for when (if) we do a colouriser refactor?
msg394370 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2021-05-25 15:20
IMO this can be closed, so I'm closing this.  (Terry is welcome to reopen this if needed.)
History
Date User Action Args
2022-04-11 14:59:45  admin  set  github: 88176
2021-11-06 22:58:18  rffontenelle  set  nosy: + rffontenelle; pull_requests: + pull_request27709
2021-05-25 15:20:03  taleinat  set  status: open -> closed; resolution: fixed; messages: + msg394370; stage: patch review -> resolved
2021-05-25 15:00:37  epaine  set  messages: + msg394366
2021-05-19 09:44:17  miss-islington  set  messages: + msg393940
2021-05-19 09:18:18  miss-islington  set  nosy: + miss-islington; pull_requests: + pull_request24854
2021-05-03 13:33:25  taleinat  set  messages: + msg392807
2021-05-03 13:32:00  taleinat  set  keywords: + patch; stage: test needed -> patch review; pull_requests: + pull_request24534
2021-05-03 10:51:53  taleinat  set  messages: + msg392796
2021-05-03 10:49:55  terry.reedy  set  messages: + msg392794
2021-05-03 10:42:41  terry.reedy  set  messages: + msg392793
2021-05-03 10:41:08  taleinat  set  messages: + msg392792
2021-05-03 10:25:14  epaine  set  messages: + msg392791
2021-05-03 10:14:15  taleinat  set  messages: + msg392788
2021-05-02 18:05:45  terry.reedy  set  messages: + msg392713; stage: test needed
2021-05-02 16:08:49  epaine  set  messages: + msg392708; title: IDLE: highlight new `match` / `case` syntax -> IDLE: highlight soft keywords
2021-05-02 15:53:38  kj  set  nosy: + kj; messages: + msg392707
2021-05-02 15:18:46  epaine  create