Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IDLE: highlight soft keywords #88176

Closed
E-Paine mannequin opened this issue May 2, 2021 · 14 comments
Closed

IDLE: highlight soft keywords #88176

E-Paine mannequin opened this issue May 2, 2021 · 14 comments
Assignees
Labels
3.10 only security fixes 3.11 only security fixes topic-IDLE type-bug An unexpected behavior, bug, or error

Comments

@E-Paine
Copy link
Mannequin

E-Paine mannequin commented May 2, 2021

BPO 44010
Nosy @terryjreedy, @taleinat, @miss-islington, @rffontenelle, @E-Paine, @Fidget-Spinner
PRs
  • bpo-44010: IDLE: colorize pattern-matching soft keywords #25851
  • [3.10] bpo-44010: IDLE: colorize pattern-matching soft keywords (GH-25851) #26237
  • doc: Add link issue 44010 #29454
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/terryjreedy'
    closed_at = <Date 2021-05-25.15:20:03.526>
    created_at = <Date 2021-05-02.15:18:46.419>
    labels = ['expert-IDLE', 'type-bug', '3.10', '3.11']
    title = 'IDLE: highlight soft keywords'
    updated_at = <Date 2021-11-06.22:58:18.754>
    user = 'https://github.com/E-Paine'

    bugs.python.org fields:

    activity = <Date 2021-11-06.22:58:18.754>
    actor = 'rffontenelle'
    assignee = 'terry.reedy'
    closed = True
    closed_date = <Date 2021-05-25.15:20:03.526>
    closer = 'taleinat'
    components = ['IDLE']
    creation = <Date 2021-05-02.15:18:46.419>
    creator = 'epaine'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44010
    keywords = ['patch']
    message_count = 14.0
    messages = ['392705', '392707', '392708', '392713', '392788', '392791', '392792', '392793', '392794', '392796', '392807', '393940', '394366', '394370']
    nosy_count = 6.0
    nosy_names = ['terry.reedy', 'taleinat', 'miss-islington', 'rffontenelle', 'epaine', 'kj']
    pr_nums = ['25851', '26237', '29454']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue44010'
    versions = ['Python 3.10', 'Python 3.11']

    @E-Paine
    Copy link
    Mannequin Author

    E-Paine mannequin commented May 2, 2021

    As-per PEP-634, structural pattern matching is now in Python. This introduces the match and case keywords. IDLE does not highlight these.

    The problem is that these are listed in keyword.softkwlist rather than keyword.kwlist (which is what IDLE uses). This confuses me, as this is not a future feature and there is no discussion of it becoming one in bpo-42128. There is also no discussion (that I could find) about which list it should be put in. The addition to softkwlist was done in PR-22917.

    Do we change IDLE to use softkwlist, or move those keywords into kwlist?

    @E-Paine E-Paine mannequin added 3.10 only security fixes 3.11 only security fixes labels May 2, 2021
    @E-Paine E-Paine mannequin assigned terryjreedy May 2, 2021
    @E-Paine E-Paine mannequin added topic-IDLE type-bug An unexpected behavior, bug, or error labels May 2, 2021
    @Fidget-Spinner
    Copy link
    Member

    Hi, I'm no IDLE expert, but I think moving the new soft keywords into kwlist seems wrong:

    Soft keywords were added in Python 3.9 when the PEG parser became the default. The keyword list was also updated accordingly https://docs.python.org/3/library/keyword.html#keyword.softkwlist.

    This link provides an explanation of how soft keywords differ from normal keywords: https://docs.python.org/3.10/reference/lexical_analysis.html#soft-keywords

    Thanks

    @E-Paine
    Copy link
    Mannequin Author

    E-Paine mannequin commented May 2, 2021

    Thanks for linking to the Lexical Analysis docs. Not quite sure how I missed this given it is directly below the normal keywords section. Given the distinction described there, it may instead be best for IDLE to highlight this as its own category (i.e. not grouping it with the standard keywords).

    @E-Paine E-Paine mannequin changed the title IDLE: highlight new match / case syntax IDLE: highlight soft keywords May 2, 2021
    @terryjreedy
    Copy link
    Member

    Soft keywords are a huge nuisance for syntax highlighting as they need special case regexes and tests.

    Hard keywords are matched against complete words, regardless of whether the context is syntactically valid or not. If 'for' and 'else' were the only keywords, the keyword part of the IDLE colorizer regex would be as follows.

    >>> kw = r"\b" + colorizer.any("KEYWORD", ['for', 'else']) + r"\b"
    >>> kw
    '\\b(?P<KEYWORD>for|else)\\b'

    Both words in 'for.else' are highlighted as the tokenizer will see them as keywords. The parser will later see the combination as an error.

    The tag name in a "(?P<name>...) construct can only be used once in a regex. Since the word-boundary context is the same for all hard keywords, the alternation can be done within one such context and all (hard) keywords get the same match tag (dict key "KEYWORD"), making it easy to give all the same highlight.

    Soft keywords need different contexts to avoid false positives. 'match' and 'case' must be the first non-blank on a line and followed by ':'. '_' must follow 'case' and space. I believe each context will have to have its own tag name, so multiple keyword tags must be mapped to 'keyword'.

    skw1 = r"^[ \t]*(?P<SKEY1>match|case)[ \t]+:"
    skw2 = r"case[ \t]+(?P<SKEY1>_)\b"

    Add skw1 and skw2 to the prog definition, which should use "|".join(...).

    In ColorDelegator.LoadTagDefs (which should be renamed), replace

            "KEYWORD": idleConf.GetHighlight(theme, "keyword"),
    
    with
                "KEYWORD": keydef
                "SKEY1": keydef
                "SKEY2": keydef

    after first defining keydef with

            keydef = idleConf.GetHighlight(theme, "keyword")

    Some new tests will be needed.

    @taleinat
    Copy link
    Contributor

    taleinat commented May 3, 2021

    Terry, Elisha, does one of you intend to work on this? If not, I'd be willing to.

    @E-Paine
    Copy link
    Mannequin Author

    E-Paine mannequin commented May 3, 2021

    I don't mind, would you like to Tal? (I probably won't be able to dedicate any serious time to it until mid-June). One thing I've been thinking is whether it's worth us highlighting regardless of context. For example, you can assign a variable to a builtin name (not that it's recommended) so we could just give soft keywords their own colour and (unofficially) recommend people don't use such words for variables.

    I think this would be more future-proof as we wouldn't need to update the regexes for each new soft keyword added. However, we might not want to highlight every time the user has an '_' variable (as is fairly common).

    @taleinat
    Copy link
    Contributor

    taleinat commented May 3, 2021

    I think it is rather crucial to have this with the 3.10 release. I'll try to get this working ASAP.

    I agree that a simple "good enough" solution could be a good start, but "_" will likely need special handling.

    @terryjreedy
    Copy link
    Member

    My plan for the next day or two is to submit followup issue for Shell and formally code what I wrote.

    The only way to handle soft keywords correctly is with a custom re. I don't expect them to become common. They are different from builtins because they only have special meaning in (so far) definable situations. When builtin is 'redefined, it may or may not be appropriate to keep the highlight. Examples when it is:

    oldprint = print
    def print(*args, **kwds:
        log the print
        oldprint(*args, **kwds)
    
    def intsum(nums, int=int):  # Localize int for speed.
        <code that calls int multiple times>

    @terryjreedy
    Copy link
    Member

    I agree with getting this in soon.

    A related request is to to syntax highlight field expressions in f strings. I don't think there is an existing issue. Apparently, at least some alternatives to IDLE do this. I am not sure I would really want it, but we need at least some mockups. Tal, what do you think and are you interested in trying to write a PR?

    @taleinat
    Copy link
    Contributor

    taleinat commented May 3, 2021

    A related request is to to syntax highlight field expressions in f strings.

    Related, but separate, and IMO not quite as urgent.

    I can commit to working on this issue (soft keywords), but I'll have to see where things stand once this is finished before moving on to f-strings.

    @taleinat
    Copy link
    Contributor

    taleinat commented May 3, 2021

    I've created a PR (GH-25851) with a rather quick, working implementation.

    This includes some tests but I haven't thoroughly tested it yet.

    If anyone can take a look and give feedback on the approach, that would be great.

    @miss-islington
    Copy link
    Contributor

    New changeset 3357604 by Miss Islington (bot) in branch '3.10':
    bpo-44010: IDLE: colorize pattern-matching soft keywords (GH-25851)
    3357604

    @E-Paine
    Copy link
    Mannequin Author

    E-Paine mannequin commented May 25, 2021

    Can we close this, or are we leaving it open for when (if) we do a colouriser refactor?

    @taleinat
    Copy link
    Contributor

    IMO this can be closed, so I'm closing this. (Terry is welcome to reopen this if needed.)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes 3.11 only security fixes topic-IDLE type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants