Author terry.reedy
Recipients serhiy.storchaka, terry.reedy
Date 2018-04-30.07:06:37
Message-id <1525071997.99.0.682650639539.issue21474@psf.upfronthosting.co.za>
Content
Now-closed duplicate #33386 reported that μ (0x3bc) is not selected as part of an identifier when double-clicking.  This prompted some research.

The 'Windows' style imitates the behavior of Command Prompt, which I presume is a carryover from DOS days.  PowerShell stuck with it, but Notepad, Notepad++, Microsoft Word, Firefox, Thunderbird, and ??? have not.  I think Tcl should have switched long ago.  In any case, I will go with whatever the Tcl re engine defines as word chars, the 'Motif' style, rather than attempt to write a giant re, which would have to change as characters are added.

Do we still need this line in fixwordbreaks?
    tk.call('tcl_wordBreakAfter', 'a b', 0) # make sure word.tcl is loaded
I will leave it until you say we don't.

After patching, 'abcμμμdef' (0x3bc repeated 3 times in the middle) is selected as one word instead of three pieces (word, nonword, word).  'abc+efg' is still selected in 3 pieces, instead of the 1 word seen by Command Prompt.
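
To illustrate the difference, here is a minimal sketch that uses Python's re module as a stand-in for the Tcl re engine (an assumption; the two engines are not guaranteed to agree on every character).  The helper names `pieces` and `ascii_pieces` are made up for this example, and the ASCII-only pattern is only an assumed approximation of a hand-written word-char class, not necessarily what IDLE used.

```python
import re

# Assumption: Python's re stands in for the Tcl re engine in this sketch.
# Unicode-aware \w, analogous to letting the engine define word chars.
UNICODE_RUNS = re.compile(r'\w+|\W+')
# ASCII-only \w, approximating a hand-written [A-Za-z0-9_] class.
ASCII_RUNS = re.compile(r'\w+|\W+', re.ASCII)

def pieces(text):
    """Split text into maximal runs of word or non-word characters."""
    return UNICODE_RUNS.findall(text)

def ascii_pieces(text):
    """Same split, but only ASCII letters, digits and '_' count as word chars."""
    return ASCII_RUNS.findall(text)

mu = chr(0x3bc)                    # GREEK SMALL LETTER MU
sample = 'abc' + mu * 3 + 'def'

print(ascii_pieces(sample))        # word, nonword, word
print(pieces(sample))              # a single word
print(pieces('abc+efg'))           # still three pieces
```

With the Unicode-aware class, newly added letters are picked up by the engine without editing the pattern, which is the reason for not writing a giant re.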
History
Date                 User         Action  Args
2018-04-30 07:06:38  terry.reedy  set     recipients: + terry.reedy, serhiy.storchaka
2018-04-30 07:06:37  terry.reedy  set     messageid: <1525071997.99.0.682650639539.issue21474@psf.upfronthosting.co.za>
2018-04-30 07:06:37  terry.reedy  link    issue21474 messages
2018-04-30 07:06:37  terry.reedy  create