IDLE menu option to convert non-ascii quotes & other? #80400

Open
rhettinger opened this issue Mar 7, 2019 · 5 comments
Labels
3.8 only security fixes topic-IDLE type-feature A feature request or enhancement

@rhettinger
Contributor

BPO 36219
Nosy @rhettinger, @terryjreedy, @serhiy-storchaka, @csabella

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = 'https://github.com/terryjreedy'
closed_at = None
created_at = <Date 2019-03-07.00:10:03.550>
labels = ['expert-IDLE', 'type-feature', '3.8']
title = 'IDLE menu option to convert non-ascii quotes & other?'
updated_at = <Date 2019-03-09.01:27:51.537>
user = 'https://github.com/rhettinger'

bugs.python.org fields:

activity = <Date 2019-03-09.01:27:51.537>
actor = 'rhettinger'
assignee = 'terry.reedy'
closed = False
closed_date = None
closer = None
components = ['IDLE']
creation = <Date 2019-03-07.00:10:03.550>
creator = 'rhettinger'
dependencies = []
files = []
hgrepos = []
issue_num = 36219
keywords = []
message_count = 5.0
messages = ['337350', '337358', '337429', '337549', '337550']
nosy_count = 4.0
nosy_names = ['rhettinger', 'terry.reedy', 'serhiy.storchaka', 'cheryl.sabella']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'test needed'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue36219'
versions = ['Python 3.8']

@rhettinger
Contributor Author

Some of my students routinely have to copy code samples from PDF documents in which the ASCII quotation marks that Python accepts have been replaced by smart quotes. Let's add an Edit menu option to fix smart quotes.

@rhettinger rhettinger added the 3.8 only security fixes label Mar 7, 2019
@serhiy-storchaka
Member

Also dashes and hyphens to minuses and non-breaking spaces to normal spaces.

@csabella
Contributor

csabella commented Mar 7, 2019

Would it be worthwhile to automatically convert the text when it's being pasted or would there be a scenario where it would be desirable to keep these characters in the text? It seems the point here is that the user wouldn't even realize that the quotes (or dashes) being copied aren't the right ones and they would have to learn to take the extra step of formatting the text. That seems annoying, so maybe automatic conversion would eliminate that?

For the menu option route, the editor has an additional 'Format' menu with some text manipulation options, but the Shell doesn't have this menu available. There aren't any formatting options on the 'Edit' menu currently. Would it be better to add a 'Format' menu to the Shell or have this on the 'Edit' menu (which is already getting long)?

For the actual text conversion, I pasted some smart quotes on Windows and they came through as \u2018\u2018 (two single left quotation marks) and \u2019\u2019 (two single right quotation marks) instead of \u201C (double left) and \u201D (double right). \u0060 (grave accent) and \u00B4 (acute accent) also seem to be possible values that are used for quotes, although converting them automatically may be more problematic.

I think for starters the idea would be:
text = text.replace('\u2018\u2018', '"')
text = text.replace('\u2019\u2019', '"')
text = text.replace('\u2018', "'")
text = text.replace('\u2019', "'")
text = text.replace('\u201C', '"')
text = text.replace('\u201D', '"')
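
The replacement list above could be wrapped in a single function, roughly as follows. This is only a sketch: `fix_smart_quotes` and `SMART_QUOTES` are hypothetical names for illustration, not existing IDLE code, and treating a doubled single quote as one ASCII double quote is an assumption based on the Windows paste behavior described above.

```python
# Hypothetical helper; the names and the table are illustrative, not IDLE code.
SMART_QUOTES = {
    '\u2018': "'",  # left single quotation mark
    '\u2019': "'",  # right single quotation mark
    '\u201C': '"',  # left double quotation mark
    '\u201D': '"',  # right double quotation mark
}

def fix_smart_quotes(text):
    """Return text with smart quotes replaced by ASCII quotes."""
    # Assumption: a doubled single smart quote stands for one double quote,
    # so these pairs must be handled before the single-character mapping.
    for pair in ('\u2018\u2018', '\u2019\u2019'):
        text = text.replace(pair, '"')
    return text.translate(str.maketrans(SMART_QUOTES))
```

Doing the two-character replacements first matters; otherwise each doubled pair would become two ASCII apostrophes.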

The dash may be more complicated since there are more of them. Unless the category could be used.
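
If the Unicode category were used, one possible shape is sketched below. It assumes that every non-ASCII character in category `Pd` (dash punctuation) should become an ASCII hyphen-minus and every non-ASCII character in `Zs` (space separator) should become a plain space; `normalize_dashes_and_spaces` is a hypothetical name, and whether this blanket mapping is desirable is exactly the open question.

```python
import unicodedata

def normalize_dashes_and_spaces(text):
    """Map non-ASCII dashes to '-' and non-ASCII spaces to ' ' (sketch)."""
    out = []
    for ch in text:
        if ch.isascii():
            out.append(ch)          # never touch ASCII characters
            continue
        category = unicodedata.category(ch)
        if category == 'Pd':        # dash punctuation: en dash, em dash, ...
            out.append('-')
        elif category == 'Zs':      # space separators: no-break space, ...
            out.append(' ')
        else:
            out.append(ch)          # leave everything else alone
    return ''.join(out)
```

One caveat with the category approach: U+2212 MINUS SIGN is in category `Sm` (math symbol), not `Pd`, so it would need an explicit entry if it should be converted too.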

@terryjreedy
Member

I support adding a new function, with these notes.

  1. Let's limit the scope to actual reversible bugs introduced by 3rd party software we care about. Let's not try to anticipate every possible issue. Also, once we have a function to replace some unicode chars, I can imagine users requesting replacement of other unicode chars, such as replacing the X-like math multiplication symbol with '*'. I am pretty sure that encouraging intentional unicode extensions would not pass core-dev review. ;-)

Raymond, do users encounter all of the characters and combinations Cheryl suggested? Serhiy, do you know if real pdfs make the other changes you pointed at? Can you provide or suggest a specific test string?

  2. I want to put the new feature on the Format menu. A) The Edit menu is already overly long, and B) the other items on Format already do various selection or whole-text fixups (inserts, replacements, and deletions). Possible menu entry: 'Replace non-ascii chars'. This is 23 chars; the current longest entry is 25. A 'hotkey' is not needed for something so rarely used. (Some of the other items on Format don't need them either.)

I think including Format on the Shell menu, with a subset of entries active, should be a follow-up issue. Another possible follow-up is to check pasted or opened text and offer to edit if appropriate. I am wary of doing so automatically, especially to start.

  3. We should not replace within strings and comments, but mangled strings may be hard to recognize as such. Suppose '’' is mangled to ‘’’ (\u2018\u2019\u2019, open-close-close). I am not sure how we should recognize that the middle character should be left as is, except by rejecting anything that results in a syntax error. I would rather make too few edits than too many. I will be happy if we can start with something useful, not wrong, tested, and documented.
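
The "reject anything that results in a syntax error" idea could look roughly like the sketch below. It assumes the conversion is applied to a whole file and checked with the built-in `compile()`; `convert_if_valid` and `QUOTE_TABLE` are hypothetical names. It covers only the syntax check and does not protect smart quotes that sit inside otherwise valid strings or comments.

```python
# Hypothetical names; a sketch of the syntax-error check only.
QUOTE_TABLE = str.maketrans({
    '\u2018': "'", '\u2019': "'",
    '\u201C': '"', '\u201D': '"',
})

def convert_if_valid(source):
    """Return the converted source, or the original if it would not compile."""
    converted = source.translate(QUOTE_TABLE)
    try:
        # compile() only parses here; the code is never executed.
        compile(converted, '<converted>', 'exec')
    except (SyntaxError, ValueError):
        return source       # too few edits rather than too many
    return converted
```

With the mangled example above, `x = ‘’’` would convert to an unterminated triple-quoted string, so the check fails and the text is returned unchanged.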

@terryjreedy terryjreedy changed the title from "Add edit option in IDLE to convert smart quotes to ascii quotes" to "IDLE menu option to convert non-ascii quotes & other?" Mar 9, 2019
@terryjreedy terryjreedy added the type-feature A feature request or enhancement label Mar 9, 2019
@rhettinger
Contributor Author

> Raymond, do users encounter all of the characters and combinations Cheryl suggested?

The only recurring issue is with the smart quotes.

For anything else, perhaps there can be a box on the General configuration tab for additional source/dest replacement pairs.
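
User-supplied pairs could then be applied with something as small as the sketch below. The name `apply_pairs` and the list-of-tuples representation are assumptions for illustration; how the configuration tab would actually store the pairs is undecided.

```python
def apply_pairs(text, pairs):
    """Apply each (source, dest) replacement pair in order (sketch)."""
    for source, dest in pairs:
        text = text.replace(source, dest)
    return text

# Example pairs a user might configure: en dash and no-break space.
extra_pairs = [('\u2013', '-'), ('\u00A0', ' ')]
fixed = apply_pairs('a\u00A0\u2013\u00A0b', extra_pairs)
```

Because the pairs run in order, an earlier replacement can feed a later one, which is worth keeping in mind when documenting such a setting.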

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022