
classification
Title: Crash when editing emoji containing strings
Type:
Stage: resolved
Components: Unicode
Versions: Python 3.10
process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To:
Nosy List: 10maurycy10, ezio.melotti, pablogsal, terry.reedy, vstinner
Priority: normal
Keywords:

Created on 2021-12-30 19:00 by 10maurycy10, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (5)
msg409379 - Author: M Z (10maurycy10) Date: 2021-12-30 19:00
Reproduction steps:
0. Start the REPL. Command: ``python``
1. Enter '___😀' at the prompt.
2. Use the arrow keys and backspace to delete an underscore.
3. Press Enter.
4. Observe the segfault.

- Inserting characters before the emoji can also cause the crash.

Backtrace (gdb):

```
(gdb) bt
#0  0x00007ffff7495d22 in raise () from /lib/libc.so.6
#1  0x00007ffff747f862 in abort () from /lib/libc.so.6
#2  0x00007ffff747f747 in __assert_fail_base.cold () from /lib/libc.so.6
#3  0x00007ffff748e616 in __assert_fail () from /lib/libc.so.6
#4  0x00007ffff77e7b6e in get_error_line (p=p@entry=0x7ffff6e53b80, lineno=lineno@entry=0) at Parser/pegen.c:438
#5  0x00007ffff7b3660d in _PyPegen_raise_error_known_location (p=p@entry=0x7ffff6e53b80, errtype=errtype@entry=0x7ffff7e7ac00 <_PyExc_SyntaxError>, lineno=0, col_offset=0, end_lineno=0, end_col_offset=-1, errmsg=0x7ffff7c84bcd "(%s) %U",
    va=0x7fffffffdcb0) at Parser/pegen.c:491
#6  0x00007ffff7b36d33 in _PyPegen_raise_error (p=p@entry=0x7ffff6e53b80, errtype=0x7ffff7e7ac00 <_PyExc_SyntaxError>, errmsg=errmsg@entry=0x7ffff7c84bcd "(%s) %U") at Parser/pegen.c:422
#7  0x00007ffff7b37213 in raise_decode_error (p=p@entry=0x7ffff6e53b80) at Parser/pegen.c:271
#8  0x00007ffff7bbed14 in initialize_token (token_type=60, end=0x0, start=<optimized out>, token=0x7ffff6ffb330, p=0x7ffff6e53b80) at Parser/pegen.c:712
#9  _PyPegen_fill_token (p=p@entry=0x7ffff6e53b80) at Parser/pegen.c:785
#10 0x00007ffff7c5eca1 in statement_newline_rule (p=0x7ffff6e53b80) at Parser/parser.c:1521
#11 interactive_rule (p=0x7ffff6e53b80) at Parser/parser.c:994
#12 _PyPegen_parse (p=p@entry=0x7ffff6e53b80) at Parser/parser.c:33180
#13 0x00007ffff7bd5784 in _PyPegen_run_parser (p=p@entry=0x7ffff6e53b80) at Parser/pegen.c:1343
#14 0x00007ffff7bd8e72 in _PyPegen_run_parser_from_file_pointer (fp=fp@entry=0x7ffff7619800 <_IO_2_1_stdin_>, start_rule=start_rule@entry=256, filename_ob=filename_ob@entry=0x7ffff7155490, enc=enc@entry=0x7ffff714bdb0 "utf-8",
    ps1=ps1@entry=0x7ffff7155c40 ">>> ", ps2=ps2@entry=0x7ffff6ffafa0 "... ", flags=0x7fffffffe0f8, errcode=0x7fffffffdfe4, arena=0x7ffff6feb820) at Parser/pegen.c:1440
#15 0x00007ffff7bd903e in _PyParser_ASTFromFile (fp=fp@entry=0x7ffff7619800 <_IO_2_1_stdin_>, filename_ob=filename_ob@entry=0x7ffff7155490, enc=enc@entry=0x7ffff714bdb0 "utf-8", mode=mode@entry=256, ps1=ps1@entry=0x7ffff7155c40 ">>> ",
    ps2=ps2@entry=0x7ffff6ffafa0 "... ", flags=0x7fffffffe0f8, errcode=0x7fffffffdfe4, arena=0x7ffff6feb820) at Parser/peg_api.c:26
#16 0x00007ffff7bd9205 in PyRun_InteractiveOneObjectEx (fp=fp@entry=0x7ffff7619800 <_IO_2_1_stdin_>, filename=filename@entry=0x7ffff7155490, flags=flags@entry=0x7fffffffe0f8) at Python/pythonrun.c:257
#17 0x00007ffff7bda02d in _PyRun_InteractiveLoopObject (fp=0x7ffff7619800 <_IO_2_1_stdin_>, filename=0x7ffff7155490, flags=0x7fffffffe0f8) at Python/pythonrun.c:148
#18 0x00007ffff7bdbbc7 in _PyRun_AnyFileObject (fp=0x7ffff7619800 <_IO_2_1_stdin_>, filename=0x7ffff7155490, closeit=0, flags=0x7fffffffe0f8) at Python/pythonrun.c:84
#19 0x00007ffff7bdbdf3 in PyRun_AnyFileExFlags (fp=0x7ffff7619800 <_IO_2_1_stdin_>, filename=<optimized out>, closeit=0, flags=0x7fffffffe0f8) at Python/pythonrun.c:116
#20 0x00007ffff7bddf60 in pymain_run_stdin (config=0x55555555f910) at Modules/main.c:502
#21 pymain_run_python (exitcode=0x7fffffffe0f0) at Modules/main.c:590
#22 Py_RunMain () at Modules/main.c:666
#23 0x00007ffff7bde683 in pymain_main (args=args@entry=0x7fffffffe250) at Modules/main.c:696
#24 0x00007ffff7bde7f9 in Py_BytesMain (argc=argc@entry=1, argv=argv@entry=0x7fffffffe3a8) at Modules/main.c:720
#25 0x00005555555560af in main (argc=1, argv=0x7fffffffe3a8) at ./Programs/python.c:15
(gdb)
```
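Frames #7 and #8 show `raise_decode_error` being reached from `initialize_token`, which suggests the tokenizer was handed text it could not decode. The trace does not show the bytes involved, so the following is only a sketch under an assumption: that line editing left a partial UTF-8 sequence for the emoji in the input buffer. The byte strings and the `try_decode` helper are invented for illustration and are not taken from the report.

```python
# Hypothetical illustration: these bytes are made up, not captured from the crash.
EMOJI = "\U0001F600"  # the emoji from the reproduction steps


def try_decode(raw: bytes) -> str:
    """Decode raw as UTF-8; return the text or a short error description."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"decode error: {exc.reason}"


whole = b"___" + EMOJI.encode("utf-8")  # three underscores plus the emoji's bytes
cut = whole[:-1]                        # same line with the emoji's last byte missing

print(len(EMOJI), len(EMOJI.encode("utf-8")))  # code points vs. UTF-8 bytes
print(try_decode(whole))
print(try_decode(cut))
```

If an editing layer counted characters where bytes were expected (or the reverse), a buffer like `cut` is one way a decode failure could arise; whether that is what happened here is not established by the trace.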
msg409380 - Author: M Z (10maurycy10) Date: 2021-12-30 19:15
FYI: My platform is Arch Linux on amd64.
msg409437 - Author: Terry J. Reedy (terry.reedy) (Python committer) Date: 2021-12-31 22:29
On Win10 command prompt, 😀 is initially displayed as 2 boxes, subsequently as box-space.  Besides this, editing works fine and when Entered, the string is echoed.
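One plausible reading of the "2 boxes" display, offered as an assumption rather than something confirmed in this thread, is that the console handles text as UTF-16 and shows each code unit of the surrogate pair separately. The sketch below only demonstrates the encoding fact that U+1F600 needs two UTF-16 code units; `utf16_code_units` is a helper name made up for this example.

```python
import struct

EMOJI = "\U0001F600"  # 😀, a code point outside the Basic Multilingual Plane


def utf16_code_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text as integers."""
    data = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return [unit for (unit,) in struct.iter_unpack("<H", data)]


units = utf16_code_units(EMOJI)
print(len(EMOJI))                    # number of code points in the string
print([hex(u) for u in units])       # the surrogate pair
```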
msg409440 - Author: Pablo Galindo Salgado (pablogsal) (Python committer) Date: 2021-12-31 23:55
This seems to be fixed on the main branch (at least, I cannot reproduce it there). If so, the fix will be included in Python 3.10.2.

10maurycy10, could you please confirm that this is indeed the case?
msg409441 - Author: Pablo Galindo Salgado (pablogsal) (Python committer) Date: 2021-12-31 23:56
Example:

>>> ___😀
  File "<stdin>", line 1
    ___😀
       ^
SyntaxError: invalid character '😀' (U+1F600)
>>> __😀
  File "<stdin>", line 1
    __😀
      ^
SyntaxError: invalid character '😀' (U+1F600)
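The expected behaviour in the session above can also be checked without typing at the prompt. This is a minimal sketch that feeds the same two lines to the built-in `compile()` in `"single"` (interactive statement) mode. It does not exercise the terminal line-editing path from the original report, so it cannot reproduce the crash itself; `syntax_error_for` is a helper name invented for this example.

```python
def syntax_error_for(source: str):
    """Compile source as one interactive statement; return the SyntaxError, or None."""
    try:
        compile(source, "<stdin>", "single")
    except SyntaxError as exc:
        return exc
    return None


for src in ("___\U0001F600", "__\U0001F600"):
    err = syntax_error_for(src)
    if err is not None:
        # msg and offset are standard SyntaxError attributes; the exact
        # wording and column can vary between Python versions.
        print(type(err).__name__, err.msg, err.offset)
```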
History
Date                 User         Action  Args
2022-04-11 14:59:54  admin        set     github: 90364
2021-12-31 23:56:33  pablogsal    set     messages: + msg409441
2021-12-31 23:55:51  pablogsal    set     status: open -> closed; resolution: out of date; stage: resolved
2021-12-31 23:55:38  pablogsal    set     messages: + msg409440
2021-12-31 22:29:58  terry.reedy  set     nosy: + terry.reedy, pablogsal; messages: + msg409437
2021-12-30 19:15:38  10maurycy10  set     messages: + msg409380
2021-12-30 19:00:20  10maurycy10  create