classification
Title: interpreter hangs forever on invalid input
Type: behavior Stage: resolved
Components: Unicode Versions: Python 3.8, Python 3.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: dgan, eryksun, ezio.melotti, vstinner
Priority: normal Keywords:

Created on 2020-10-12 18:50 by dgan, last changed 2020-10-13 07:21 by dgan. This issue is now closed.

Messages (5)
msg378514 - Author: denis (dgan) Date: 2020-10-12 18:50
Python 3.8.5, Python 3.7.3
When trying to print invalid Unicode, the interpreter goes into a completely unresponsive state

>>> print("\x9b")


eyboardInterrupt
>>> print("\xa5")
¥
>>> print("\x95")

>>> print("\x90")
denis@debian:~$ <interpreter hangs forever, ctrl+d was pressed to exit>
msg378517 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2020-10-12 19:21
The terminal you're using apparently implements C1 controls [1]. U+009B is the Control Sequence Introducer (CSI) for ANSI escape sequences. U+0090 starts a Device Control String (DCS), and it gets terminated by U+009C, a String Terminator (ST). For example, in GNOME Terminal in Linux:

    >>> print('spam\x9b4Deggs')
    eggs

    >>> print('spam\x90some DCS string\x9c')
    spam

---

[1] https://en.wikipedia.org/wiki/C0_and_C1_control_codes
msg378519 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2020-10-12 19:27
In Windows 10 2004, CSI is currently supported by the console host (conhost.exe) if virtual-terminal mode is enabled. Windows Terminal Preview supports many more C1 control codes, as will conhost.exe in the next release of Windows 10. This should be 'fun' considering the C1 controls block isn't reserved by Windows filesystems. CMD and PowerShell depend on the C0 controls being reserved and on the C1 controls not being implemented. They don't implement any escaping of control characters when listing filenames in a directory, unlike a traditional POSIX shell. Now that the console and Windows Terminal implement C1 controls, maybe PowerShell can be updated to escape them in file listings, but I don't see what can be done for CMD, which doesn't support string literals with escape sequences.
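
As an illustration of what escaping control characters in a file listing could look like, here is a rough sketch in Python. It is not how any particular shell implements it; the names `quote_filename` and `list_directory` are hypothetical, and the `\xNN` output style is an assumption chosen for readability:

```python
import os


def quote_filename(name):
    """Escape C0 controls, DEL, and C1 controls (U+0080-U+009F) in a file name."""
    def esc(ch):
        cp = ord(ch)
        if cp < 0x20 or 0x7F <= cp <= 0x9F:
            # Show the control as a visible backslash escape instead of emitting it.
            return f"\\x{cp:02x}"
        return ch
    return "".join(esc(ch) for ch in name)


def list_directory(path="."):
    """Return the entries of path with control characters made visible."""
    return sorted(quote_filename(n) for n in os.listdir(path))


print(quote_filename("report\x9b2J.txt"))
```

With this, a file name containing U+009B is displayed with a literal `\x9b` rather than being interpreted by the terminal as the start of a control sequence.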
msg378531 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-10-12 22:21
I don't see anything wrong with Python itself, and I suggest closing the issue as "not a bug".
msg378544 - Author: denis (dgan) Date: 2020-10-13 07:21
I do confirm that different terminals react differently (xterm doesn't hang)

Definitely not a Python bug
History
Date                 User      Action  Args
2020-10-13 07:21:19  dgan      set     status: open -> closed; resolution: not a bug; messages: + msg378544; stage: resolved
2020-10-12 22:21:20  vstinner  set     messages: + msg378531
2020-10-12 19:27:00  eryksun   set     messages: + msg378519
2020-10-12 19:21:57  eryksun   set     nosy: + eryksun; messages: + msg378517
2020-10-12 18:50:52  dgan      create