classification
Title: IDLE very slow due a super long line output in chunks
Type: performance Stage:
Components: IDLE Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: terry.reedy Nosy List: Bernie, taleinat, terry.reedy
Priority: normal Keywords:

Created on 2019-08-05 14:11 by Bernie, last changed 2019-08-12 12:53 by taleinat.

Files
File name Uploaded Description Edit
Python374-3.PNG Bernie, 2019-08-05 14:10 screenshot
Python374-4.PNG Bernie, 2019-08-09 07:53 screenshot command line
Messages (7)
msg349050 - (view) Author: Bernhard Hiller (Bernie) Date: 2019-08-05 14:10
After installing tensorflow, I tried to run the demo script found at https://www.tensorflow.org/tutorials?
In a common python shell, the "model.fit(x_train, y_train, epochs=5)" step takes a few minutes. In IDLE, no end is in sight after half an hour.
While the output looks normal in the common shell, IDLE shows some control characters (see attached screenshot).
Windows Task Manager shows a "pythonw.exe" process taking up 25% of CPU (i.e. 1 of 4 cores).
msg349059 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-08-05 17:21
One may copy and paste small chunks of code and output into a message. By 'demo script', I presume you mean the following.

import tensorflow as tf
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

The boxes represent characters that cannot be displayed in the tk Text widget with the current font.  They represent whatever is printed in the standard shell, which you call 'normal', which implies that they are not control chars.  Since the length of the box run is 1 less than the length of the visible data, my guess is that the intended format is alternating lines something like
 9696/60000............................................ acc: 0.8691
==================================================================
except that the line separators are some non-ascii char.  (Note that the sequence of [] substitutes is one less than the sequence of printable ascii, which would make it impossible to set the shell width so that wrapped chunks line up.)

The problem is the absence of '\n' newlines, which I consider a bug in keras. The result is one line that grows in chunks to some humongous length.  This is eventually deadly to tk Text widgets.  The symptom is that chunks are added quickly at first, then more slowly, then to a crawl, and maybe a stop.  Here is a short test.

from __future__ import print_function  # To run with 2.7.
for i in range(100000):
    print('%6s--------------------------------------------' % i, end='')

On my Win10 machine with 3.8.0b3 and tk 8.6.9 (see Help => About IDLE for the latter), slow down is evident after 10000 chars (200 chunks) and crawling by 40000 chars.  It might be worse on Linux and Mac.

I added a note about auto-squeezing the long lines in chunks case to #35855.  I expect to close this as 3rd party, or as a duplicate of #1442493.
msg349109 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2019-08-06 14:27
IDLE in general doesn't recognize and support control characters commonly used in terminals.  This is often a problem when running things that show progress bars, which usually print "\r" to return the cursor to the beginning of the line and then overwrite the line, over and over again.  Since IDLE doesn't support this properly, what you get instead is all of the progress output, one update after another, on a single line.
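
As a rough illustration of the pattern described above, here is a minimal sketch of a progress bar that redraws itself with "\r". The function name show_progress and its parameters are made up for this example and are not taken from keras or any other library. Writing to an io.StringIO instead of a terminal shows the raw characters that a display without "\r" support receives.

```python
import io
import sys


def show_progress(total, stream=sys.stdout, width=20):
    """Redraw a one-line progress bar by returning to column 0 with '\\r'."""
    for done in range(1, total + 1):
        filled = width * done // total
        bar = "=" * filled + "." * (width - filled)
        # On a terminal, '\r' moves the cursor back to the start of the line.
        # A display that ignores it appends every redraw to the same line.
        stream.write("\r%d/%d [%s]" % (done, total, bar))
        stream.flush()
    # Only one newline is written, after the final redraw.
    stream.write("\n")


# Capture the raw characters instead of sending them to a terminal.
buffer = io.StringIO()
show_progress(5, stream=buffer)
raw = buffer.getvalue()
```

Here raw holds five redraws joined on one logical line, with a single newline at the very end. With thousands of updates, that one line grows without bound.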

To make matters worse, very long lines make IDLE's shell increasingly slow, at worst becoming almost entirely unresponsive.  This compounds the issue with progress bars.

It would be interesting to see what Jupyter does in these cases, since apparently such examples work well in Jupyter.  Perhaps we can do something similar.
msg349110 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2019-08-06 14:44
So, Jupyter notebook has special support for carriage-return ('\r') and backspace ('\b') characters[1].

Do we want to consider adding similar support in IDLE?

[1] https://github.com/jupyter/notebook/blob/e498de6775b3de57f5ff827e562c179b17893064/notebook/static/base/js/utils.js#L479
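
For discussion, a minimal sketch of what terminal-like handling of '\r' and '\b' could look like, written in Python. This is not Jupyter's JavaScript implementation. The name render_control_chars is hypothetical, and the sketch assumes that '\r' moves the cursor to column 0 and that '\b' moves it back one column without erasing anything.

```python
def render_control_chars(text):
    """Return text as a simple terminal would show it after '\\r' and '\\b'."""
    lines = []
    current = []  # characters of the line currently being written
    cursor = 0    # column where the next character lands
    for ch in text:
        if ch == "\n":
            # Finish the current line and start a new one.
            lines.append("".join(current))
            current, cursor = [], 0
        elif ch == "\r":
            # Carriage return: back to the start of the line.
            cursor = 0
        elif ch == "\b":
            # Backspace: one column left, never before column 0.
            cursor = max(cursor - 1, 0)
        else:
            # Overwrite the character under the cursor, or extend the line.
            if cursor < len(current):
                current[cursor] = ch
            else:
                current.append(ch)
            cursor += 1
    lines.append("".join(current))
    return "\n".join(lines)
```

With this model, a sequence of progress updates separated by '\r' collapses to the last update, so the displayed line stays short.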
msg349140 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-08-06 23:34
For the general issue of interpreting control chars, see #23220 (and duplicate #24572).  IDLE is a development environment, not a production environment.

If Tal's hypothesis is correct, a solution for Bernhard would be a new addition I want for Run Customized: 'Run with System Terminal'.  The idea is that people who know the IDLE editor should be able to develop terminal-specific code in IDLE.  Instead of Save, open or switch to Terminal/Command Prompt, enter 'python -i path/to/file.py', switch back to editor, one could hit Shift-F5, click [ ] Run with Terminal, click [OK].  The use cases are use of functions like Windows-only kbhit() and interpretation of output chars as on the default system console/terminal. The resolution for this issue should be that I open an issue so that others can contribute to doing this.

As for what is happening with the keras output:  The block character appears to be the Inverse Bullet character.  Here are 1 and 3: '◘ ◘◘◘'. In tk Text on Windows, the height and width depend on the font.  In some, it is twice as tall as wide, as at https://www.compart.com/en/unicode/U+25D8.

Here are 1 and 3 \b backspaces: ' '.  Firefox displays each as a square box with 0 0 0 8.  On Windows, tk displays \b and many other control chars as a White Vertical Rectangle, https://www.compart.com/en/unicode/U+25af.  Here are 1 and 3: '▯ ▯▯▯', except that when representing backspaces, there is not the wide space between.

In the png, the printable ascii runs overlap by 7 while the blob runs overlap by 6.  If the blobs were \bs, there would seem to be 1 too few.  Until Bernhard copies and pastes the 'normal' output, or exactly describes it, if dynamic, and copies and pastes the 'wrong' tk output, I don't want to speculate further on what he saw.
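
One way to settle what the characters are, once the output has been copied into a string, is to list their code points. The following is a minimal sketch using the standard unicodedata module. The helper name describe_unusual_chars and the sample string are made up for illustration, and the sample is not the real keras output.

```python
import unicodedata


def describe_unusual_chars(text):
    """List (code point, name) for each distinct char outside printable ASCII."""
    seen = []
    for ch in text:
        # Skip printable ASCII (space through tilde) and repeats.
        if " " <= ch <= "~" or ch in seen:
            continue
        seen.append(ch)
    # Control characters have no Unicode name, so supply a fallback label.
    return [("U+%04X" % ord(ch),
             unicodedata.name(ch, "<unnamed control character>"))
            for ch in seen]


# Hypothetical sample mixing backspaces with the two characters named above.
sample = "9696/60000 [==>....]\b\b\b\u25d8\u25af"
for code_point, name in describe_unusual_chars(sample):
    print(code_point, name)
```

Run on the actual pasted output, this would distinguish a backspace (U+0008) from a displayable character such as U+25D8 or U+25AF.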
msg349276 - (view) Author: Bernhard Hiller (Bernie) Date: 2019-08-09 07:53
Please find enclosed a screen shot of the command line, when the same script is run there.
If you want to perform the tests yourself in order to get more information about the type of those characters, you may simply run the script mentioned in my first post.

Furthermore, when running the script from the Python command prompt, I see a process called "python.exe" (without "w") in the task manager, taking some 70% of CPU.
msg349463 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2019-08-12 12:53
With PR GH-15211 (for issue #37827), which implements terminal-like handling of the \r and \b control characters, the TensorFlow tutorial almost works as intended in the IDLE shell; a minor change to TensorFlow is needed to make it work as intended.

TensorFlow inspects sys.stdout to decide whether to use \r and \b control characters in its progress output. Since it recognizes that IDLE's sys.stdout replacement isn't a terminal, it decides not to. Overriding the _dynamic_display attribute of the Progbar class in tensorflow/python/keras/utils/generic_utils.py makes it work very nicely with the above-mentioned PR.

(I checked this on Windows; it is possible that it will "just work" on other OSs.)
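
To illustrate the kind of check and override described above, here is a toy sketch. ToyProgbar is a hypothetical stand-in, not TensorFlow's Progbar class. The isatty() test and the one-line-per-update fallback are assumptions made for this example, and only the attribute name _dynamic_display comes from the message above.

```python
import io
import sys


class ToyProgbar:
    """Hypothetical stand-in for a progress bar; not TensorFlow's Progbar."""

    def __init__(self, total, stream=None):
        self.total = total
        self.stream = stream if stream is not None else sys.stdout
        # Assumed heuristic: redraw in place only when the stream
        # reports itself as a terminal.
        isatty = getattr(self.stream, "isatty", None)
        self._dynamic_display = bool(isatty and isatty())

    def update(self, done):
        text = "%d/%d" % (done, self.total)
        if self._dynamic_display:
            # Redraw in place: return to column 0 and overwrite.
            self.stream.write("\r" + text)
        else:
            # Assumed fallback for this toy class: one line per update.
            self.stream.write(text + "\n")
        self.stream.flush()


# A StringIO is not a terminal, so detection turns in-place redraws off.
out = io.StringIO()
bar = ToyProgbar(3, stream=out)
detected = bar._dynamic_display

# Override the detected value, as one would for a shell that handles '\r'.
bar._dynamic_display = True
bar.update(1)
bar.update(2)
```

After the override, out.getvalue() contains the updates joined by '\r', which a shell with terminal-like handling would show as a single, overwritten line.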
History
Date User Action Args
2019-08-12 12:53:59 taleinat set messages: + msg349463
2019-08-09 07:53:36 Bernie set files: + Python374-4.PNG; messages: + msg349276
2019-08-06 23:34:35 terry.reedy set messages: + msg349140
2019-08-06 14:44:44 taleinat set messages: + msg349110
2019-08-06 14:27:39 taleinat set nosy: + taleinat; messages: + msg349109
2019-08-05 17:21:22 terry.reedy set messages: + msg349059; title: IDLE very slow due to special characters -> IDLE very slow due a super long line output in chunks
2019-08-05 14:11:00 Bernie create