Title: regrtest: win_utils decodes typeperf output from the wrong encoding (test suite broken due to cpu usage feature on win 10/ german)
Type: behavior Stage: patch review
Components: Tests, Windows Versions: Python 3.9, Python 3.8, Python 3.7
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: LorenzMende, ammar2, eryksun, heckad, paul.moore, skoslowski, steve.dower, tim.golden, vstinner, zach.ware
Priority: normal Keywords: easy, newcomer friendly, patch

Created on 2019-04-19 15:52 by LorenzMende, last changed 2019-08-25 08:45 by LorenzMende.

File name Uploaded Description Edit LorenzMende, 2019-06-09 09:02 patched with registry lookup for counter name and oemdecoding
Pull Requests
URL Status Linked Edit
PR 15488 open LorenzMende, 2019-08-25 08:45
Messages (13)
msg340547 - (view) Author: Lorenz Mende (LorenzMende) * Date: 2019-04-19 15:52
The test suite fails with the first tests (I assume 1st call of getloadavg of WindowsLoadTracker).
Traceback (most recent call last):
  File "P:\Repos\CPython\cpython\lib\", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "P:\Repos\CPython\cpython\lib\", line 85, in _run_code
    exec(code, run_globals)
  File "P:\Repos\CPython\cpython\lib\test\", line 2, in <module>
  File "P:\Repos\CPython\cpython\lib\test\libregrtest\", line 653, in main
    Regrtest().main(tests=tests, **kwargs)
  File "P:\Repos\CPython\cpython\lib\test\libregrtest\", line 586, in main
    self._main(tests, kwargs)
  File "P:\Repos\CPython\cpython\lib\test\libregrtest\", line 632, in _main
  File "P:\Repos\CPython\cpython\lib\test\libregrtest\", line 515, in run_tests
  File "P:\Repos\CPython\cpython\lib\test\libregrtest\", line 396, in run_tests_sequential
    self.display_progress(test_index, text)
  File "P:\Repos\CPython\cpython\lib\test\libregrtest\", line 150, in display_progress
    load_avg_1min = self.getloadavg()
  File "P:\Repos\CPython\cpython\lib\test\libregrtest\", line 81, in getloadavg
    typeperf_output = self.read_output()
  File "P:\Repos\CPython\cpython\lib\test\libregrtest\", line 78, in read_output
    return response.decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 in position 67: invalid start byte
The windows 'typeperf "\System\Processor Queue Length" -si 1' command unluckily returns an string with an umlaut which leads to the Decode-Error. This comes up because the <counter> for the typeperf is location dependend. (In german the counter would read \System\Prozessor-Warteschlangenlänge)

I see two possible solutions to this issue.
1. Raising an exception earlier on creation of WindowsLoadTracker resulting in the same behaviour as if there is no typeperf available (german pythoneers would have a drawback with this)
2. Getting the typeperf counter correctly from registry (HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Perflib\CurrentLanguage, described here

Windows 10 x64, 1809, german
cpython @e16467af0bfcc9f399df251495ff2d2ad20a1669
commit of assumed root cause of
msg340548 - (view) Author: Ammar Askar (ammar2) * (Python triager) Date: 2019-04-19 16:26
What does `typeperf "\System\Processor Queue Length" -si 1` actually return on your non-English system?

Does it just return an error with the counter's name or is the umalet just in the first header line, i.e this one for me:

  "(PDH-CSV 4.0)","\\MSI\System\Processor Queue Length"

If it's the latter then I think the correct fix would be to figure out what encoding typeperf is outputting in and then just decode with that.
msg340550 - (view) Author: Lorenz Mende (LorenzMende) * Date: 2019-04-19 17:11
Hi Ammar, you are correct.
typeperf returns:
P:\Repos\CPython\cpython>typeperf "\System\Prozessor-Warteschlangenlänge" -si 1

"(PDH-CSV 4.0)","\\ZERO\System\Prozessor-Warteschlangenlänge"
"04/19/2019 19:09:14.510","0.000000"
"04/19/2019 19:09:15.514","0.000000"

So even with correct counter name the outpu needs to be decoded correctly.
I already got a solution to get the location specific counter name from registry - if it helps I'll commit it.
msg340551 - (view) Author: Ammar Askar (ammar2) * (Python triager) Date: 2019-04-19 17:13
Thank you, could you also share the output if you just give it the English name of the counter?
msg340562 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-04-20 06:41
'crash' mean *nix coredump or Windows equivelent, rather than traceback.
msg340575 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-04-20 14:39
It'll probably be mbcs or oem encoding. Certainly not UTF-8
msg340576 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-04-20 15:09
> It'll probably be mbcs or oem encoding. Certainly not UTF-8

Unless the system locale's codepage is set to UTF-8 in Windows 10. 

I stepped through this in the debugger. When writing to a pipe, typeperf.exe calls WideCharToMultiByte(CP_OEMCP, ...), so it's "oem".
msg340632 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-04-22 02:17
If the code page is set to UTF-8 then mbcs will still be the right encoding :)
msg340797 - (view) Author: Lorenz Mende (LorenzMende) * Date: 2019-04-24 17:49
Sorry, was off some days. I tried to decode the output with mbcs, solves the issue partly - the counter name is still wrong.
Was able to pick the localization specific counter name from registry and use it for the typeperf.

But the tests fail after several seconds with Broken Pipe Error of the command_stdout pipe.

@Ammar - why is the handle closed in start()? If I uncomment it, the load tracker workks fine.

Added the fixed file, may someone evaluate the solution on another windows localization?
msg340806 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-04-24 19:57
It's "oem", not "mbcs".
msg344293 - (view) Author: Андрей Казанцев (heckad) * Date: 2019-06-02 15:57
I have the same problem in Russian locale.
Adding "oem" encoding in the decode method solve problem with decoding but got:
File "C:\Users\User\Documents\Projects\cpython\lib\test\libregrtest\", line 98, in getloadavg
    load = float(toks[1].replace('"', ''))
ValueError: could not convert string to float

In typeperf_output text with description of the error:
'\r\nВыполняется выход, подождите...                         \r\nОшибка: Счетчики не указаны.\r\n\r\r'
Translation on English is "Exiting, wait... Error: Counters are not specified."

Is it possible to check if the counters are found in advance?
msg345076 - (view) Author: Lorenz Mende (LorenzMende) * Date: 2019-06-09 09:02
Did some minor changes with win_utils.
Encoding changed to 'oem'.
@heckad: does this bugfix work for u?

@all: please verify working of the changes with different localizations.
msg348719 - (view) Author: Sebastian Koslowski (skoslowski) * Date: 2019-07-30 08:08
I ran into this issue on a Win10 German box running the tests for version 3.7.4

Your changes fixes the issue for me.
Date User Action Args
2019-08-25 08:45:21LorenzMendesetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request15175
2019-08-21 10:58:51vstinnersettitle: test suite broken due to cpu usage feature on win 10/ german -> regrtest: win_utils decodes typeperf output from the wrong encoding (test suite broken due to cpu usage feature on win 10/ german)
2019-08-21 10:57:39vstinnersetnosy: + vstinner
2019-08-17 21:11:42steve.dowersetkeywords: + easy, newcomer friendly
2019-07-30 15:23:21steve.dowersetstage: needs patch
versions: + Python 3.8, Python 3.9
2019-07-30 08:08:20skoslowskisetnosy: + skoslowski

messages: + msg348719
versions: + Python 3.7, - Python 3.8
2019-06-09 09:03:08LorenzMendesetfiles: -
2019-06-09 09:02:57LorenzMendesetfiles: +

messages: + msg345076
2019-06-02 19:00:02terry.reedysetnosy: - terry.reedy
2019-06-02 15:57:53heckadsetnosy: + heckad
messages: + msg344293
2019-04-24 19:57:20eryksunsetmessages: + msg340806
2019-04-24 17:49:55LorenzMendesetfiles: +

messages: + msg340797
2019-04-22 02:17:50steve.dowersetmessages: + msg340632
2019-04-20 15:09:47eryksunsetnosy: + eryksun
messages: + msg340576
2019-04-20 14:39:46steve.dowersetmessages: + msg340575
2019-04-20 06:41:05terry.reedysetnosy: + terry.reedy, paul.moore, tim.golden, zach.ware, steve.dower
messages: + msg340562

components: + Windows
type: crash -> behavior
2019-04-19 17:13:40ammar2setmessages: + msg340551
2019-04-19 17:11:57LorenzMendesetmessages: + msg340550
2019-04-19 16:26:07ammar2setnosy: + ammar2
messages: + msg340548
2019-04-19 15:54:02LorenzMendesetnosy: - ammar2
2019-04-19 15:53:34LorenzMendesetnosy: + ammar2
2019-04-19 15:52:37LorenzMendecreate