Title: nonASCII punctuation characters can not display in python363.chm.
Type: behavior Stage:
Components: Windows Versions: Python 3.6
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Ma Lin, mdk, paul.moore, r.david.murray, steve.dower, tim.golden, wwqgtxx, zaazbb, zach.ware
Priority: normal Keywords:

Created on 2017-11-30 03:45 by zaazbb, last changed 2018-03-13 16:33 by steve.dower.

File name Uploaded Description
1512013191(1).jpg zaazbb, 2017-11-30 03:45
screenshot.PNG Ma Lin, 2018-03-11 11:04
Messages (7)
msg307277 - (view) Author: zaazbb (zaazbb) Date: 2017-11-30 03:45
In the CHM documentation (python363.chm), some Unicode (non-ASCII) characters are not displayed correctly.
For example:

asyncio — Asynchronous I/O, event loop, coroutines and tasks

displayed as

asyncio � Asynchronous I/O, event loop, coroutines and tasks


Asynchronous programming is more complex than classical “sequential” programming

displayed as

Asynchronous programming is more complex than classical 搒equential� programming

Windows 10, Simplified Chinese language.
Python 3.6.3, python363.chm.
msg307438 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2017-12-02 16:37
I'm not sure there will be any good fix for this. We might be able to coerce proper UTF-8 output from Sphinx, and if it also adds the encoding tags required by whatever ancient version of Internet Explorer is used, then it should be fine.

It's likely just best to avoid special punctuation in doc source files though.
msg307460 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-12-02 21:26
The doc source files do not contain smart quotes, and as far as I know, sphinx does produce correct utf-8.

Recently there was a bug where incorrect smart quotes were leaking out of the internationalization of the docs, so this might be a problem that is already fixed.  On the other hand, there might be something broken about the chm production process.  I have no idea who would be the right person to investigate that, since I think Steve just spins the wheel on existing tools to get them generated :)

On the gripping hand, could there be something broken about your local charset configuration?  Does anyone else see this problem?
msg309671 - (view) Author: wwq (wwqgtxx) Date: 2018-01-08 15:56
I found that the problem is still not fixed in python364.chm, but python362.chm displays correctly. Maybe a change to the official build configuration introduced the encoding error.
msg313596 - (view) Author: Ma Lin (Ma Lin) * Date: 2018-03-11 10:59
Here is a workaround:
1. Open any page in Internet Explorer.
2. Right-click the page -> Encoding -> check "Auto-Select".
After that, the wrong characters (�/抯) disappear permanently.

> Does anyone else see this problem?
Probably a lot of people have this problem.
I recently did a clean install of Windows 10, so I believe this is how the Python .chm documentation looks by default on such a system.
BTW, my locale is Simplified Chinese.
msg313637 - (view) Author: Ma Lin (Ma Lin) * Date: 2018-03-12 09:29
The HTML source inside the .chm changed between 3.6.2 and 3.6.3; the former uses escaped HTML entities.
I couldn't find which commit caused this change.

3.6.2 chm: <h1>What&#8217;s New In Python 3.6</h1>
3.6.3 chm: <h1>What抯 New In Python 3.6</h1>

3.6.2 chm: <h2>Summary &#8211; Release highlights</h2>
3.6.3 chm: <h2>Summary ?Release highlights</h2>

Release date:
3.6.2 final: 2017-07-17
3.6.3 final: 2017-10-03
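
The garbled 3.6.3 output above looks consistent with one particular kind of mojibake, sketched below. This is an assumption and is not confirmed anywhere in this thread: the 3.6.3 pages would contain raw Windows-1252 bytes (where U+2019 is the single byte 0x92), and the viewer on a Simplified Chinese system would read them as GBK (code page 936), pairing 0x92 with the following "s" into one Chinese character. The helper name `garble` is hypothetical.

```python
# Assumption (not confirmed in the thread): the 3.6.3 .chm pages hold
# Windows-1252 bytes, and the viewer decodes them as GBK (code page 936).

def garble(text: str) -> str:
    """Encode text as Windows-1252, then misread those bytes as GBK."""
    raw = text.encode("cp1252")
    # errors="replace" keeps the sketch from raising on byte pairs that
    # are not valid GBK; a real viewer may render those differently.
    return raw.decode("gbk", errors="replace")

heading = "What\u2019s New In Python 3.6"  # U+2019 RIGHT SINGLE QUOTATION MARK
print(garble(heading))
```

Dashes followed by a space form an invalid GBK pair, so their exact rendering depends on the decoder; that may explain why the report shows both "�" and "?" for them.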
msg313763 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2018-03-13 16:33
We should probably prefer to force ASCII with explicit escapes (ideally named escapes, rather than codepoints). I'm not sure how to make Sphinx/docutils do that, but presumably it could be our own extension that handles the problematic characters people add to our docs.
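
As a rough illustration of the idea of forcing ASCII with explicit escapes, here is a minimal sketch using only the Python standard library. It is not a Sphinx or docutils extension, and the function name `escape_non_ascii` is hypothetical; it only shows the character-level transformation, preferring named escapes and falling back to numeric ones.

```python
from html.entities import codepoint2name

def escape_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with an HTML entity (hypothetical helper)."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)                          # ASCII passes through unchanged
        elif code in codepoint2name:
            parts.append(f"&{codepoint2name[code]};")  # named escape, e.g. &rsquo;
        else:
            parts.append(f"&#{code};")                 # numeric fallback
    return "".join(parts)

print(escape_non_ascii("What\u2019s New In Python 3.6"))
print(escape_non_ascii("Summary \u2013 Release highlights"))
```

For numeric escapes only, as in the 3.6.2 output, `text.encode("ascii", "xmlcharrefreplace")` gives the same kind of result without a custom loop. Neither approach touches `&`, `<` or `>`, so they are meant for text that is already valid HTML.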
Date                 User            Action  Args
2018-03-13 16:33:47  steve.dower     set     messages: + msg313763
2018-03-12 09:29:17  Ma Lin          set     messages: + msg313637
2018-03-11 11:04:58  Ma Lin          set     files: + screenshot.PNG
2018-03-11 10:59:32  Ma Lin          set     nosy: + Ma Lin; messages: + msg313596
2018-01-08 15:56:20  wwqgtxx         set     nosy: + wwqgtxx; messages: + msg309671
2017-12-02 21:31:29  ned.deily       set     nosy: + mdk
2017-12-02 21:26:00  r.david.murray  set     nosy: + r.david.murray; messages: + msg307460
2017-12-02 16:37:37  steve.dower     set     messages: + msg307438
2017-11-30 03:45:30  zaazbb          create