
classification
Title: pydoc does not handle non-ASCII unicode AUTHOR field
Type: behavior
Stage: resolved
Components: Unicode
Versions: Python 2.7

process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: Fix pydoc crashing on unicode strings (issue 1065986)
Assigned To:
Nosy List: eric.araujo, ezio.melotti, maker, r.david.murray
Priority: normal
Keywords:

Created on 2012-08-27 12:28 by maker, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (4)
msg169196 - (view) Author: Michele Orrù (maker) * Date: 2012-08-27 12:28
$ echo "__author__ = u'Michele Orr\xf9'" > foo.py && python -c "import foo; print foo.__author__; help(foo)"
Michele Orrù
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site.py", line 467, in __call__
    return pydoc.help(*args, **kwds)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pydoc.py", line 1747, in __call__
    self.help(request)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pydoc.py", line 1794, in help
    else: doc(request, 'Help on %s:')
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pydoc.py", line 1531, in doc
    pager(render_doc(thing, title, forceload))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pydoc.py", line 1526, in render_doc
    return title % desc + '\n\n' + text.document(object, name)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pydoc.py", line 329, in document
    if inspect.ismodule(object): return self.docmodule(*args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pydoc.py", line 1126, in docmodule
    result = result + self.section('AUTHOR', str(object.__author__))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf9' in position 11: ordinal not in range(128)
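
The last frame of the traceback calls str() on a unicode object, which in Python 2 implicitly encodes with the ASCII default codec. The following is a minimal sketch that reproduces the same error with an explicit encode, so it behaves the same on Python 2 and 3. The helper names are hypothetical and are not part of pydoc, and the explicit-encoding variant is only one possible workaround, not the patch proposed for pydoc.

```python
# -*- coding: utf-8 -*-
# Same value as the __author__ field in the report.
author = u'Michele Orr\xf9'


def ascii_encode_like_py2_str(value):
    """Hypothetical helper: mimic the implicit ASCII encode that
    Python 2 performs when str() is called on a unicode object."""
    return value.encode('ascii')


def safe_author(value, encoding='utf-8'):
    """Hypothetical alternative: encode explicitly and replace any
    character the target encoding cannot represent."""
    return value.encode(encoding, 'replace')


try:
    ascii_encode_like_py2_str(author)
    failed = False
    error_position = None
except UnicodeEncodeError as exc:
    failed = True
    # exc.start is the index of the first character that could not be encoded.
    error_position = exc.start

encoded = safe_author(author)
```

The recorded error_position matches the "position 11" reported in the traceback, since u'\xf9' is the twelfth character of the string.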
msg169203 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-08-27 15:35
I think this works if you set PYTHONIOENCODING=utf-8 in your environment.

(2.6 does not get bugfixes anymore.)
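
As a rough illustration of what that environment variable changes, the sketch below starts a child interpreter with its output piped and reports the encoding it sees on sys.stdout. It only shows the effect on the standard streams; whether this avoids the str() call shown in the traceback is not established here. The function name is hypothetical.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Hypothetical helper: run a child interpreter with PYTHONIOENCODING
    set and return the encoding it reports for its piped sys.stdout."""
    env = dict(os.environ)
    env['PYTHONIOENCODING'] = io_encoding
    code = "import sys; sys.stdout.write(str(sys.stdout.encoding))"
    output = subprocess.check_output([sys.executable, '-c', code], env=env)
    return output.decode('ascii').strip()


reported = child_stdout_encoding('utf-8')
```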
msg169204 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-08-27 15:38
One could argue that since print respects the terminal encoding when sys.stdout is a tty, pydoc could be as smart and do the same.  I think the problem comes from the use of a pager, which means a subprocess, which means that the streams are not ttys and the encoding can’t be detected.  print doesn’t work with pipes either: python -c "import foo; print foo.__author__" | cat

So I fear that this bug may only get a doc note.
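
The encoding-detection problem described above can be sketched as follows. This assumes a stream whose encoding attribute is None when output goes to a pipe (as in Python 2) and falls back to the locale's preferred encoding. The function names and the fallback order are assumptions for illustration, not pydoc's behavior.

```python
import locale


def pick_output_encoding(stream, fallback='utf-8'):
    """Hypothetical helper: use the stream's own encoding when it has one,
    otherwise the locale's preferred encoding, otherwise a fixed fallback."""
    encoding = getattr(stream, 'encoding', None)
    if encoding:
        return encoding
    return locale.getpreferredencoding(False) or fallback


def encode_for_stream(text, stream):
    """Encode text for the stream, replacing unencodable characters
    instead of raising UnicodeEncodeError."""
    return text.encode(pick_output_encoding(stream), 'replace')


class FakePipe(object):
    """Stand-in for a piped stream with no detectable encoding."""
    encoding = None


class FakeTty(object):
    """Stand-in for a terminal stream that reports UTF-8."""
    encoding = 'utf-8'


tty_bytes = encode_for_stream(u'Michele Orr\xf9', FakeTty())
pipe_encoding = pick_output_encoding(FakePipe())
```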
msg169208 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-08-27 16:38
There is a proposed patch in issue 1065986 waiting for review.
History
Date                 User            Action  Args
2022-04-11 14:57:35  admin           set     github: 59995
2012-08-27 16:38:22  r.david.murray  set     status: open -> closed
                                             superseder: Fix pydoc crashing on unicode strings
                                             nosy: + r.david.murray
                                             messages: + msg169208
                                             type: crash -> behavior
                                             resolution: duplicate
                                             stage: resolved
2012-08-27 15:38:39  eric.araujo     set     messages: + msg169204
2012-08-27 15:35:49  eric.araujo     set     title: pydoc does not handle unicode AUTHOR field -> pydoc does not handle non-ASCII unicode AUTHOR field
                                             messages: + msg169203
                                             versions: - Python 2.6
2012-08-27 12:28:43  maker           create