classification
Title: Pydoc fails with codecs
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: larry Nosy List: Yury.Selivanov, doerwalter, larry, lemburg, ncoghlan, python-dev, serhiy.storchaka, yselivanov
Priority: release blocker Keywords: patch

Created on 2015-08-07 10:30 by serhiy.storchaka, last changed 2015-08-09 10:04 by larry. This issue is now closed.

Files
File name Uploaded Description
inspect.patch yselivanov, 2015-08-07 16:10
larry.fix.pydoc.for.calls.with.functions.1.diff larry, 2015-08-08 09:25
codecs_ac.diff yselivanov, 2015-08-08 15:48
codecs_default_encoding.patch serhiy.storchaka, 2015-08-08 20:22
Messages (15)
msg248184 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-08-07 10:30
Pydoc fails with the codecs module in 3.5+. All works in 3.4.

$ ./python -m pydoc codecs
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-3.5/Lib/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/serhiy/py/cpython-3.5/Lib/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/serhiy/py/cpython-3.5/Lib/pydoc.py", line 2648, in <module>
    cli()
  File "/home/serhiy/py/cpython-3.5/Lib/pydoc.py", line 2613, in cli
    help.help(arg)
  File "/home/serhiy/py/cpython-3.5/Lib/pydoc.py", line 1895, in help
    elif request: doc(request, 'Help on %s:', output=self._output)
  File "/home/serhiy/py/cpython-3.5/Lib/pydoc.py", line 1632, in doc
    pager(render_doc(thing, title, forceload))
  File "/home/serhiy/py/cpython-3.5/Lib/pydoc.py", line 1625, in render_doc
    return title % desc + '\n\n' + renderer.document(object, name)
  File "/home/serhiy/py/cpython-3.5/Lib/pydoc.py", line 370, in document
    if inspect.ismodule(object): return self.docmodule(*args)
  File "/home/serhiy/py/cpython-3.5/Lib/pydoc.py", line 1160, in docmodule
    contents.append(self.document(value, key, name))
  File "/home/serhiy/py/cpython-3.5/Lib/pydoc.py", line 372, in document
    if inspect.isroutine(object): return self.docroutine(*args)
  File "/home/serhiy/py/cpython-3.5/Lib/pydoc.py", line 1345, in docroutine
    signature = inspect.signature(object)
  File "/home/serhiy/py/cpython-3.5/Lib/inspect.py", line 2988, in signature
    return Signature.from_callable(obj, follow_wrapped=follow_wrapped)
  File "/home/serhiy/py/cpython-3.5/Lib/inspect.py", line 2738, in from_callable
    follow_wrapper_chains=follow_wrapped)
  File "/home/serhiy/py/cpython-3.5/Lib/inspect.py", line 2229, in _signature_from_callable
    skip_bound_arg=skip_bound_arg)
  File "/home/serhiy/py/cpython-3.5/Lib/inspect.py", line 2061, in _signature_from_builtin
    return _signature_fromstr(cls, func, s, skip_bound_arg)
  File "/home/serhiy/py/cpython-3.5/Lib/inspect.py", line 2009, in _signature_fromstr
    p(name, default)
  File "/home/serhiy/py/cpython-3.5/Lib/inspect.py", line 1991, in p
    default_node = RewriteSymbolics().visit(default_node)
  File "/home/serhiy/py/cpython-3.5/Lib/ast.py", line 245, in visit
    return visitor(node)
  File "/home/serhiy/py/cpython-3.5/Lib/ast.py", line 310, in generic_visit
    new_node = self.visit(old_value)
  File "/home/serhiy/py/cpython-3.5/Lib/ast.py", line 245, in visit
    return visitor(node)
  File "/home/serhiy/py/cpython-3.5/Lib/inspect.py", line 1978, in visit_Attribute
    return wrap_value(value)
  File "/home/serhiy/py/cpython-3.5/Lib/inspect.py", line 1965, in wrap_value
    raise RuntimeError()
RuntimeError
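
The last pydoc frame in the traceback is the call inspect.signature(object) made for each routine in the module. A minimal check of that same call is sketched below, assuming a CPython build that already contains the fix recorded in this issue (an affected 3.5 pre-release would raise RuntimeError here instead). The variable names are illustrative only.

```python
import codecs
import inspect

# Same call that pydoc's docroutine() makes for each routine it documents.
sig = inspect.signature(codecs.encode)

# Collect the defaults that inspect parsed out of the builtin's text signature.
defaults = {
    name: param.default
    for name, param in sig.parameters.items()
    if param.default is not inspect.Parameter.empty
}

print(sig)
print(defaults)
```

If the call succeeds, pydoc can render the routine; the printed defaults show what inspect recovered for "encoding" and "errors".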
msg248202 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2015-08-07 16:10
This is related to Argument Clinic and Larry's implementation of signature parsing for built-in functions.

This particular bug is caused by 'codecs.encode' & 'codecs.decode' functions with the AC signatures defined as follows:

  _codecs.encode
      obj: object
      encoding: str(c_default="NULL") = sys.getdefaultencoding()
      errors: str(c_default="NULL") = "strict"

"encoding" argument's default is a method call, and _signature_fromstr fails to recognize method calls appropriately.

The attached patch fixes the problem by rendering such default values as "<method_name()>".  I don't think we should evaluate the method call anyway, because it can cause strange side effects and can be just plain wrong, as in this issue: we shouldn't render 'encoding="utf-8"' just because that's how the docs.python.org server is configured.

Anyway, the patch isn't pretty, but it does fix the problem with minimal code.  Another option would be to change the codecs.encode and codecs.decode signatures to "encoding: None" and edit the documentation accordingly.

Assigning this issue to Larry for his review.
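
To illustrate why a call in a default value is a different kind of thing from a literal, here is a small sketch using only the public ast module. It does not reproduce inspect's internal parsing; the Python-level "def" line is a hypothetical spelling of the Argument Clinic signature quoted above.

```python
import ast

# Hypothetical Python spelling of the Argument Clinic signature quoted above.
source = "def encode(obj, encoding=sys.getdefaultencoding(), errors='strict'): pass"
func_def = ast.parse(source).body[0]

# Node types of the two defaults: the first is a call, the second a literal.
default_kinds = [type(node).__name__ for node in func_def.args.defaults]


def is_literal(expr):
    """Return True if expr can be evaluated by ast.literal_eval."""
    try:
        ast.literal_eval(expr)
    except (ValueError, SyntaxError):
        return False
    return True


call_is_literal = is_literal("sys.getdefaultencoding()")
string_is_literal = is_literal("'strict'")
print(default_kinds[0], call_is_literal, string_is_literal)
```

The call default parses to a Call node and is rejected by ast.literal_eval, while 'strict' is accepted.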
msg248257 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2015-08-08 09:25
This is a legitimate problem and I'd definitely like it fixed.
However, the angle brackets and the quote marks are ugly:

        decode(obj, encoding='<sys.getdefaultencoding()>', errors='strict')

Attached is a tweaked version of the patch that sidesteps the quote marks and the angle brackets, by substituting in an object with a custom repr.

Yury, if my change to your patch looks good to you, please go ahead and check it in. That way it won't slow down 3.5.0rc1.  Thanks!
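
The "object with a custom repr" idea can be sketched with the public inspect API. The attached diff is not reproduced here; the class name SymbolicDefault is hypothetical, and the Signature is built by hand rather than parsed from a builtin.

```python
import inspect


class SymbolicDefault:
    """Placeholder default whose repr is the source expression (hypothetical)."""

    def __init__(self, expression):
        self.expression = expression

    def __repr__(self):
        # No quotes or angle brackets: the expression is shown verbatim.
        return self.expression


kind = inspect.Parameter.POSITIONAL_OR_KEYWORD
sig = inspect.Signature([
    inspect.Parameter("obj", kind),
    inspect.Parameter("encoding", kind,
                      default=SymbolicDefault("sys.getdefaultencoding()")),
    inspect.Parameter("errors", kind, default="strict"),
])

rendered = "decode" + str(sig)
print(rendered)
```

Because Signature formats defaults with repr(), the placeholder prints as the bare expression. Note that the default stored on the Parameter is then the placeholder object, not a value the function would accept.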
msg248259 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-08-08 10:36
What about more complex expressions, like "sys.getdefaultencoding() or 'utf-8'"?
msg248277 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2015-08-08 15:48
Larry, Serhiy,

After giving this some thought I think that my initial patch is the wrong approach here -- the inspect module should not be touched to fix this issue.

With this patch applied, the signature object for codecs.encode would have a Parameter with a bogus default value, breaking functions like 'BoundArguments.apply_defaults()' etc.  In other words, whatever AC puts in 'signature.parameters['encoding'].default' must be an object that will be accepted by the function.

codecs.encode, if implemented in Python, would look like:

   def encode(obj, encoding=None, errors='strict'):
       if encoding is None:
           encoding = sys.getdefaultencoding()
       ...

And that's how the signature should be defined for the C version (because that's what is actually happening in C code as well!)
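
A runnable sketch of that point, assuming a pure-Python stand-in for codecs.encode that only handles str input (the real C function accepts more): with None as the declared default, the values filled in by BoundArguments.apply_defaults() can be passed straight back to the function.

```python
import inspect
import sys


def encode(obj, encoding=None, errors='strict'):
    # Stand-in for the C implementation; handles str objects only.
    if encoding is None:
        encoding = sys.getdefaultencoding()
    return obj.encode(encoding, errors)


sig = inspect.signature(encode)
bound = sig.bind("abc")
bound.apply_defaults()  # fills in encoding=None and errors='strict'

# The filled-in defaults are real values the function accepts.
filled = dict(bound.arguments)
result = encode(*bound.args, **bound.kwargs)
print(filled, result)
```

With a placeholder object as the default instead, the last call would hand that placeholder to the function as the encoding.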

The new patch changes the AC specs from

  _codecs.encode
      obj: object
      encoding: str(c_default="NULL") = sys.getdefaultencoding()
      errors: str(c_default="NULL") = "strict"

to

  _codecs.encode
      obj: object
      encoding: str(accept={str, NoneType}) = NULL
      errors: str(c_default="NULL") = "strict"

(docstring is updated too).

This change, by the way, is in accordance with PEP 436:

   The values supplied for these [default] parameters must be compatible with ast.literal_eval.
msg248289 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2015-08-08 18:27
Can we do better?  How about a new field on the Parameter object, "symbolic_default_value", which shows you the expression used to compute the value?  We could then set default_value to the result of the expression, pydoc could print the symbolic expression, and codecs encode/decode could be more straightforward.
msg248290 - (view) Author: Yury Selivanov (Yury.Selivanov) * Date: 2015-08-08 19:10
> Can we do better?  How about a new field on the Parameter object, "symbolic_default_value", which shows you the expression used to compute the value?  We could then set default_value to the result of the expression, pydoc could print the symbolic expression, and codecs encode/decode could be more straightforward.

Maybe it's a good idea, but I'm -1 on pushing new APIs to 3.5 without some careful consideration at this super late stage (and it's not going to be a small change btw).

Let's just fix the codecs module.
msg248292 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2015-08-08 19:53
I just tried every module, simulating pydoc on it.  codecs is the only one that failed.  So, I can live with just changing codecs for now.  But let's do it properly in 3.6.  Go ahead and check in for 3.5--or, if you don't get it done before I want to tag the release, I'll do it.
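
A check of this kind might look roughly like the sketch below. It is not the script used for the test described above; the helper name and the short module list are assumptions, and it relies only on importlib.import_module and pydoc.render_doc.

```python
import importlib
import pydoc


def modules_failing_pydoc(names):
    """Return {module name: exception} for modules pydoc cannot render."""
    failures = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            pydoc.render_doc(module)  # builds the same text "pydoc <name>" shows
        except Exception as exc:
            failures[name] = exc
    return failures


# Short illustrative list; a full sweep would iterate over every stdlib module.
failures = modules_failing_pydoc(["codecs", "json"])
print(failures)
```

On an interpreter that contains the fix, codecs should no longer appear in the result.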

Explain one thing to me, though--how would docs.python.org get "utf-8" for the encoding argument to encode/decode?  The docs are built from .rst files, not from introspection on a live interpreter.
msg248294 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-08-08 19:55
sys.getdefaultencoding() always returns "utf-8" in 3.x (it is left for compatibility with 2.x). I suggest setting the defaults to the literal "utf-8". This matches the documentation and the signatures of str.encode() and bytes.decode().
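
A quick way to confirm this on a Python 3 interpreter (the sample string is arbitrary):

```python
import sys

default_encoding = sys.getdefaultencoding()

# str.encode() and bytes.decode() with no arguments behave like 'utf-8'.
text = "na\u00efve"
same_bytes = text.encode() == text.encode("utf-8")
round_trip = text.encode().decode() == text
print(default_encoding, same_bytes, round_trip)
```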
msg248296 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2015-08-08 20:04
That's a fine change for 3.5.  For 3.6 I want to solve the general problem, at which point we can switch back to calling sys.getdefaultencoding() if we like.

Serhiy, can you make a patch and post it here?  I want to tag 3.5.0rc1 in one or two hours.
msg248297 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-08-08 20:22
The patch is easy.

In the future we should unify the docstrings for codecs.encode() and codecs.decode() with those of str.encode(), bytes.decode() and the like.
msg248298 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2015-08-08 20:23
Please use encoding='utf-8' as definition for codecs.encode() and codecs.decode().

There is no adjustable default encoding in Python 3 anymore.

For Python 3.6 this should probably be fixed everywhere.
msg248300 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2015-08-08 20:58
Please change "Default encoding is" to "The default encoding is".  Apart from that, LGTM, please check in!
msg248315 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-08-09 09:25
New changeset bdd1df816f84 by Serhiy Storchaka in branch '3.5':
Issue #24824: Signatures of codecs.encode() and codecs.decode() now are
https://hg.python.org/cpython/rev/bdd1df816f84

New changeset ad65cad76331 by Serhiy Storchaka in branch 'default':
Issue #24824: Signatures of codecs.encode() and codecs.decode() now are
https://hg.python.org/cpython/rev/ad65cad76331
msg248317 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2015-08-09 10:04
Fixed.  Thanks, Serhiy!
History
Date User Action Args
2015-08-09 10:04:40 larry set status: open -> closed; resolution: fixed; messages: + msg248317
2015-08-09 09:25:58 python-dev set nosy: + python-dev; messages: + msg248315
2015-08-08 20:58:00 larry set messages: + msg248300
2015-08-08 20:23:20 lemburg set messages: + msg248298
2015-08-08 20:22:14 serhiy.storchaka set files: + codecs_default_encoding.patch; messages: + msg248297
2015-08-08 20:04:19 larry set messages: + msg248296
2015-08-08 19:55:49 serhiy.storchaka set messages: + msg248294
2015-08-08 19:53:24 larry set messages: + msg248292
2015-08-08 19:10:47 Yury.Selivanov set nosy: + Yury.Selivanov; messages: + msg248290
2015-08-08 18:27:31 larry set messages: + msg248289
2015-08-08 15:48:18 yselivanov set files: + codecs_ac.diff; nosy: + ncoghlan; messages: + msg248277
2015-08-08 10:36:47 serhiy.storchaka set messages: + msg248259
2015-08-08 09:25:36 larry set files: + larry.fix.pydoc.for.calls.with.functions.1.diff; messages: + msg248257
2015-08-07 16:10:12 yselivanov set files: + inspect.patch; priority: normal -> release blocker; assignee: larry; keywords: + patch; nosy: + larry; messages: + msg248202; stage: patch review
2015-08-07 10:30:32 serhiy.storchaka create