classification
Title: 3.10 objects.inv classifies many types as data
Type: Stage: patch review
Components: Documentation Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: bskinn, docs@python, dom1310df, gaborjbernat, kj, lukasz.langa, miss-islington
Priority: normal Keywords: patch

Created on 2021-10-06 12:20 by gaborjbernat, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 28757 merged gaborjbernat, 2021-10-06 12:38
PR 30004 merged miss-islington, 2021-12-09 12:56
Messages (11)
msg403300 - (view) Author: gaborjbernat (gaborjbernat) * Date: 2021-10-06 12:20
It's a class though:

❯ sphobjinv suggest ./objects.inv UnionType
:py:data:`types.UnionType`

defined as:

UnionType = type(int | str)
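
A quick runtime check, as a sketch: a name bound by plain assignment to the result of `type(...)` is still a class. The helper `is_class` is hypothetical and only for illustration. `types.FunctionType` is used as the always-available example, and `types.UnionType` is checked only where it exists (Python 3.10+).

```python
import types


def is_class(module, name):
    """Return True if the attribute `name` of `module` is a class at runtime."""
    return isinstance(getattr(module, name), type)


# FunctionType is bound by assignment in types.py, yet it is a class.
print(is_class(types, "FunctionType"))

# UnionType only exists on Python 3.10 and later.
if hasattr(types, "UnionType"):
    print(is_class(types, "UnionType"))
```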
msg403301 - (view) Author: gaborjbernat (gaborjbernat) * Date: 2021-10-06 12:45
The issue with the current state is that intersphinx fails to find types.UnionType in objects.inv because it lives under the incorrect namespace (data vs class).
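
To illustrate why the role matters, here is a deliberately simplified model of a role-keyed lookup. It is not Sphinx's actual resolution code; the `inventory` dict and the `resolve` helper are assumptions made for this sketch, and the entries are illustrative.

```python
# Simplified stand-in for an inventory: role -> set of registered names.
# This only models the idea of role-keyed lookup, not Sphinx internals.
inventory = {
    "py:data": {"types.UnionType"},
    "py:class": {"types.SimpleNamespace"},
}


def resolve(inventory, role, name):
    """Return True if `name` is registered under exactly this role."""
    return name in inventory.get(role, set())


# A class-role reference misses the entry because it is filed under data.
print(resolve(inventory, "py:class", "types.UnionType"))
print(resolve(inventory, "py:data", "types.UnionType"))
```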
msg403308 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-10-06 15:03
On the PR you mention there are more types with this problem. Can we get a full list?
msg403310 - (view) Author: gaborjbernat (gaborjbernat) * Date: 2021-10-06 15:08
Not easily, but, e.g. the EllipsisType is one. I would have to create some script which I haven't done yet. 

The best would be to create a sphinx plugin that collects entries registered in the doc and displays the discrepancy against the intersphinx object. This way, we could defend against future such issues too. I can give it a go in a few days if no one else wants to do so before that.
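
As a rough sketch of the comparison such a plugin would need (without any Sphinx integration), one could import each documented name and flag those recorded as 'data' that are classes at runtime. The function `find_role_mismatches` and the `sample` mapping are hypothetical; a real plugin would take its entries from the Sphinx build instead.

```python
import importlib


def find_role_mismatches(entries):
    """Return the names documented as 'data' that are classes at runtime.

    `entries` maps a dotted name to its documented role.
    """
    mismatches = []
    for dotted, role in entries.items():
        module_name, _, attr = dotted.rpartition(".")
        try:
            obj = getattr(importlib.import_module(module_name), attr)
        except (ImportError, AttributeError, ValueError):
            continue  # not importable in this interpreter; skip it
        if role == "data" and isinstance(obj, type):
            mismatches.append(dotted)
    return mismatches


# Hand-written sample; not taken from a real inventory.
sample = {
    "types.FunctionType": "data",
    "sys.maxsize": "data",
    "collections.OrderedDict": "class",
}
print(find_role_mismatches(sample))
```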
msg403311 - (view) Author: Brian Skinn (bskinn) * Date: 2021-10-06 15:13
If I understand the problem correctly, these mis-attributions of roles (to 'data' instead of 'class') come about when a thing that is technically a class is defined in source using simple assignment, as with UnionType.

Problematic entries will thus have 'data' as role, and their identifiers will be camel-cased.

So, as a quick search to identify likely candidates:


>>> import re, sphobjinv as soi
>>> from pprint import pprint
>>> inv = soi.Inventory(url="https://docs.python.org/3.10/objects.inv")

# Find entries where the first character after the final period
# is uppercase, and the second character after the final period
# is lowercase.
>>> pat = re.compile(r"([.][A-Z][a-z])[^.]*$")

>>> pprint([obj.name for obj in inv.objects if obj.role == "data" and pat.search(obj.name)])

['_thread.LockType',
 'ast.PyCF_ALLOW_TOP_LEVEL_AWAIT',
 'ast.PyCF_ONLY_AST',
 'ast.PyCF_TYPE_COMMENTS',
 'importlib.resources.Package',
 'importlib.resources.Resource',
 'socket.SocketType',
 'types.AsyncGeneratorType',
 'types.BuiltinFunctionType',
 'types.BuiltinMethodType',
 'types.CellType',
 'types.ClassMethodDescriptorType',
 'types.CoroutineType',
 'types.EllipsisType',
 'types.FrameType',
 'types.FunctionType',
 'types.GeneratorType',
 'types.GetSetDescriptorType',
 'types.LambdaType',
 'types.MemberDescriptorType',
 'types.MethodDescriptorType',
 'types.MethodType',
 'types.MethodWrapperType',
 'types.NoneType',
 'types.NotImplementedType',
 'types.UnionType',
 'types.WrapperDescriptorType',
 'typing.Annotated',
 'typing.Any',
 'typing.AnyStr',
 'typing.Callable',
 'typing.ClassVar',
 'typing.Concatenate',
 'typing.Final',
 'typing.Literal',
 'typing.NoReturn',
 'typing.Optional',
 'typing.ParamSpecArgs',
 'typing.ParamSpecKwargs',
 'typing.Tuple',
 'typing.TypeAlias',
 'typing.TypeGuard',
 'typing.Union',
 'weakref.CallableProxyType',
 'weakref.ProxyType',
 'weakref.ProxyTypes',
 'weakref.ReferenceType']


I would guess those 'ast.PyCF...' objects can be ignored, they appear to be constants?
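
For anyone without sphobjinv installed, the behaviour of the pattern can be checked offline against a few hand-picked names. This is only a sketch; the `names` list is a sample chosen for illustration, not pulled from the inventory.

```python
import re

# Same pattern as above: after the final period, one uppercase letter
# followed by one lowercase letter.
pat = re.compile(r"([.][A-Z][a-z])[^.]*$")

names = [
    "types.UnionType",    # camel-cased: matches
    "ast.PyCF_ONLY_AST",  # 'P' then 'y': matches, though it is a constant
    "sys.maxsize",        # starts lowercase: no match
    "errno.EPROTOTYPE",   # two uppercase letters: no match
]

matches = [name for name in names if pat.search(name)]
print(matches)
```

This also shows why the 'ast.PyCF...' constants end up in the list: the pattern only looks at the first two characters of the final component.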
msg403313 - (view) Author: Brian Skinn (bskinn) * Date: 2021-10-06 15:17
Searching for identifiers starting with two uppercase letters returns a HUGE list.

>>> pat2 = re.compile(r"([.][A-Z][A-Z])[^.]*$")

Filtering down to only those whose lowercased name contains "type":

>>> pprint([obj.name for obj in inv.objects if obj.role == "data" and pat2.search(obj.name) and "type" in obj.name.lower()])

['errno.EPROTOTYPE',
 'locale.LC_CTYPE',
 'sqlite3.PARSE_DECLTYPES',
 'ssl.CHANNEL_BINDING_TYPES',
 'token.TYPE_COMMENT',
 'token.TYPE_IGNORE',
 'typing.TYPE_CHECKING',
 'xml.parsers.expat.XMLParserType']

Of these, only 'xml.parsers.expat.XMLParserType' seems to me a likely problem entry.
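
A possible further narrowing, sketched offline on sample names: drop entries whose final component is entirely uppercase, since those are most likely constants. This extra `isupper()` step is an assumption of this sketch and was not part of the search above.

```python
import re

pat2 = re.compile(r"([.][A-Z][A-Z])[^.]*$")

# Hand-picked sample names, not pulled from the inventory.
names = [
    "xml.parsers.expat.XMLParserType",
    "errno.EPROTOTYPE",
    "errno.ENOENT",
    "types.UnionType",
]

# Same filter as above: two leading capitals and "type" in the name.
hits = [n for n in names if pat2.search(n) and "type" in n.lower()]
print(hits)

# Extra heuristic: keep only names whose last component is mixed-case.
likely_classes = [n for n in hits if not n.rsplit(".", 1)[1].isupper()]
print(likely_classes)
```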
msg403318 - (view) Author: gaborjbernat (gaborjbernat) * Date: 2021-10-06 15:24
I think Brian Skinn's script is a rough approximation, but definitely not entirely accurate. You'd have to match up what sphinx thinks per doc vs what you import for an accurate view.
msg403343 - (view) Author: gaborjbernat (gaborjbernat) * Date: 2021-10-07 00:54
Here's a gist where I managed to detect roughly 140 errors (some look like potential false positives, so the real number is likely closer to 100):

https://gist.github.com/gaborbernat/5360badab2125b3f81a3bcbce0e94c2a#file-found_issues-output-L1

This does make a few concessions:
- ignores the difference between function and method; way too many functions are documented as methods and vice-versa to disallow this (or it would be a major overhaul)
- https://docs.python.org/3/c-api/structures.html?highlight=meth_class#METH_VARARGS is documented under python domain but IMHO should be C
- https://docs.python.org/3/c-api/typeobj.html?highlight=py_tpflags_base_exc_subclass#c.PyTypeObject.tp_flags is documented under python domain but IMHO should be C
- does not clarify where type classes go - they seem to be a weird in-between of a method and a class, satisfying neither - see the related discussion in https://bugs.python.org/issue41973
msg408115 - (view) Author: Ken Jin (kj) * (Python committer) Date: 2021-12-09 12:56
New changeset e2cfc89e099b8fad5d8d5bd7f59dadffb6078778 by Bernát Gábor in branch 'main':
bpo-45391: mark UnionType as a class in documentation (GH-28757)
https://github.com/python/cpython/commit/e2cfc89e099b8fad5d8d5bd7f59dadffb6078778
msg408119 - (view) Author: miss-islington (miss-islington) Date: 2021-12-09 13:17
New changeset 2c2ee83c4db4dbd54017dc364bbefc70fa75ea5d by Miss Islington (bot) in branch '3.10':
bpo-45391: mark UnionType as a class in documentation (GH-28757)
https://github.com/python/cpython/commit/2c2ee83c4db4dbd54017dc364bbefc70fa75ea5d
msg408120 - (view) Author: Ken Jin (kj) * (Python committer) Date: 2021-12-09 13:22
As a start, I merged the types.UnionType fix because it's straightforward, but the rest are a little dubious so I'll leave this issue open for now.
History
Date User Action Args
2022-04-11 14:59:50 admin set github: 89554
2021-12-09 13:22:02 kj set messages: + msg408120
    title: 3.10 objects.inv classifies UnionType as data -> 3.10 objects.inv classifies many types as data
2021-12-09 13:17:44 miss-islington set messages: + msg408119
2021-12-09 12:56:40 miss-islington set nosy: + miss-islington
    pull_requests: + pull_request28228
2021-12-09 12:56:24 kj set nosy: + kj
    messages: + msg408115
2021-10-23 06:23:29 dom1310df set nosy: + dom1310df
2021-10-07 00:54:33 gaborjbernat set messages: + msg403343
2021-10-06 15:24:54 gaborjbernat set messages: + msg403318
2021-10-06 15:17:42 bskinn set messages: + msg403313
2021-10-06 15:13:37 bskinn set nosy: + bskinn
    messages: + msg403311
2021-10-06 15:08:42 gaborjbernat set messages: + msg403310
2021-10-06 15:03:17 lukasz.langa set nosy: + lukasz.langa
    messages: + msg403308
2021-10-06 12:45:43 gaborjbernat set messages: + msg403301
2021-10-06 12:38:19 gaborjbernat set keywords: + patch
    stage: patch review
    pull_requests: + pull_request27100
2021-10-06 12:20:50 gaborjbernat create