This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: token type constants are not documented
Type: Stage:
Components: Documentation Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: docs@python, eric.araujo, georg.brandl, isandler
Priority: normal Keywords: patch

Created on 2010-06-11 00:27 by isandler, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name            Uploaded                      Description
token.rst.patch      isandler, 2010-06-12 05:31    documentation patch
token.rst.patch.v2   isandler, 2010-06-13 01:22
Messages (6)
msg107509 Author: Ilya Sandler (isandler) Date: 2010-06-11 00:27
The token module defines constants for token types, e.g.:

 NAME = 1
 NUMBER = 2
 STRING = 3
 NEWLINE = 4

etc.

These constants are very useful for any code that needs to tokenize Python source, yet they are not listed in the documentation.
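As an illustration of the kind of code meant here, the following is a minimal sketch, assuming a current Python 3 interpreter (the issue itself was filed against 2.6 through 3.2). The `source` string and the variable names are made up for the example; only `token.NAME`, `token.NUMBER` and `tokenize.generate_tokens` come from the standard library.

```python
import io
import token
import tokenize

# Hypothetical one-line program used only for this example.
source = "total = price * 2\n"

# generate_tokens() takes a readline callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Compare each token's type against the constants from the token module.
names = [tok.string for tok in tokens if tok.type == token.NAME]
numbers = [tok.string for tok in tokens if tok.type == token.NUMBER]

print(names)    # ['total', 'price']
print(numbers)  # ['2']
```

Without the named constants, the comparisons would have to use the bare integers 1 and 2, which is why having them documented matters.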


Is this a documentation bug?
msg107527 Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-06-11 04:59
It is. Constants should be documented in the regular doc, not only by introspection or reading the source. See the doc for tokenize: “All constants from the token module are also exported from tokenize, as are two additional token type values”.
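The re-export mentioned in the tokenize documentation can be checked directly. This is only a sketch, assuming Python 3; the handful of constant names tested is an arbitrary sample, not the full list.

```python
import token
import tokenize

# Arbitrary sample of constants defined in the token module.
sample = ("NAME", "NUMBER", "STRING", "NEWLINE", "OP")

# Each name should be reachable from tokenize with the same value.
shared = {
    name: getattr(tokenize, name) == getattr(token, name)
    for name in sample
}

print(shared)
```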

Thanks for the report. Would you mind writing a patch?
msg107629 Author: Ilya Sandler (isandler) Date: 2010-06-12 05:31
I'm attaching a documentation patch.

Do note that there is also a code-level inconsistency: a few tokens (COMMENT, NL) are defined in the tokenize module, which is strange and inconvenient.

Should that be fixed too (by moving those token definitions into the token module)?
msg107632 Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-06-12 06:38
Three comments:

* I would list them all in one directive, like this:

.. data:: FOO
          BAR
          BAZ

  which makes the display more compact.

* There is no description in that directive.  Best move part of the
  description above them in it.

* NL and COMMENT are defined in tokenize because they are neither 
  defined nor used by the Python tokenizer.
msg107719 Author: Ilya Sandler (isandler) Date: 2010-06-13 01:22
> * I would list them all in one directive, like this:

Ok, done. Attaching the updated patch.


> * There is no description in that directive.  Best move part of the
>   description above them in it.

I am not sure I understand your request. Could you clarify?


> * NL and COMMENT are defined in tokenize because they are neither
>   defined nor used by the Python tokenizer.

Oh, I just realized the source of my misunderstanding: token.py captures token types as they are returned by Python's own tokenizer, while the tokenize module does its own tokenization and produces slightly different results. The COMMENT and NL tokens are one difference; another is how operator tokens are treated. In fact, tokenize's docstring explicitly says so:

... It is designed to match the working of the Python tokenizer exactly, except that it produces COMMENT tokens for comments and gives type OP for all operators ...

I think this clarification should go into official docs as well, would you agree?
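The two differences described in the docstring can be observed with a short sketch, assuming Python 3.3 or later (the `exact_type` attribute used below was added after this thread and is not part of the discussion). The `source` string is a made-up example.

```python
import io
import tokenize

# Hypothetical input: a trailing comment, a blank line, and an expression.
source = "x = 1  # set x\n\ny = x + 2\n"

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Names of the token types that tokenize reports.
kinds = [tokenize.tok_name[tok.type] for tok in tokens]

# Every operator has the generic type OP; exact_type recovers the
# specific operator constant (EQUAL, PLUS, ...).
ops = [
    (tok.string, tokenize.tok_name[tok.exact_type])
    for tok in tokens
    if tok.type == tokenize.OP
]

print("COMMENT" in kinds, "NL" in kinds)
print(ops)
```

The comment yields a COMMENT token and the blank line yields an NL token, while `=` and `+` all arrive with type OP.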

PS. Even with this clarification, I find the situation quite confusing, especially given that the tok_name dict from the token module does have a name for COMMENT and that name is inserted by the tokenize module. So I still feel that just adding COMMENT and NL to the token module (with a clarification that they are not generated by Python's own tokenizer) would be a bit cleaner.
msg118921 Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-10-17 09:46
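The tok_name behaviour mentioned in the PS can be seen with a small sketch, assuming Python 3. Where COMMENT and NL are defined has changed since this thread, so the last line is printed without an expected value.

```python
import token
import tokenize

# tok_name maps numeric token types back to their names.
comment_name = token.tok_name[tokenize.COMMENT]
nl_name = token.tok_name[tokenize.NL]

# tokenize re-exports the very same dict object from token.
same_dict = token.tok_name is tokenize.tok_name

print(comment_name, nl_name, same_dict)

# Whether the token module itself defines COMMENT depends on the
# Python version; this is reported rather than assumed.
print(hasattr(token, "COMMENT"))
```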
Committed in r85614.
History
Date                 User          Action  Args
2022-04-11 14:57:02  admin         set     github: 53214
2010-10-17 09:46:23  georg.brandl  set     status: open -> closed; resolution: fixed; messages: + msg118921
2010-06-13 01:22:43  isandler      set     files: + token.rst.patch.v2; messages: + msg107719
2010-06-12 06:38:47  georg.brandl  set     nosy: + georg.brandl; messages: + msg107632
2010-06-12 05:31:02  isandler      set     files: + token.rst.patch; keywords: + patch; messages: + msg107629
2010-06-11 04:59:45  eric.araujo   set     nosy: + eric.araujo; messages: + msg107527; versions: + Python 2.6, Python 3.1, Python 2.7, Python 3.2, - Python 3.3
2010-06-11 00:27:42  isandler      create