
classification
Title: RegEx for numbers in documentation (easy fix - solution provided)
Type: behavior Stage: resolved
Components: Regular Expressions Versions: Python 3.9, Python 3.8, Python 3.7, Python 3.6, Python 3.5
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: TheBrandonGuy, ezio.melotti, mark.dickinson, mrabarnett, serhiy.storchaka
Priority: normal Keywords:

Created on 2020-04-19 20:01 by TheBrandonGuy, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (3)
msg366801 - Author: Brandon (TheBrandonGuy) Date: 2020-04-19 20:01
The regular expression used for matching numbers in the documentation for the regular expressions module (the tokenizer section) doesn't match the string ".5", but does match the string "3.".

Here's a link to the tokenizer section of the documentation: https://docs.python.org/3/library/re.html#writing-a-tokenizer

The tokenizer example uses r'\d+(\.\d*)?' for matching numbers. I would personally match ".5" as a number before I would match "3." as a number. In order to do this, I would use r'(\d*\.)?\d+' instead of r'\d+(\.\d*)?'. Python 3's interpreter matches both "3." and ".5" as numbers when interpreting code, so you could use a different regex example for matching both if you wanted to be consistent with Python's own interpreter.
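
The behaviour described above can be checked with a short sketch. The first two patterns are quoted from this report; `BOTH_PATTERN` and the helper `accepts` are hypothetical names introduced here for illustration and do not come from the documentation.

```python
import re

# Pattern used in the tokenizer example of the re documentation.
DOC_PATTERN = r'\d+(\.\d*)?'
# Pattern proposed in this report.
PROPOSED_PATTERN = r'(\d*\.)?\d+'
# Assumption: one possible pattern that accepts both "3." and ".5".
BOTH_PATTERN = r'\d+\.?\d*|\.\d+'


def accepts(pattern, text):
    """Return True if the pattern matches the whole string."""
    return re.fullmatch(pattern, text) is not None


for text in ("3", "3.", ".5", "3.5"):
    print(
        f"{text!r}: doc={accepts(DOC_PATTERN, text)}, "
        f"proposed={accepts(PROPOSED_PATTERN, text)}, "
        f"both={accepts(BOTH_PATTERN, text)}"
    )
```

`re.fullmatch` is used so that a partial match (for example, only the "5" in ".5") does not count as acceptance.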
msg366814 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2020-04-20 08:49
As you say, isn't this just a personal preference?

Is there an objective reason to prefer something that accepts ".5" and rejects "3." over something that rejects ".5" and accepts "3."?

The exact form of numbers accepted seems to me to be irrelevant to the point of the example; I'm not sure I see much value in changing it.
msg366816 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-20 09:59
I concur with Mark. It is just an example. A regular expression that supported all possible forms of numbers (including exponents, underscores, and non-decimal numbers) would be more complex and would distract from the main goal of the example.
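
To give a rough sense of that added complexity, here is a sketch of a pattern for decimal floating-point literals only, modelled on the float grammar in the Python language reference. It is an approximation for illustration, not the tokenizer's implementation; all names (`DIGITPART`, `POINTFLOAT`, `EXPONENT`, `FLOAT`, `is_float_literal`) are hypothetical, and integers, non-decimal literals, and imaginary literals are deliberately left out.

```python
import re

# One or more ASCII digits, with single underscores allowed between digits.
DIGITPART = r'[0-9](?:_?[0-9])*'
# Either "[digits].digits" or "digits." (covers ".5", "3.5", and "3.").
POINTFLOAT = rf'(?:(?:{DIGITPART})?\.{DIGITPART}|{DIGITPART}\.)'
# Exponent such as "e10", "E-3", or "e+1_0".
EXPONENT = rf'[eE][+-]?{DIGITPART}'
# A float is a point float with an optional exponent,
# or plain digits followed by a mandatory exponent.
FLOAT = rf'(?:{POINTFLOAT}(?:{EXPONENT})?|{DIGITPART}{EXPONENT})'


def is_float_literal(text):
    """Return True if the whole string matches the sketched float pattern."""
    return re.fullmatch(FLOAT, text) is not None


for text in ("3.", ".5", "1_000.5e-3", "1e10", "3", "1__0.5"):
    print(f"{text!r}: {is_float_literal(text)}")
```

Even with this restricted scope the pattern is several times longer than the one in the tokenizer example.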
History
Date                 User              Action  Args
2022-04-11 14:59:29  admin             set     github: 84512
2020-04-20 09:59:36  serhiy.storchaka  set     status: open -> closed; nosy: + serhiy.storchaka; messages: + msg366816; resolution: not a bug; stage: resolved
2020-04-20 08:49:52  mark.dickinson    set     nosy: + mark.dickinson; messages: + msg366814
2020-04-19 20:01:23  TheBrandonGuy     create