Message 221158 - Python tracker


Author	taleinat
Recipients	ezio.melotti, lemburg, loewis, taleinat, terry.reedy
Date	2014-06-21.07:15:48
Message-id	<1403334949.22.0.404692853804.issue21765@psf.upfronthosting.co.za>

Content

Alright, so I'm going to use the equivalent of the following code, unless someone can tell me that something is wrong:


import string
from keyword import iskeyword
from unicodedata import category, normalize

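# Note: unicodedata.category() only returns two-letter general categories,
# so the "Other_ID_Start" and "Other_ID_Continue" entries never match.
_ID_FIRST_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl",
                        "Other_ID_Start"}
_ID_CATEGORIES = _ID_FIRST_CATEGORIES | {"Mn", "Mc", "Nd", "Pc",
                                         "Other_ID_Continue"}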

_ASCII_ID_CHARS = set(string.ascii_letters + string.digits + "_")
_ID_KEYWORDS = {"True", "False", "None"}

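def is_id_char(char):
    # normalize() requires a form argument; NFKC is the form Python
    # itself applies to identifiers.
    return char in _ASCII_ID_CHARS or (
        ord(char) >= 128 and
        category(normalize("NFKC", char)[0]) in _ID_CATEGORIES
    )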

def is_identifier(id_candidate):
    return id_candidate.isidentifier() and (
        (not iskeyword(id_candidate)) or
        id_candidate in _ID_KEYWORDS
    )

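def _eat_identifier(str, limit, pos):
    # Scan backwards from pos (never past limit) over identifier characters.
    i = pos
    while i > limit and is_id_char(str[i - 1]):
        i -= 1
    # Reject keywords and other non-identifiers such as "12".
    if i < pos and not is_identifier(str[i:pos]):
        return 0
    return pos - i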
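
For anyone who wants to try this out, here is a self-contained sketch of how the functions above might be exercised. It rests on a few assumptions that are not in the original message: normalize() is called with the "NFKC" form, the backward scan reads the character just before index i, and the names eat_identifier and text are hypothetical stand-ins for _eat_identifier and its str parameter.

```python
import string
from keyword import iskeyword
from unicodedata import category, normalize

# General categories allowed inside an identifier (assumed simplification:
# the Other_ID_* property names are left out because category() never
# returns them).
_ID_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl", "Mn", "Mc", "Nd", "Pc"}
_ASCII_ID_CHARS = set(string.ascii_letters + string.digits + "_")
_ID_KEYWORDS = {"True", "False", "None"}


def is_id_char(char):
    # ASCII fast path first, then a Unicode category lookup (NFKC assumed).
    return char in _ASCII_ID_CHARS or (
        ord(char) >= 128 and
        category(normalize("NFKC", char)[0]) in _ID_CATEGORIES
    )


def is_identifier(id_candidate):
    # Keywords are rejected, except the three that behave like names.
    return id_candidate.isidentifier() and (
        not iskeyword(id_candidate) or id_candidate in _ID_KEYWORDS
    )


def eat_identifier(text, limit, pos):
    # Return the length of the identifier ending at pos, or 0 if none.
    i = pos
    while i > limit and is_id_char(text[i - 1]):
        i -= 1
    if i < pos and not is_identifier(text[i:pos]):
        return 0
    return pos - i


for sample in ["x = foo.bar", "a = n\u00e9", "x if", "y = True", "a + 12"]:
    print(repr(sample), eat_identifier(sample, 0, len(sample)))
```

With these assumptions, a trailing name such as "bar" or "True" yields its length, while a trailing keyword such as "if" or a number such as "12" yields 0.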

History
Date	User	Action	Args
2014-06-21 07:15:49	taleinat	set	recipients: + taleinat, lemburg, loewis, terry.reedy, ezio.melotti
2014-06-21 07:15:49	taleinat	set	messageid: <1403334949.22.0.404692853804.issue21765@psf.upfronthosting.co.za>
2014-06-21 07:15:49	taleinat	link	issue21765 messages
2014-06-21 07:15:48	taleinat	create