Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong keyword parameter name in regex pattern methods #64482

Closed
serhiy-storchaka opened this issue Jan 16, 2014 · 26 comments
Closed

Wrong keyword parameter name in regex pattern methods #64482

serhiy-storchaka opened this issue Jan 16, 2014 · 26 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@serhiy-storchaka
Copy link
Member

BPO 20283
Nosy @loewis, @birkenfeld, @terryjreedy, @pitrou, @vstinner, @taleinat, @larryhastings, @benjaminp, @ezio-melotti, @serhiy-storchaka
Files
  • sre_pattern_string_keyword.patch: The "hard" patch
  • sre_deprecate_pattern_keyword.patch: The "soft" patch
  • sre_deprecate_pattern_keyword-3.4.patch
  • sre_deprecate_pattern_keyword-3.4_2.patch
  • test_re_keyword_parameters.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2014-03-06.11:32:19.090>
    created_at = <Date 2014-01-16.20:44:27.414>
    labels = ['type-bug', 'library']
    title = 'Wrong keyword parameter name in regex pattern methods'
    updated_at = <Date 2014-03-06.11:32:19.090>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2014-03-06.11:32:19.090>
    actor = 'Arfrever'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2014-03-06.11:32:19.090>
    closer = 'Arfrever'
    components = ['Library (Lib)']
    creation = <Date 2014-01-16.20:44:27.414>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['33598', '33721', '34216', '34280', '34290']
    hgrepos = []
    issue_num = 20283
    keywords = ['patch']
    message_count = 26.0
    messages = ['208311', '208375', '208689', '208743', '209229', '209264', '209289', '209291', '209302', '209303', '209304', '209305', '210676', '210734', '210735', '212140', '212619', '212672', '212674', '212677', '212712', '212752', '212753', '212769', '212800', '212803']
    nosy_count = 13.0
    nosy_names = ['loewis', 'georg.brandl', 'terry.reedy', 'pitrou', 'vstinner', 'taleinat', 'larry', 'benjamin.peterson', 'ezio.melotti', 'mrabarnett', 'Arfrever', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue20283'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4']

    @serhiy-storchaka
    Copy link
    Member Author

    Documented (in docstring and in ReST documentation) signatures of the match, search and (since 3.4) fullmatch methods of regex pattern object are:

    match(string[, pos[, endpos]])
    search(string[, pos[, endpos]])
    fullmatch(string[, pos[, endpos]])

    However in implementation the first keyword argument by mistake named "pattern". This looks as nonsense. The pattern is object itself, and first argument is a string. First arguments in other methods (split, findall, etc) named "string", and module-level functions have both "pattern" and "string" parameters:

    match(pattern, string, flags=0)
    search(pattern, string, flags=0)

    I think we should fix this mistake. The "pattern" name is obviously wrong and is not match the documentation.

    @serhiy-storchaka serhiy-storchaka added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jan 16, 2014
    @terryjreedy
    Copy link
    Member

    How nasty. I agree that this is a code bug. Unfortunately in this case, the C code does keyword matching of arguments and 'corrects' the doc for anyone who tries 'string='.

    >>> pat.search(string='xabc', pos=1)
    Traceback (most recent call last):
      File "<pyshell#6>", line 1, in <module>
        pat.search(string='xabc', pos=1)
    TypeError: Required argument 'pattern' (pos 1) not found
    >>> pat.search(pattern='xabc', pos=1)
    <_sre.SRE_Match object; span=(1, 4), match='abc'>

    I think we should only change this in 3.4 (and should do so in 3.4).

    @serhiy-storchaka
    Copy link
    Member Author

    Actually, several other methods also have wrong parameter name, "source" instead of "string".

    @terryjreedy
    Copy link
    Member

    If no one else pipes up here, perhaps ask on pydef about changing C names to match documented names.

    @serhiy-storchaka
    Copy link
    Member Author

    Here is patch for 3.3 which adds alternative parameter name. Now both keyword names are allowed, but deprecation warning is emitted if old keyword name is used.

    >>> import re
    >>> p = re.compile('')
    >>> p.match()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: Required argument 'string' (pos 1) not found
    >>> p.match('')
    <_sre.SRE_Match object at 0xb705c598>
    >>> p.match(string='')
    <_sre.SRE_Match object at 0xb705c720>
    >>> p.match(pattern='')
    __main__:1: DeprecationWarning: The 'pattern' keyword parameter name is deprecated.  Use 'string' instead.
    <_sre.SRE_Match object at 0xb705c758>
    >>> p.match('', string='')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: Argument given by name ('string') and position (1)
    >>> p.match('', pattern='')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: Argument given by name ('pattern') and position (1)

    @terryjreedy
    Copy link
    Member

    Great. Old and new both in at least one release, when possible, is best. I should have thought of asking if that would be possible. In this case, I think the (undocumented) old should disappear in 3.5.

    Since the mistaken 'pattern' name is not documented now, I would not add anything to the doc.

    I would augment the the warning
    "The 'pattern' keyword parameter name is deprecated."
    to briefly explain the deprecation and its timing by saying
    "The erroneous and undocumented 'pattern' keyword parameter name is deprecated and will be removed in version 3.5."

    The patch did not upload correctly. I just see "Modules/_sre.c | 64 �[32m+++++++++++++++++++++++++++++++++++++++�[0;39m�[36m!!!!!!!!!!!!!!!!!!�[0;39m
    1 file changed, 44 insertions(+), 20 modifications(!)" when I open it in a new Firefox tab.

    @serhiy-storchaka
    Copy link
    Member Author

    The patch did not upload correctly.

    Oh, sorry. Here is correct patch.

    I propose to apply "soft" patch (which preserves support for old keyword parameter name) to 2.7 and 3.3, and apply "hard" patch (which just renames keyword parameter name) to 3.4.

    Or we can just apply "hard" patch (it's much simpler) to all versions.

    @birkenfeld
    Copy link
    Member

    For 3.3 I prefer the "soft" patch.

    @larryhastings
    Copy link
    Contributor

    Georg: you're accepting this patch into 3.3? I'm surprised.

    I would only want the "soft" approach. But I haven't said "yes" yet. I want to discuss it a little more. (Hey, it's python core dev. Discussing things endlessly is our job.)

    @serhiy-storchaka
    Copy link
    Member Author

    If you want the "soft" approach, then you should revert your changes to
    _sre.SRE_Pattern.match.

    @larryhastings
    Copy link
    Contributor

    You can do it, if I accept the patch for 3.4. There's no point in doing it in two stages.

    @larryhastings
    Copy link
    Contributor

    Alternatively, we could use this cheap hack:

    /*[python input]
    class hidden_object_converter(object_converter):
    show_in_signature = False

    [python start generated code]*/

    /*[clinic input]
    module _sre
    class _sre.SRE_Pattern "PatternObject *" "&Pattern_Type"

    _sre.SRE_Pattern.match as pattern_match

    string: object
    pos: Py_ssize_t = 0
    endpos: Py_ssize_t(c_default="PY_SSIZE_T_MAX") = sys.maxsize
    pattern: hidden_object = None
    

    ...

    @serhiy-storchaka
    Copy link
    Member Author

    Larry, so what is your decision?

    1. Apply the "hard" patch and then convert Modules/_sre.c to use Argument Clinic (bpo-20148).

    2. Revert converted match() method, apply the "soft" patch, and delay applying of the "hard" patch and then converting to use Argument Clinic to 3.5. Applying the "soft" patch and then the "hard" patch will cause more source churn than just applying the "hard" patch.

    3. Use show_in_signature hack. I don't like this, it looks ugly and adds too much source churn.

    @larryhastings
    Copy link
    Contributor

    Use #3.

    @larryhastings
    Copy link
    Contributor

    "pattern" should be keyword-only, and if used the function should generate a DeprecationWarning.

    @serhiy-storchaka
    Copy link
    Member Author

    Here is a patch with the show_in_signature hack for 3.4.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Mar 3, 2014

    The patch sre_deprecate_pattern_keyword-3.4.patch looks good to me. I *think* that Larry has pre-approved it for 3.4.

    If it is applied, and if people still think that 2.7 and 3.3 need to be changed, the release-critical status should be removed from the issue.

    @serhiy-storchaka
    Copy link
    Member Author

    The disadvantage of sre_deprecate_pattern_keyword-3.4.patch is that it creates
    false signature for SRE_Pattern.match(). Default value of first argument is
    exposed as None, but actually this parameter is mandatory and None is not
    valid value for it. I afraid the only way to get rid of false signature (and
    keep backward compatibility) is to revert converting to Argument Clinic. And
    here is a patch which do this.

    @larryhastings
    Copy link
    Contributor

    Why can't you remove the "= NULL" from the Clinic input for "string"?

    @serhiy-storchaka
    Copy link
    Member Author

    Why can't you remove the "= NULL" from the Clinic input for "string"?

    Because this will prohibit the use of "pattern" as keyword argument.

    @vstinner
    Copy link
    Member

    vstinner commented Mar 4, 2014

    We are close to Python 3.4 final, so what is the status of this issue? I don't see any commit and nothing to cherry-pick in Larry's 3.4.0 repository.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Mar 4, 2014

    Since there is no consensus on how to resolve this issue, I'm dropping the release-critical status for it; people should now consider whether a future agreed-upon solution could apply to 3.4.1 or just to 3.5.

    @loewis loewis mannequin removed the release-blocker label Mar 4, 2014
    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Mar 4, 2014

    Serhiy: the patch is incomplete; it lacks test cases.

    @serhiy-storchaka
    Copy link
    Member Author

    Here is a test.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Mar 6, 2014

    New changeset 52743dc788e6 by Serhiy Storchaka in branch '3.3':
    Issue bpo-20283: RE pattern methods now accept the string keyword parameters
    http://hg.python.org/cpython/rev/52743dc788e6

    New changeset f4d7abcf8080 by Serhiy Storchaka in branch 'default':
    Issue bpo-20283: RE pattern methods now accept the string keyword parameters
    http://hg.python.org/cpython/rev/f4d7abcf8080

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Mar 6, 2014

    New changeset 52256a5861fa by Serhiy Storchaka in branch '2.7':
    Issue bpo-20283: RE pattern methods now accept the string keyword parameters
    http://hg.python.org/cpython/rev/52256a5861fa

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants