classification
Title: Wrong keyword parameter name in regex pattern methods
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.4, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: Arfrever, benjamin.peterson, ezio.melotti, georg.brandl, haypo, larry, loewis, mrabarnett, pitrou, python-dev, serhiy.storchaka, taleinat, terry.reedy
Priority: normal Keywords: patch

Created on 2014-01-16 20:44 by serhiy.storchaka, last changed 2014-03-06 11:32 by Arfrever. This issue is now closed.

Files
File name Uploaded Description
sre_pattern_string_keyword.patch serhiy.storchaka, 2014-01-21 19:23 The "hard" patch
sre_deprecate_pattern_keyword.patch serhiy.storchaka, 2014-01-26 08:19 The "soft" patch
sre_deprecate_pattern_keyword-3.4.patch serhiy.storchaka, 2014-02-24 20:52
sre_deprecate_pattern_keyword-3.4_2.patch serhiy.storchaka, 2014-03-03 21:20
test_re_keyword_parameters.patch serhiy.storchaka, 2014-03-05 20:45
Messages (26)
msg208311 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-16 20:44
The documented signatures (in the docstrings and in the ReST documentation) of the match, search and (since 3.4) fullmatch methods of regex pattern objects are:

match(string[, pos[, endpos]])
search(string[, pos[, endpos]])
fullmatch(string[, pos[, endpos]])

However, in the implementation the first keyword argument is mistakenly named "pattern". This makes no sense: the pattern is the object itself, and the first argument is a string. The first arguments of the other methods (split, findall, etc.) are named "string", and the module-level functions have both "pattern" and "string" parameters:

match(pattern, string, flags=0)
search(pattern, string, flags=0)

I think we should fix this mistake. The "pattern" name is obviously wrong and does not match the documentation.
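To make the mismatch concrete, here is a minimal sketch of how the documented names are meant to be used. It assumes an interpreter that already accepts the documented "string" keyword on pattern methods; the regex 'abc' and the text 'xabc' are illustrative choices only.

```python
import re

pat = re.compile('abc')

# Module-level function: "pattern" and "string" are two separate parameters.
m1 = re.search(pattern='abc', string='xabc')

# Pattern method: the object itself is the pattern, so the only text
# argument is the documented "string" keyword.
m2 = pat.search(string='xabc', pos=1)

print(m1.span())  # (1, 4)
print(m2.span())  # (1, 4)
```

On an interpreter with the bug described above, the second call would instead fail because the C code only recognizes the keyword "pattern".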
msg208375 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-01-18 00:17
How nasty. I agree that this is a code bug. Unfortunately in this case, the C code does keyword matching of arguments and 'corrects' the doc for anyone who tries 'string='.

>>> pat.search(string='xabc', pos=1)
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    pat.search(string='xabc', pos=1)
TypeError: Required argument 'pattern' (pos 1) not found
>>> pat.search(pattern='xabc', pos=1)
<_sre.SRE_Match object; span=(1, 4), match='abc'>

I think we should only change this in 3.4 (and should do so in 3.4).
msg208689 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-21 19:23
Actually, several other methods also have a wrong parameter name: "source" instead of "string".
msg208743 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-01-22 04:09
If no one else pipes up here, perhaps ask on pydev about changing the C names to match the documented names.
msg209229 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-25 19:19
Here is a patch for 3.3 which adds an alternative parameter name. Now both keyword names are allowed, but a deprecation warning is emitted if the old keyword name is used.

>>> import re
>>> p = re.compile('')
>>> p.match()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Required argument 'string' (pos 1) not found
>>> p.match('')
<_sre.SRE_Match object at 0xb705c598>
>>> p.match(string='')
<_sre.SRE_Match object at 0xb705c720>
>>> p.match(pattern='')
__main__:1: DeprecationWarning: The 'pattern' keyword parameter name is deprecated.  Use 'string' instead.
<_sre.SRE_Match object at 0xb705c758>
>>> p.match('', string='')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Argument given by name ('string') and position (1)
>>> p.match('', pattern='')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Argument given by name ('pattern') and position (1)
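The behavior in the session above can be approximated in pure Python. The following is only a sketch under stated assumptions, not the C code from the patch: `CompatPattern` and `_MISSING` are hypothetical names, and the old "pattern" name is made keyword-only here for simplicity. The messages are copied from the session above.

```python
import re
import sys
import warnings

_MISSING = object()  # sentinel: tells "not passed" apart from any real value


class CompatPattern:
    """Hypothetical wrapper used only for illustration."""

    def __init__(self, regex):
        self._compiled = re.compile(regex)

    def match(self, string=_MISSING, pos=0, endpos=sys.maxsize, *,
              pattern=_MISSING):
        if pattern is not _MISSING:
            if string is not _MISSING:
                # The same argument was supplied under both names.
                raise TypeError(
                    "Argument given by name ('pattern') and position (1)")
            warnings.warn(
                "The 'pattern' keyword parameter name is deprecated.  "
                "Use 'string' instead.",
                DeprecationWarning, stacklevel=2)
            string = pattern
        if string is _MISSING:
            raise TypeError("Required argument 'string' (pos 1) not found")
        return self._compiled.match(string, pos, endpos)


p = CompatPattern('')
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    new_style = p.match(string='')   # documented name, no warning
    old_style = p.match(pattern='')  # old name, one DeprecationWarning

print(new_style is not None, old_style is not None, len(caught))
```

The sentinel is used instead of None so that the wrapper can tell a missing argument from an explicitly passed one.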
msg209264 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-01-26 01:13
Great. Old and new both in at least one release, when possible, is best. I should have thought of asking if that would be possible. In this case, I think the (undocumented) old should disappear in 3.5.

Since the mistaken 'pattern' name is not documented now, I would not add anything to the doc.

I would augment the warning
 "The 'pattern' keyword parameter name is deprecated."
to briefly explain the deprecation and its timing by saying
 "The erroneous and undocumented 'pattern' keyword parameter name is deprecated and will be removed in version 3.5."

The patch did not upload correctly. I just see "Modules/_sre.c |   64 +++++++++++++++++++++++++++++++++++++++!!!!!!!!!!!!!!!!!!
  1 file changed, 44 insertions(+), 20 modifications(!)" when I open it in a new Firefox tab.
msg209289 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-26 08:19
> The patch did not upload correctly.

Oh, sorry. Here is the correct patch.

I propose to apply the "soft" patch (which preserves support for the old keyword parameter name) to 2.7 and 3.3, and the "hard" patch (which just renames the keyword parameter) to 3.4.

Or we can just apply the "hard" patch (it's much simpler) to all versions.
msg209291 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2014-01-26 08:33
For 3.3 I prefer the "soft" patch.
msg209302 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-01-26 12:39
Georg: you're accepting this patch into 3.3?  I'm surprised.

I would only want the "soft" approach.  But I haven't said "yes" yet.  I want to discuss it a little more.  (Hey, it's python core dev.  Discussing things endlessly is our job.)
msg209303 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-26 12:46
If you want the "soft" approach, then you should revert your changes to _sre.SRE_Pattern.match.
msg209304 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-01-26 12:48
You can do it, if I accept the patch for 3.4.  There's no point in doing it in two stages.
msg209305 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-01-26 12:56
Alternatively, we could use this cheap hack:

/*[python input]
class hidden_object_converter(object_converter):
    show_in_signature = False

[python start generated code]*/

/*[clinic input]
module _sre
class _sre.SRE_Pattern "PatternObject *" "&Pattern_Type"

_sre.SRE_Pattern.match as pattern_match

    string: object
    pos: Py_ssize_t = 0
    endpos: Py_ssize_t(c_default="PY_SSIZE_T_MAX") = sys.maxsize
    pattern: hidden_object = None

...
msg210676 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-02-08 19:27
Larry, so what is your decision?

1. Apply the "hard" patch and then convert Modules/_sre.c to use Argument Clinic (issue20148).

2. Revert the converted match() method, apply the "soft" patch, and postpone both the "hard" patch and the Argument Clinic conversion to 3.5. Applying the "soft" patch and then the "hard" patch will cause more source churn than just applying the "hard" patch.

3. Use the show_in_signature hack. I don't like this; it looks ugly and adds too much source churn.
msg210734 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-09 09:34
Use #3.
msg210735 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-09 09:35
"pattern" should be keyword-only, and if used the function should generate a DeprecationWarning.
msg212140 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-02-24 20:52
Here is a patch with the show_in_signature hack for 3.4.
msg212619 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2014-03-03 08:39
The patch sre_deprecate_pattern_keyword-3.4.patch looks good to me. I *think* that Larry has pre-approved it for 3.4.

If it is applied, and if people still think that 2.7 and 3.3 need to be changed, the release-critical status should be removed from the issue.
msg212672 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-03-03 21:20
The disadvantage of sre_deprecate_pattern_keyword-3.4.patch is that it creates a false signature for SRE_Pattern.match(). The default value of the first argument is exposed as None, but this parameter is actually mandatory and None is not a valid value for it. I'm afraid the only way to get rid of the false signature (and keep backward compatibility) is to revert the conversion to Argument Clinic. Here is a patch which does this.
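The "false signature" problem can be reproduced with an ordinary Python function. This is a hedged sketch, not the Argument Clinic output: the stand-in `match` function below is hypothetical, and unlike the show_in_signature hack it does not hide "pattern" from introspection.

```python
import inspect
import sys


def match(string=None, pos=0, endpos=sys.maxsize, *, pattern=None):
    """Hypothetical stand-in: "string" needs a default so "pattern" can replace it."""
    if string is None and pattern is None:
        raise TypeError("Required argument 'string' (pos 1) not found")
    return string if string is not None else pattern


sig = inspect.signature(match)
string_param = sig.parameters['string']

# Introspection reports a default, so "string" looks optional...
looks_optional = string_param.default is not inspect.Parameter.empty

# ...but calling the function without it still fails.
try:
    match()
    call_failed = False
except TypeError:
    call_failed = True

print(looks_optional, call_failed)  # True True
```

Anything that relies on the reported signature (help output, documentation tools) would therefore describe the first argument as optional with a default of None, which is the inaccuracy described above.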
msg212674 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-03-03 21:30
Why can't you remove the "= NULL" from the Clinic input for "string"?
msg212677 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-03-03 21:48
> Why can't you remove the "= NULL" from the Clinic input for "string"?

Because this will prohibit the use of "pattern" as keyword argument.
msg212712 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-03-04 10:46
We are close to Python 3.4 final, so what is the status of this issue? I don't see any commit and nothing to cherry-pick in Larry's 3.4.0 repository.
msg212752 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2014-03-04 23:39
Since there is no consensus on how to resolve this issue, I'm dropping the release-critical status for it; people should now consider whether a future agreed-upon solution could apply to 3.4.1 or just to 3.5.
msg212753 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2014-03-04 23:43
Serhiy: the patch is incomplete; it lacks test cases.
msg212769 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-03-05 20:45
Here is a test.
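For readers without access to the attachment, a test of this kind might look roughly like the sketch below. It is not the content of test_re_keyword_parameters.patch; the class name, the regex 'ab' and the sample text are illustrative assumptions, and it requires an interpreter that has both the fix and fullmatch (3.4 or later).

```python
import re
import unittest


class PatternKeywordParameterTests(unittest.TestCase):
    """Illustrative checks that pattern methods accept documented keywords."""

    def setUp(self):
        self.pat = re.compile('ab')
        self.text = 'abracadabra'

    def test_match_family(self):
        pat, text = self.pat, self.text
        self.assertEqual(
            pat.match(string=text, pos=7, endpos=10).span(), (7, 9))
        self.assertEqual(
            pat.search(string=text, pos=3, endpos=10).span(), (7, 9))
        self.assertEqual(
            pat.fullmatch(string=text, pos=7, endpos=9).span(), (7, 9))

    def test_other_methods(self):
        pat, text = self.pat, self.text
        self.assertEqual(pat.findall(string=text, pos=3, endpos=10), ['ab'])
        self.assertEqual(pat.split(string=text, maxsplit=1),
                         ['', 'racadabra'])
        self.assertEqual(pat.sub(repl='-', string=text), '-racad-ra')
```

Saved in a module of your choice, the class can be run with `python -m unittest`.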
msg212800 - (view) Author: Roundup Robot (python-dev) Date: 2014-03-06 09:37
New changeset 52743dc788e6 by Serhiy Storchaka in branch '3.3':
Issue #20283: RE pattern methods now accept the string keyword parameters
http://hg.python.org/cpython/rev/52743dc788e6

New changeset f4d7abcf8080 by Serhiy Storchaka in branch 'default':
Issue #20283: RE pattern methods now accept the string keyword parameters
http://hg.python.org/cpython/rev/f4d7abcf8080
msg212803 - (view) Author: Roundup Robot (python-dev) Date: 2014-03-06 10:25
New changeset 52256a5861fa by Serhiy Storchaka in branch '2.7':
Issue #20283: RE pattern methods now accept the string keyword parameters
http://hg.python.org/cpython/rev/52256a5861fa
History
Date User Action Args
2014-03-06 11:32:19 Arfrever set status: open -> closed
2014-03-06 10:41:47 serhiy.storchaka set assignee: serhiy.storchaka; resolution: fixed; stage: patch review -> resolved
2014-03-06 10:25:55 python-dev set messages: + msg212803
2014-03-06 09:37:27 python-dev set nosy: + python-dev; messages: + msg212800
2014-03-05 20:45:00 serhiy.storchaka set files: + test_re_keyword_parameters.patch; messages: + msg212769
2014-03-04 23:43:32 loewis set messages: + msg212753
2014-03-04 23:39:34 loewis set priority: release blocker -> normal; messages: + msg212752
2014-03-04 10:46:41 haypo set nosy: + haypo; messages: + msg212712
2014-03-03 21:48:16 serhiy.storchaka set messages: + msg212677
2014-03-03 21:30:39 larry set messages: + msg212674
2014-03-03 21:20:56 serhiy.storchaka set files: + sre_deprecate_pattern_keyword-3.4_2.patch; messages: + msg212672
2014-03-03 09:29:21 Arfrever set nosy: + Arfrever
2014-03-03 08:39:18 loewis set nosy: + loewis; messages: + msg212619
2014-02-24 20:52:41 serhiy.storchaka set priority: normal -> release blocker; files: + sre_deprecate_pattern_keyword-3.4.patch; messages: + msg212140
2014-02-09 09:35:31 larry set messages: + msg210735
2014-02-09 09:34:46 larry set messages: + msg210734
2014-02-08 19:28:08 serhiy.storchaka link issue20148 dependencies
2014-02-08 19:27:22 serhiy.storchaka set messages: + msg210676
2014-01-26 12:56:18 larry set messages: + msg209305
2014-01-26 12:48:32 larry set messages: + msg209304
2014-01-26 12:46:48 serhiy.storchaka set messages: + msg209303
2014-01-26 12:39:59 larry set messages: + msg209302
2014-01-26 08:33:12 georg.brandl set messages: + msg209291
2014-01-26 08:31:04 terry.reedy set files: - sre_deprecate_pattern_keyword.patch
2014-01-26 08:19:13 serhiy.storchaka set files: + sre_deprecate_pattern_keyword.patch; nosy: + georg.brandl, larry, benjamin.peterson; messages: + msg209289
2014-01-26 01:13:37 terry.reedy set messages: + msg209264
2014-01-25 19:19:35 serhiy.storchaka set files: + sre_deprecate_pattern_keyword.patch; messages: + msg209229
2014-01-23 17:36:53 taleinat set nosy: + taleinat
2014-01-22 04:09:19 terry.reedy set messages: + msg208743
2014-01-21 19:24:51 serhiy.storchaka set nosy: + mrabarnett
2014-01-21 19:24:08 serhiy.storchaka set files: - sre_pattern_string_keyword.patch
2014-01-21 19:23:41 serhiy.storchaka set files: + sre_pattern_string_keyword.patch; messages: + msg208689; stage: needs patch -> patch review
2014-01-18 00:17:28 terry.reedy set nosy: + terry.reedy; messages: + msg208375
2014-01-17 10:11:21 serhiy.storchaka set files: + sre_pattern_string_keyword.patch
2014-01-17 10:09:08 serhiy.storchaka set files: - sre_pattern_string_keyword.patch
2014-01-17 10:08:05 serhiy.storchaka set files: + sre_pattern_string_keyword.patch; keywords: + patch
2014-01-16 20:44:27 serhiy.storchaka create