classification
Title: make bytes/bytearray translate's delete a keyword argument
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: martin.panter Nosy List: martin.panter, python-dev, r.david.murray, serhiy.storchaka, xiang.zhang
Priority: normal Keywords: patch

Created on 2016-07-13 09:48 by xiang.zhang, last changed 2016-08-27 16:40 by xiang.zhang. This issue is now closed.

Files
File name Uploaded Description Edit
bytes_translate_delete_as_keyword_arguments.patch xiang.zhang, 2016-07-13 09:48 review
bytes_translate_delete_as_keyword_arguments_v2.patch xiang.zhang, 2016-07-13 15:38 review
table-optional-delete-empty.patch martin.panter, 2016-07-23 13:13 review
bytes_translate_delete_as_keyword_arguments_v3.patch xiang.zhang, 2016-08-18 06:15 review
Messages (18)
msg270303 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-13 09:48
Write a patch to make bytes/bytearray.translate's delete argument support acting as keyword arguments. This won't break any backwards compatibility and make the method more flexible to use. Besides, in the C code level, it stops using argument clinic's legacy optional group feature and removes the unnecessary group_right_1 parameter.
msg270310 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-13 14:25
Hmm, David, that may be not quite right. Users only reading the doc never know it's deletechars not delete. The doc is always delete, though conflicting with __doc__.

>>> print(bytes.translate.__doc__)
B.translate(table[, deletechars]) -> bytes
...

I deliberately change deletechars to delete to keep consistent with doc. But actually I think using deletechars won't break backwards compatibility too.
msg270317 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-07-13 15:31
Ah, I was looking at the 2.7 docs.
msg270318 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-13 15:38
Please review the new version. It makes two changes comparing with the last one.

1. It exposes Python parameter as "delete" (which the document always uses so I think it's the API) while still use "deletechars" (which I prefer as a C variable name) in C code.

2. It allows *delete* to be None. Before this is not allowed but I don't think this change breaks backwards compatibility. The reason for this change is that I don't want users to get surprised when they pass the default value to translate but then get an exception.
msg270349 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-13 22:49
Instead of allowing delete=None (which is not in the RST documentation), perhaps it is possible to change the doc string. I can’t remember the details, but I think Argument Clinic allows a virtual Python-level default, something like “object(py_default=b"") = NULL”.

Also, I think I like the change. What do you think about making the first argument optional (default to None), allowing calls like x.translate(delete=b'aeiou')?
msg270354 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-14 02:52
Thanks for your comment Martin. I'll apply them later when we reach agreement on functions. 

I have already used object = NULL, the C default is not necessary here, and it works as you like I think. In patch version 1, b'abc'.translate(None, None) raises exception as before. I change it in patch version 2 because argument clinic generates function signature as "($self, table, /, delete=None)". So I don't want users get surprised when they provide None as the signature but get an exception. And using None as a placeholder for a keyword argument is normal in Python. But I'm OK to keep the previous behaviour and actually I prefer that.

As for making the first argument optional, I don't quite like that since the doc seems to encourage users to set None explicitly.
msg271074 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-23 13:13
This patch is what I had in mind for setting the documented default as delete=b'', but using NULL internally.

I also changed it to allow the table argument to be omitted. We can change the documentation accordingly. These are just suggestions; use either or both aspects as you please :)
msg271090 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-23 15:52
LGTM. Using b'' instead of the None as the default value of *delete* looks better since it doesn't break backwards compatibility. As for the first argument optional or not, actually it's both okay. You have changed the doc accordingly.
msg272434 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-08-11 10:16
Serhiy, you assigned this to yourself. What do you think of my patch?
msg272482 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-08-11 20:49
PyArg_ParseTupleAndKeywords can be slower than PyArg_ParseTuple even for positional arguments. We need benchmarking results (especially after committing a patch for issue27574).

What is the purpose of adding support of the delete argument as keyword arguments? It looks to me, that the only purpose is allowing to specify the delete argument without specifying the table argument. There are two alternative ways to achieve this: make translate() accepting some special value (e.g. None) as the default value for the first argument:

    b'hello'.translate(None, b'l')

or make translate() accepting the delete argument as keyword argument:

    b'hello'.translate(delete=b'l')

The patch does both things, but only one is needed. If add the support of the delete argument as keyword argument, I would prefer to not add the support of None as the first argument, but would specify its default value as bytes(range(256)):

    table: object(c_default="NULL") = bytes(range(256))
    /
    delete as deletechars: object(c_default="NULL") = b''

I don't know why optional group was used here, the function could be implemented without it.
msg272486 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-08-11 23:48
I agree it would be worth checking for a slowdown.

As well as giving the option of omitting the table argument, it would make call sites easier to read. It would avoid suggesting that the first argument is translated to the second, like maketrans().

data = data.translate(YENC_TABLE, delete=b"\r\n")

Translate() already accepts None as the first argument; this is not new:

>>> b"hello".translate(None, b"l")
b'heo'

I guess the optional group was used as a way of making the second argument optional without a specific default value.
msg272497 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-12 04:36
So let's do a simple benchmark.

# without patch

./python -m timeit -s 'string=bytes(range(256));table=bytes(range(255, -1, -1));delete=b"abcdefghijklmn"' 'string.translate(table, delete)'
1000000 loops, best of 3: 0.55 usec per loop

# with patch

./python -m timeit -s 'string=bytes(range(256));table=bytes(range(255, -1, -1));delete=b"abcdefghijklmn"' 'string.translate(table, delete)'
1000000 loops, best of 3: 0.557 usec per loop

# keyword specified

./python -m timeit -s 'string=bytes(range(256));table=bytes(range(255, -1, -1));delete=b"abcdefghijklmn"' 'string.translate(table, delete=delete)'
1000000 loops, best of 3: 0.771 usec per loop

From my observation, the difference between PyArg_ParseTupleAndKeywords and PyArg_ParseTuple when parsing positional arguments is very small. This means it won't make old code slowdown by a large percent. And when keyword argument is specified, there is a degrade. But I think this happens everywhere using PyArg_ParseTupleAndKeywords.
msg272771 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-08-15 15:46
Technically the patch looks correct to me. Added just few minor comments on Rietveld. I don't think there is a large need in adding the support of keyword argument. But since the overhead is small and somebody needs this, adding this doesn't do a harm. Left it on you Martin.
msg273008 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-08-18 01:00
I can look at enhancing the tests at some stage, but it isn’t a high priority for me.

Regarding translate() with no arguments, it makes sense if you see it as a kind of degenerate case of neither using a translation table, nor any set of bytes to delete:

x.translate() == x.translate(None, b"")

I admit it reads strange and probably isn’t useful. If people dislike it, it might be easiest to just add the keyword support and keep the first parameter as mandatory:

without_nulls = bytes_with_nulls.translate(None, delete=b"\x00")
msg273013 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-18 06:15
Martin, I write the v3 patch to apply the comments. It preserves *table* as mandatory and move the test_translate to BaseBytesTest to remove duplicates.
msg273337 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-08-22 06:07
Looks pretty good thanks Xiang. There’s one English grammar problem in a comment (see review), but I can fix that when I commit.
msg273768 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-08-27 09:09
New changeset 6ab1b54245d5 by Martin Panter in branch 'default':
Issue #27506: Support bytes/bytearray.translate() delete as keyword argument
https://hg.python.org/cpython/rev/6ab1b54245d5
msg273785 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-27 16:40
Yay, thanks for your work, Martin.
History
Date User Action Args
2016-08-27 16:40:58xiang.zhangsetmessages: + msg273785
2016-08-27 09:51:51martin.pantersetstatus: open -> closed
resolution: fixed
stage: commit review -> resolved
2016-08-27 09:09:54python-devsetnosy: + python-dev
messages: + msg273768
2016-08-22 06:07:28martin.pantersetmessages: + msg273337
stage: patch review -> commit review
2016-08-18 06:15:22xiang.zhangsetfiles: + bytes_translate_delete_as_keyword_arguments_v3.patch

messages: + msg273013
title: make bytes/bytearray delete a keyword argument -> make bytes/bytearray translate's delete a keyword argument
2016-08-18 01:00:28martin.pantersetmessages: + msg273008
2016-08-15 15:46:08serhiy.storchakasetassignee: serhiy.storchaka -> martin.panter
messages: + msg272771
2016-08-12 04:36:35xiang.zhangsetmessages: + msg272497
2016-08-11 23:48:21martin.pantersetmessages: + msg272486
2016-08-11 20:49:12serhiy.storchakasetmessages: + msg272482
2016-08-11 10:16:36martin.pantersetmessages: + msg272434
2016-07-23 15:52:07xiang.zhangsetmessages: + msg271090
2016-07-23 13:13:18martin.pantersetfiles: + table-optional-delete-empty.patch

messages: + msg271074
2016-07-14 02:52:27xiang.zhangsetmessages: + msg270354
2016-07-13 22:49:42martin.pantersetnosy: + martin.panter
messages: + msg270349
2016-07-13 15:38:38xiang.zhangsetfiles: + bytes_translate_delete_as_keyword_arguments_v2.patch

messages: + msg270318
2016-07-13 15:31:24r.david.murraysetnosy: + r.david.murray

messages: + msg270317
title: make bytes/bytearray deletechars a keyword argument named delete -> make bytes/bytearray delete a keyword argument
2016-07-13 14:25:56xiang.zhangsetmessages: + msg270310
2016-07-13 12:18:41r.david.murraysettitle: bytes/bytearray delete acts as keyword argument -> make bytes/bytearray deletechars a keyword argument named delete
2016-07-13 10:04:23serhiy.storchakasetassignee: serhiy.storchaka
stage: patch review
2016-07-13 09:48:57xiang.zhangcreate