make bytes/bytearray translate's delete a keyword argument #71693

Closed
zhangyangyu opened this issue Jul 13, 2016 · 18 comments
Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), type-feature (a feature request or enhancement)

Comments

@zhangyangyu
Member

BPO 27506
Nosy @bitdancer, @vadmium, @serhiy-storchaka, @zhangyangyu
Files
  • bytes_translate_delete_as_keyword_arguments.patch
  • bytes_translate_delete_as_keyword_arguments_v2.patch
  • table-optional-delete-empty.patch
  • bytes_translate_delete_as_keyword_arguments_v3.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/vadmium'
    closed_at = <Date 2016-08-27.09:51:51.633>
    created_at = <Date 2016-07-13.09:48:57.255>
    labels = ['interpreter-core', 'type-feature']
    title = "make bytes/bytearray translate's delete a keyword argument"
    updated_at = <Date 2016-08-27.16:40:58.132>
    user = 'https://github.com/zhangyangyu'

    bugs.python.org fields:

    activity = <Date 2016-08-27.16:40:58.132>
    actor = 'xiang.zhang'
    assignee = 'martin.panter'
    closed = True
    closed_date = <Date 2016-08-27.09:51:51.633>
    closer = 'martin.panter'
    components = ['Interpreter Core']
    creation = <Date 2016-07-13.09:48:57.255>
    creator = 'xiang.zhang'
    dependencies = []
    files = ['43703', '43705', '43841', '44138']
    hgrepos = []
    issue_num = 27506
    keywords = ['patch']
    message_count = 18.0
    messages = ['270303', '270310', '270317', '270318', '270349', '270354', '271074', '271090', '272434', '272482', '272486', '272497', '272771', '273008', '273013', '273337', '273768', '273785']
    nosy_count = 5.0
    nosy_names = ['r.david.murray', 'python-dev', 'martin.panter', 'serhiy.storchaka', 'xiang.zhang']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue27506'
    versions = ['Python 3.6']

    @zhangyangyu
    Member Author

    I wrote a patch that lets the delete argument of bytes.translate() and bytearray.translate() be passed as a keyword argument. This does not break backwards compatibility and makes the method more flexible to use. In addition, at the C level, it stops using Argument Clinic's legacy optional group feature and removes the unnecessary group_right_1 parameter.
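
For illustration, a minimal sketch of the two call forms this proposal concerns, assuming an interpreter that already includes the change (Python 3.6 or later). The sample data is made up.

```python
data = b"hello world"

# Positional form, which already worked before the patch.
positional = data.translate(None, b"lo")

# Keyword form proposed in this issue (assumes Python 3.6+).
keyword = data.translate(None, delete=b"lo")

print(positional)
print(keyword)
```

Both calls delete the same bytes; the keyword form only makes the intent visible at the call site.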

    @zhangyangyu zhangyangyu added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Jul 13, 2016
    @serhiy-storchaka serhiy-storchaka self-assigned this Jul 13, 2016
    @bitdancer bitdancer changed the title from "bytes/bytearray delete acts as keyword argument" to "make bytes/bytearray deletechars a keyword argument named delete" Jul 13, 2016
    @zhangyangyu
    Member Author

    Hmm, David, that may not be quite right. Users who only read the docs never learn that it's deletechars rather than delete. The docs have always said delete, although that conflicts with __doc__.

    >>> print(bytes.translate.__doc__)
    B.translate(table[, deletechars]) -> bytes
    ...

    I deliberately changed deletechars to delete to stay consistent with the docs. But actually I think using deletechars wouldn't break backwards compatibility either.

    @bitdancer
    Member

    Ah, I was looking at the 2.7 docs.

    @bitdancer bitdancer changed the title from "make bytes/bytearray deletechars a keyword argument named delete" to "make bytes/bytearray delete a keyword argument" Jul 13, 2016
    @zhangyangyu
    Member Author

    Please review the new version. It makes two changes compared with the last one.

    1. It exposes the Python parameter as "delete" (which the documentation has always used, so I think it's the API) while still using "deletechars" (which I prefer as a C variable name) in the C code.

    2. It allows *delete* to be None. Previously this was not allowed, but I don't think this change breaks backwards compatibility. The reason for the change is that I don't want users to be surprised when they pass the default value to translate and then get an exception.
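
As a rough aid for the second point, here is a small probe that reports whether a given interpreter accepts None for delete. The helper name accepts_none_delete is hypothetical, and the result depends on which behaviour the interpreter implements, so no particular output is claimed.

```python
def accepts_none_delete(data=b"abc"):
    """Return True if translate() accepts None for delete, else False."""
    try:
        data.translate(None, None)
    except TypeError:
        # The interpreter insists on a bytes-like object for delete.
        return False
    return True


result = accepts_none_delete()
print(result)
```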

    @vadmium
    Member

    vadmium commented Jul 13, 2016

    Instead of allowing delete=None (which is not in the RST documentation), perhaps it is possible to change the doc string. I can’t remember the details, but I think Argument Clinic allows a virtual Python-level default, something like “object(py_default=b"") = NULL”.

    Also, I think I like the change. What do you think about making the first argument optional (default to None), allowing calls like x.translate(delete=b'aeiou')?

    @zhangyangyu
    Member Author

    Thanks for your comments, Martin. I'll apply them later, once we reach agreement on the behaviour.

    I already use object = NULL; a separate C default is not necessary here, and I think it works the way you want. In patch version 1, b'abc'.translate(None, None) raises an exception as before. I changed that in patch version 2 because Argument Clinic generates the function signature as "($self, table, /, delete=None)", and I don't want users to be surprised when they pass None, as the signature suggests, but get an exception. Using None as a placeholder for a keyword argument is also normal in Python. But I'm OK with keeping the previous behaviour, and actually I prefer that.

    As for making the first argument optional, I don't quite like that, since the doc seems to encourage users to pass None explicitly.

    @vadmium
    Member

    vadmium commented Jul 23, 2016

    This patch is what I had in mind for setting the documented default as delete=b'', but using NULL internally.

    I also changed it to allow the table argument to be omitted. We can change the documentation accordingly. These are just suggestions; use either or both aspects as you please :)

    @zhangyangyu
    Member Author

    LGTM. Using b'' instead of None as the default value of *delete* looks better since it doesn't break backwards compatibility. As for whether the first argument is optional or not, either is okay with me. You have changed the doc accordingly.

    @vadmium
    Member

    vadmium commented Aug 11, 2016

    Serhiy, you assigned this to yourself. What do you think of my patch?

    @serhiy-storchaka
    Member

    PyArg_ParseTupleAndKeywords can be slower than PyArg_ParseTuple even for positional arguments. We need benchmarking results (especially after committing a patch for bpo-27574).

    What is the purpose of supporting delete as a keyword argument? It looks to me as if the only purpose is to allow specifying delete without specifying table. There are two alternative ways to achieve this. Either make translate() accept some special value (e.g. None) as the default value for the first argument:

    b'hello'.translate(None, b'l')
    

    or make translate() accept the delete argument as a keyword argument:

    b'hello'.translate(delete=b'l')
    

    The patch does both things, but only one is needed. If we add support for delete as a keyword argument, I would prefer not to add support for None as the first argument, but to specify its default value as bytes(range(256)):

    table: object(c_default="NULL") = bytes(range(256))
    /
    delete as deletechars: object(c_default="NULL") = b''
    

    I don't know why an optional group was used here; the function could be implemented without it.
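
To make the proposed signature concrete, here is a pure-Python model of it. This is only a sketch, not the C implementation: the name translate_model is hypothetical, the identity table bytes(range(256)) stands in for an omitted table, and the positional-only "/" syntax assumes Python 3.8 or later.

```python
def translate_model(data, table=bytes(range(256)), /, delete=b""):
    """Model of the proposal: drop bytes found in delete, map the rest via table."""
    if len(table) != 256:
        raise ValueError("translation table must be 256 bytes long")
    # Iterating over bytes yields ints, which index directly into the table.
    return bytes(table[b] for b in data if b not in delete)


print(translate_model(b"hello", delete=b"l"))
```

With the identity table as the default, omitting table and passing only delete behaves like passing None as the first argument to the real method.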

    @vadmium
    Member

    vadmium commented Aug 11, 2016

    I agree it would be worth checking for a slowdown.

    As well as giving the option of omitting the table argument, it would make call sites easier to read. It would avoid suggesting that the first argument is translated to the second, like maketrans().

    data = data.translate(YENC_TABLE, delete=b"\r\n")

    Translate() already accepts None as the first argument; this is not new:

    >>> b"hello".translate(None, b"l")
    b'heo'

    I guess the optional group was used as a way of making the second argument optional without a specific default value.
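
A small sketch of the readability point, assuming Python 3.6 or later. The table and data are made up for illustration and are not the YENC_TABLE mentioned above.

```python
# maketrans(frm, to) really does map its first argument onto its second.
table = bytes.maketrans(b"abc", b"xyz")

data = b"aabbcc\r\n"

# Spelling out delete= shows that the second argument is a set of bytes
# to drop, not a target that the table is translated to.
result = data.translate(table, delete=b"\r\n")
print(result)
```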

    @zhangyangyu
    Member Author

    So let's do a simple benchmark.

    # without patch

    ./python -m timeit -s 'string=bytes(range(256));table=bytes(range(255, -1, -1));delete=b"abcdefghijklmn"' 'string.translate(table, delete)'
    1000000 loops, best of 3: 0.55 usec per loop

    # with patch

    ./python -m timeit -s 'string=bytes(range(256));table=bytes(range(255, -1, -1));delete=b"abcdefghijklmn"' 'string.translate(table, delete)'
    1000000 loops, best of 3: 0.557 usec per loop

    # keyword specified

    ./python -m timeit -s 'string=bytes(range(256));table=bytes(range(255, -1, -1));delete=b"abcdefghijklmn"' 'string.translate(table, delete=delete)'
    1000000 loops, best of 3: 0.771 usec per loop

    From my observation, the difference between PyArg_ParseTupleAndKeywords and PyArg_ParseTuple when parsing positional arguments is very small (0.55 vs 0.557 usec here), so old code will not slow down by a large percentage. When the keyword argument is specified there is a slowdown (0.771 usec), but I think this happens everywhere PyArg_ParseTupleAndKeywords is used.
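
The same comparison can be sketched with the timeit module from inside Python, assuming an interpreter that accepts the keyword form. The loop count is an arbitrary choice and the timings will differ by machine and build.

```python
import timeit

# Same setup as the command-line runs above.
setup = (
    "string = bytes(range(256));"
    "table = bytes(range(255, -1, -1));"
    "delete = b'abcdefghijklmn'"
)
loops = 100_000  # arbitrary; raise it for steadier numbers

positional = timeit.timeit("string.translate(table, delete)", setup=setup, number=loops)
keyword = timeit.timeit("string.translate(table, delete=delete)", setup=setup, number=loops)

print(f"positional: {positional / loops * 1e6:.3f} usec per call")
print(f"keyword:    {keyword / loops * 1e6:.3f} usec per call")
```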

    @serhiy-storchaka
    Member

    Technically the patch looks correct to me. I added just a few minor comments on Rietveld. I don't think there is much need for keyword argument support. But since the overhead is small and somebody needs this, adding it does no harm. I leave it to you, Martin.

    @vadmium
    Member

    vadmium commented Aug 18, 2016

    I can look at enhancing the tests at some stage, but it isn’t a high priority for me.

    Regarding translate() with no arguments, it makes sense if you see it as a kind of degenerate case of neither using a translation table, nor any set of bytes to delete:

    x.translate() == x.translate(None, b"")

    I admit it reads strangely and probably isn’t useful. If people dislike it, it might be easiest to just add the keyword support and keep the first parameter mandatory:

    without_nulls = bytes_with_nulls.translate(None, delete=b"\x00")
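
A runnable version of that last form, covering both bytes and bytearray and assuming Python 3.6 or later. The sample values are made up.

```python
bytes_with_nulls = b"a\x00b\x00c"
without_nulls = bytes_with_nulls.translate(None, delete=b"\x00")

# bytearray has the same method and returns a new bytearray.
array_with_nulls = bytearray(b"a\x00b\x00c")
array_without_nulls = array_with_nulls.translate(None, delete=b"\x00")

print(without_nulls)
print(array_without_nulls)
```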

    @zhangyangyu
    Member Author

    Martin, I wrote the v3 patch to apply the comments. It keeps *table* mandatory and moves test_translate to BaseBytesTest to remove duplication.

    @zhangyangyu zhangyangyu changed the title from "make bytes/bytearray delete a keyword argument" to "make bytes/bytearray translate's delete a keyword argument" Aug 18, 2016
    @vadmium
    Member

    vadmium commented Aug 22, 2016

    Looks pretty good, thanks Xiang. There’s one English grammar problem in a comment (see review), but I can fix that when I commit.

    @python-dev
    Mannequin

    python-dev mannequin commented Aug 27, 2016

    New changeset 6ab1b54245d5 by Martin Panter in branch 'default':
    Issue bpo-27506: Support bytes/bytearray.translate() delete as keyword argument
    https://hg.python.org/cpython/rev/6ab1b54245d5

    @vadmium vadmium closed this as completed Aug 27, 2016
    @zhangyangyu
    Member Author

    Yay, thanks for your work, Martin.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022