classification
Title: PEP 467 -- Minor API improvements for binary sequences
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Rondom, elias, ethan.furman, martin.panter, ncoghlan, ned.deily, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2016-09-01 00:56 by elias, last changed 2017-11-14 18:24 by serhiy.storchaka.

Files
File name Uploaded Description Edit
pep467.patch elias, 2016-09-01 00:56 Patch with PEP 467 changes
pep467_doc_changes.patch elias, 2016-09-01 01:18
pep467.patch martin.panter, 2016-09-01 07:07 Regenerated review
pep467_doc_changes.patch martin.panter, 2016-09-01 07:08 review
no_bytes_from_int.patch serhiy.storchaka, 2016-09-01 18:19 review
pep467_attempt2.patch elias, 2016-09-01 19:24 review
Pull Requests
URL Status Linked Edit
PR 3237 open elias, 2017-08-29 18:19
Messages (24)
msg274082 - (view) Author: Elias Zamaria (elias) * Date: 2016-09-01 00:56
This is my attempt at implementing PEP 467.

I am not an expert in the details of the Python interpreter, and this is my first time working on a big project in C, so I am not sure if I am doing things in the most elegant or efficient way, but it seems to work fine, as far as I can tell.

I have added some tests for the new functionality. I am planning to work on changes to the documentation some time in the next few days.

I noticed a lot of places that are using the deprecated integer-argument bytes and bytearray constructors. I left most of them alone, although I changed a few of them to the zeros constructors to prevent certain tests from failing.
msg274084 - (view) Author: Elias Zamaria (elias) * Date: 2016-09-01 01:18
Here is a patch with some changes to the docs. I don't know if the descriptions are good enough, but it should hopefully identify what parts of the docs need to be changed.
msg274085 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-09-01 01:27
FYI if you rebase your patches onto public Mercurial revisions, they should get a review link
msg274098 - (view) Author: Elias Zamaria (elias) * Date: 2016-09-01 05:47
@martin.panter, I am familiar with Mercurial, and with the concept of rebasing, but I don't understand what you are trying to tell me.

I made these changes in several local branches, and then merged them all together in one branch. How can I rebase it onto a public revision? Can I just do it, or do I need some sort of special privileges?
msg274099 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-09-01 07:07
Here is what I did. (You could probably do this yourself, but never mind :)

hg pull -r{default,2.7,3.5}
hg update default  # Update to the latest public revision
hg import --no-commit "$(xclip -o)"  # Apply your patch on top
hg diff -p > pep467.patch  # Make a new diff

If you look inside the patch file, the difference is that the Reitveld review system cannot handle “diff -r 3e41c0449b9c” from your file, but hopefully will know about “diff -r ebb23744a36c” in my version.
msg274109 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-09-01 09:20
Thanks for the patches. I left you some comments in the code review.

I expect there would be more bits of the docs that need fixing. Doc/library/functions.rst definitely; the tutorial is worth checking too. There may also be example code floating around using the deprecated calls.

Regarding code that uses the deprecated call, as well as blind searching (grep), you could try running the entire test suite with -Werror. That should catch a lot of them.

In the RST documentation, I think you missed adding entries for the zeros() and byte() constructor methods. Maybe use the other class methods maketrans(), fromhex() as a starting point.

Also, this will need an entry written for Doc/whatsnew/3.6.rst.
msg274155 - (view) Author: Elias Zamaria (elias) * Date: 2016-09-01 17:12
I tried running `hg import --no-commit "$(xclip -o)"` and got the following result:

bash: xclip: command not found
abort: need at least one patch to import

I am using OS X 10.11.6 and Mercurial 3.8.2. I did a bit of quick research on xclip, but it looks like yet another unfamiliar tool, and I don't know exactly what it does or how it is supposed to help me.

I looked at your suggestions for my patch, and I am working on them.

I am planning to look at the tutorial, and to look for, and maybe fix, deprecated calls, but so far, I don't have a clear idea how long it will take me, or if it is realistic for one person to do it.
msg274161 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-09-01 18:19
I'm not convinced that the zeros() method is needed. Zero-initialized sequences can be created via sequence repetition. Sorry, but arguments against this sound like "we made bad design decision in the past, let repeat it with new name" to me.

In any case if you want to write a code for different Python versions you should use sequence repetition.

Here is a patch that replaces all creations of zero-initialized bytes and bytearray objects in the stdlib and tests with repetitions. All tests (except test_bytes of course) are passed on Linux if disallow an int argument in bytes and bytearray constructors.
msg274165 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2016-09-01 18:37
Two things to note:

- there is no need to change the stdlib to use anything besides the default
  constructor -- it's not going away, and it already works (my apologies if
  I misunderstood)

- PEP 467 has not yet been accepted (unless I missed that?)

However, thank you for getting some code written!
msg274170 - (view) Author: Elias Zamaria (elias) * Date: 2016-09-01 19:24
Here is a patch with all of my latest changes, including the changes that Martin suggested for the tests.
msg274291 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-09-03 12:27
Sorry Elias for not explaining my commands. xclip was just my way of pasting the patch URL from the clipboard (that I copied from Firefox).

It is not clear if we are actually making a DeprecationWarning, or going down the road of changing all the stdlib. But if we do, there are a couple of bits of Serhiy’s patch that I would rewrite more plainly, e.g. bytearray(b'\0\0') * len(...) instead of bytearray(b'\0') * (2 * len(...)).
msg274388 - (view) Author: Elias Zamaria (elias) * Date: 2016-09-05 04:26
Martin, where am I supposed to get the patch URL from?

Also, is it too soon to issue DeprecationWarnings? Would it be more appropriate if they are PendingDeprecationWarnings instead?
msg274397 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-09-05 12:31
Elias, the patch URL is where I downloaded your patch from; see the list of patches at the top of this bug:

hg import --no-commit https://bugs.python.org/file44316/pep467.patch

Anyway in this email <http://mid.mail-archive.com/57CAEF46.6000402@stoneleaf.us> it seems Ethan only wants to document that the old constructor call is deprecated, and not issue a DeprecationWarning.
msg276074 - (view) Author: Elias Zamaria (elias) * Date: 2016-09-12 17:22
Martin, I made another attempt to understand what you are trying to tell me about the patch, and I'm still confused.

It seems that to make the patch merge cleanly, I need to get the patch URL from this page, and to get the patch URL, I need to upload the patch, and to upload the patch, I need to produce the patch, and to produce a patch that merges cleanly, I need to get the patch URL. That sounds like a catch-22. Where am I supposed to start?
msg276078 - (view) Author: Elias Zamaria (elias) * Date: 2016-09-12 17:32
I looked through the email thread. I can remove the DeprecationWarnings, but before I do that, I would like to ask something: How "official" are the things discussed in the thread? Are they supposed to be part of the PEP? Are they supposed to be final, definite decisions, or are they things that may change soon?
msg276084 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2016-09-12 18:10
The PEP 467 has not been accepted yet, so nothing is final.  The email thread is the current discussion.

Also, unless someone has received permission from Ned, this won't go in until Python 3.7.
msg277059 - (view) Author: Elias Zamaria (elias) * Date: 2016-09-20 19:32
Ethan, by "Ned", I am guessing that you are referring to Ned Batchelder. Is that right? If so, do we need to put him on the nosy list or do anything else to bring this to his attention?
msg277304 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2016-09-23 22:12
I agree with Ethan that the PEP needs to be accepted first and afterwards, unless there is a very strong case made quickly, it would not qualify for a late-feature exemption for 360b2.

Nick, any updates on the status of PEP 467?
msg277377 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-09-25 14:18
Just what Ethan noted above, and the fact that Ethan should also list himself as a co-author now :)

That said, one major alternative it would be good to consider further between now and 3.7 is Serhiy's "seqtools" idea: https://mail.python.org/pipermail/python-ideas/2016-July/041083.html

Something that wrapped memoryview and allowed bytes-based iteration over any buffer exported would automatically cover bytes and bytearray as well, while also permitting "structure ignoring" iteration over the raw data contained in other types.
msg278497 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-10-11 16:43
Something else the PEP needs to be updated to cover is that in 3.5+ (and maybe even in 3.4 - I'm not sure when 'c' support landed in memoryview) you can write the following efficient bytes iterator:

    def iterbytes(data):
        return iter(memoryview(data).cast('c'))

And use it as so:

>>> data = b"123"
>>> for datum in iterbytes(data):
...     print(datum)
... 
b'1'
b'2'
b'3'

That will then work with any buffer exporter, not just bytes and bytearray, and since it's a function, you can easily make a similar function that's polymorphic on str -> str iterator and bytes-like -> bytes iterator.
msg278498 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-10-11 17:38
Even in 3.3+.
msg278504 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-10-11 20:36
For arbitrary C-contiguous buffers aka “bytes-like objects” (which are not just arrays of bytes), I think this trick relies on Issue 15944, which is only added in 3.5+.
msg306235 - (view) Author: Elias Zamaria (elias) * Date: 2017-11-14 18:05
Can someone here merge my pull request? If not, then what else needs to be done for my change to be included in 3.7?
msg306236 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-11-14 18:24
I don't think these changes should be merged.

For bytes.zeros(N), you can use b'\0' * N.

For iterbytes() methods, I think that the general function which works with any sequence that support slicing would be more useful.
History
Date User Action Args
2017-11-14 18:24:14serhiy.storchakasetmessages: + msg306236
2017-11-14 18:05:18eliassetmessages: + msg306235
2017-08-29 18:19:35eliassetpull_requests: + pull_request3279
2016-10-11 20:36:14martin.pantersetmessages: + msg278504
2016-10-11 17:38:00serhiy.storchakasetmessages: + msg278498
2016-10-11 16:43:35ncoghlansetmessages: + msg278497
2016-10-11 16:17:26Rondomsetnosy: + Rondom
2016-09-25 14:18:54ncoghlansetmessages: + msg277377
2016-09-23 22:12:26ned.deilysetnosy: + ned.deily, ncoghlan

messages: + msg277304
versions: + Python 3.7, - Python 3.6
2016-09-20 19:32:43eliassetmessages: + msg277059
2016-09-12 18:10:15ethan.furmansetmessages: + msg276084
2016-09-12 17:32:31eliassetmessages: + msg276078
2016-09-12 17:22:25eliassetmessages: + msg276074
2016-09-05 12:31:08martin.pantersetmessages: + msg274397
2016-09-05 04:26:18eliassetmessages: + msg274388
2016-09-03 12:27:24martin.pantersetmessages: + msg274291
2016-09-01 19:25:00eliassetfiles: + pep467_attempt2.patch

messages: + msg274170
2016-09-01 18:37:26ethan.furmansetmessages: + msg274165
2016-09-01 18:19:14serhiy.storchakasetfiles: + no_bytes_from_int.patch
nosy: + serhiy.storchaka
messages: + msg274161

2016-09-01 17:12:39eliassetmessages: + msg274155
2016-09-01 09:20:00martin.pantersetmessages: + msg274109
2016-09-01 07:08:34martin.pantersetfiles: + pep467_doc_changes.patch
2016-09-01 07:07:45martin.pantersetfiles: + pep467.patch

messages: + msg274099
2016-09-01 05:47:33eliassetmessages: + msg274098
2016-09-01 01:27:18martin.pantersetnosy: + martin.panter

messages: + msg274085
stage: patch review
2016-09-01 01:18:06eliassetfiles: + pep467_doc_changes.patch

messages: + msg274084
2016-09-01 01:02:06ethan.furmansetnosy: + ethan.furman
2016-09-01 00:56:19eliascreate