This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: no longer possible to hash arrays
Type: behavior Stage: test needed
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: gregory.p.smith Nosy List: barry, benjamin.peterson, chuck, exarkun, ezio.melotti, georg.brandl, gregory.p.smith, ivank, loewis, pitrou
Priority: normal Keywords: patch

Created on 2009-05-20 17:17 by exarkun, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
array_new_buffer.patch chuck, 2009-09-25 22:37
hasharray.patch pitrou, 2009-09-29 10:56
hasharray.patch chuck, 2009-09-29 19:28
hashlib.diff loewis, 2009-11-23 08:37
Messages (31)
msg88122 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2009-05-20 17:17
It used to be possible to use hashlib with arrays; it no longer seems
possible.

exarkun@charm:~$ python -c '
import sys, hashlib, array
print sys.version_info
print hashlib.sha1(array.array("b", [1, 2, 3])).hexdigest()
'
(2, 5, 2, 'final', 0)
7037807198c22a7d2b0807371d763779a84fdfcf
exarkun@charm:~$ ~/Projects/python/trunk/python -c '
import sys, hashlib, array
print sys.version_info
print hashlib.sha1(array.array("b", [1, 2, 3])).hexdigest()
'
sys.version_info(major=2, minor=7, micro=0, releaselevel='alpha', serial=0)
Traceback (most recent call last):
  File "<string>", line 4, in <module>
TypeError: object supporting the buffer API required
exarkun@charm:~$
msg88123 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-05-20 17:45
That's probably because hashlib switched to using the new buffer API,
and array.array() only supports the old one in 2.x.

Perhaps hashlib should have some fallback code for such legacy types...
msg88209 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-05-22 19:38
exarkun: would you like to propose a patch?
msg88210 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2009-05-22 19:45
I would certainly like to, but unfortunately at present I am unable to.
msg88212 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2009-05-22 19:56
Perhaps Gregory has some idea about how this can most easily be
resolved, since I think he did the work on #3745 which introduced this
change in behavior.
msg88217 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-05-22 21:17
to confirm, this is only a problem in 2.7 trunk right?

if so, i won't rush a fix.

but yes fallback code for legacy types or adding buffer api support to 
array would work.
msg90470 - (view) Author: ivank (ivank) Date: 2009-07-13 07:45
It no longer works with objects created with buffer() either:

>>> hashlib.sha1(buffer("x")).hexdigest()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object supporting the buffer API required
msg90502 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-07-13 21:05
> It no longer works with objects created with buffer() either:

That's because the buffer objects don't implement the buffer protocol
(in particular, they don't define the Py_TPFLAGS_HAVE_NEWBUFFER flag,
and the bf_getbuffer/bf_releasebuffer operations that go with that flag.
msg92422 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2009-09-08 16:17
Can this ticket be marked as a release blocker so it's not forgotten
about for 2.7?
msg92428 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-09-08 20:06
Can't hurt :)
msg92922 - (view) Author: Jan (chuck) * Date: 2009-09-21 07:21
I tried to implement the new buffer API, but as soon as I add 
bf_getbuffer/bf_releasebuffer to PyBufferProcs writing an array to a file 
breaks:
  f.write(a)
  TypeError: must be contiguous buffer, not array.array

I searched through the file functions, but couldn't find the point where 
this happens. Has anybody a suggestion? Does file.write() use the old 
buffers? Doesn't it use reprfunc?
msg93137 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-09-25 20:35
> I tried to implement the new buffer API, but as soon as I add 
> bf_getbuffer/bf_releasebuffer to PyBufferProcs writing an array to a file 
> breaks:

You should take a look at the array module in py3k, it supports the new
buffer API. With a bit of luck, the two relevant functions shouldn't be
too difficult to backport to the trunk.

By the way, to signal that it supports the new buffer API, you must add
Py_TPFLAGS_HAVE_NEWBUFFER to the type flags.
msg93148 - (view) Author: Jan (chuck) * Date: 2009-09-25 22:37
I added the two functions for the new buffer API. Having an exported 
memory view needs some handling elsewhere, so the array does not change. I 
also added tests for checking that behaviour.

Mainly I copypasted code from py3k which involved redirecting to 
array_resize() instead of doing it manually and doing the checking there.
msg93149 - (view) Author: Jan (chuck) * Date: 2009-09-25 23:11
The patch breaks five unit tests from other modules, I'll look into it.
msg93156 - (view) Author: Jan (chuck) * Date: 2009-09-26 09:49
I stumbled upon the following function:

static Py_ssize_t
convertbuffer(PyObject *arg, void **p, char **errmsg)
in Python/getargs.c

The first thing the function does is checking if the object implements 
the old buffer api, but also fails if pb->bf_releasebuffer != NULL. So I 
guess it's also making sure the new buffer api is not implemented.

What's the thought behind this? Removing that condition fixes three of 
the failing tests but breaks none.
msg93157 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-09-26 11:30
> The patch breaks five unit tests from other modules, I'll look into it.

What are those tests?
msg93158 - (view) Author: Jan (chuck) * Date: 2009-09-26 11:37
> > The patch breaks five unit tests from other modules, I'll look into
> > it.
> 
> What are those tests?

test_codecs, test_ioctl, test_multiprocessing, test_socket and 
test_struct.
msg93160 - (view) Author: Jan (chuck) * Date: 2009-09-26 11:48
You asked "what" not "which" :)

test test_codecs failed -- Traceback (most recent call last):
  File "/Users/jan/src/python-svn/Lib/test/test_codecs.py", line 531, in 
test_array
    codecs.readbuffer_encode(array.array("c", "spam")),
TypeError: must be string or read-only buffer, not array.array


test test_ioctl failed -- Traceback (most recent call last):
  File "/Users/jan/src/python-svn/Lib/test/test_ioctl.py", line 34, in 
test_ioctl_mutate
    r = fcntl.ioctl(tty, termios.TIOCGPGRP, buf, 1)
TypeError: ioctl requires a file or file descriptor, an integer and 
optionally an integer or buffer argument


test test_multiprocessing failed -- Traceback (most recent call last):
  File "/Users/jan/src/python-svn/Lib/test/test_multiprocessing.py", 
line 1269, in test_connection
    self.assertEqual(conn.send_bytes(arr), None)
TypeError: must be string or read-only buffer, not array.array


test test_socket failed -- errors occurred; run in verbose mode for 
details
(have to recheck)


test test_struct failed -- Traceback (most recent call last):
  File "/Users/jan/src/python-svn/Lib/test/test_struct.py", line 468, in 
test_unpack_with_buffer
    value, = struct.unpack('>I', data)
error: unpack requires a string argument of length 4
msg93172 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-09-27 15:03
> The first thing the function does is checking if the object implements 
> the old buffer api, but also fails if pb->bf_releasebuffer != NULL. So I 
> guess it's also making sure the new buffer api is not implemented.
> 
> What's the thought behind this?

The idea is that if a type implements the bf_releasebuffer function, it
shouldn't be used in a situation where the caller won't try to release
the buffer.

It looks a bit unwarranted though, because a type implementing the old
buffer API should be able to function without the releasing anyway.
Otherwise it wouldn't implement that API at all.

Martin, ISTR you did that addition, would you be opposed to removing the
NULL check on bf_releasebuffer when trying to get a buffer through the
old buffer API?
msg93275 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-09-29 10:56
After thinking about it, we should remover the bf_releasebuffer checks
when using the old typecodes '#' (the doc should also clearly state that
these are unsafe and '*' is preferred).

If a type chooses to implement the /old/ buffer API and if a method
requires the /old/ buffer API instead of the new one (that is, by using
a '#' typecode), we have to accept these choices.

Here's a patch for trunk, combining the new buffer API for `array` and
the checks removal in getargs.c.
msg93301 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2009-09-29 16:31
Moving to deferred blocker to get this out of the way for 2.6.3
msg93309 - (view) Author: Jan (chuck) * Date: 2009-09-29 17:57
I was looking at the remaining differences between Modules/arraymodule.c 
in python 2.7 and the 3k branch, in particular I was testing why 
including the changes to the function array_ass_slice into the patch 
breaks the unit test of the array module.

The manual resizing of the memory basically was replaced by a call to 
array_resize with the advantage, that checking for exported memory views 
is done there (which is not necessary because it's already done in the 
function) and the code gets clearer. I think when PyMem_RESIZE is called 
the pointer to the memory might change. So in 3k this now happens in 
array_resize, so the array->ob_item pointer changes but not it's local 
copy in array_ass_slice. Isn't that potentially causing trouble in 
python 3k?
msg93320 - (view) Author: Jan (chuck) * Date: 2009-09-29 19:28
I fixed the array_ass_slice for python 2.7 in the attached patch.

The problem should apply to python 3k as well: firstly the above which 
might stay unnoticed and secondly the function moves to much memory if the 
array size is increased: (Py_SIZE(a)-d-ihigh) items should be moved, 
because Py_SIZE(a) was already modified by array_resize, but the function 
moves (Py_SIZE(a)-ihigh) items.

The attached patch for python 2.7 passes all unit tests for me.
msg93338 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-09-29 23:32
> I fixed the array_ass_slice for python 2.7 in the attached patch.

I would prefer if I opened a separate issue for this bug. That way it
will be easier to port the patch separately to py3k.
msg93339 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-09-29 23:41
> I would prefer if I opened a separate issue for this bug.

if /you/ opened, of course...
msg95615 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-11-23 08:37
I think the error is really in _hashlib, not in the array object. It
should not require 3.x style buffers, but continue to support 2.x
readbuffers. Attached is a patch that takes this route to fixing the bug.

As for the checks for bf_releasebuffer: I still think they are
necessary. If an object implements bf_releasebuffer, that means that the
object may change the buffer underneath, unless proper locking and
unlocking takes place. Indeed, the array's getreadbuf operation is not
thread-safe. It might be possible to remove them if it is clarified that
anybody calling getreadbuffer must not release the GIL while they hold
on to the buffer.
msg95650 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-23 19:51
> As for the checks for bf_releasebuffer: I still think they are
> necessary. If an object implements bf_releasebuffer, that means that the
> object may change the buffer underneath, unless proper locking and
> unlocking takes place.

I know, but the problem is that by switching some argument definitions
to "s*" and friends we have broken compatibility for the (admittedly
uncommon) use case of giving an array object to those functions. Since
we probably don't want to backout those changes perhaps adding support
for the new buffer API to the array object is the best course of action.
msg96105 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2009-12-08 02:44
Will this be fixed for Python 2.7 final?
msg97193 - (view) Author: ivank (ivank) Date: 2010-01-04 07:02
I believe this was fixed in r77252, which was fixing http://bugs.python.org/issue3745
msg97223 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2010-01-04 21:18
Even if it's fixed a patch with tests should be submitted before closing this issue.
msg97238 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2010-01-05 00:05
Done in r77313.
History
Date User Action Args
2022-04-11 14:56:49adminsetgithub: 50321
2010-01-05 00:05:02benjamin.petersonsetstatus: open -> closed

nosy: + benjamin.peterson
messages: + msg97238

resolution: fixed
2010-01-04 21:18:52ezio.melottisetpriority: deferred blocker -> normal

nosy: + ezio.melotti
messages: + msg97223

stage: needs patch -> test needed
2010-01-04 07:02:52ivanksetmessages: + msg97193
2009-12-08 02:44:37exarkunsetmessages: + msg96105
2009-11-23 19:51:23pitrousetmessages: + msg95650
2009-11-23 08:37:07loewissetfiles: + hashlib.diff

messages: + msg95615
2009-09-29 23:41:34pitrousetmessages: + msg93339
2009-09-29 23:32:44pitrousetmessages: + msg93338
2009-09-29 19:28:47chucksetfiles: + hasharray.patch

messages: + msg93320
2009-09-29 17:57:10chucksetmessages: + msg93309
2009-09-29 16:31:05barrysetpriority: release blocker -> deferred blocker
nosy: + barry
messages: + msg93301

2009-09-29 10:56:45pitrousetfiles: + hasharray.patch

messages: + msg93275
2009-09-27 15:03:33pitrousetmessages: + msg93172
2009-09-26 11:49:00chucksetmessages: + msg93160
2009-09-26 11:37:47chucksetmessages: + msg93158
2009-09-26 11:30:15pitrousetmessages: + msg93157
2009-09-26 09:49:15chucksetmessages: + msg93156
2009-09-25 23:11:19chucksetmessages: + msg93149
2009-09-25 22:37:38chucksetfiles: + array_new_buffer.patch
keywords: + patch
messages: + msg93148
2009-09-25 20:35:32pitrousetmessages: + msg93137
2009-09-21 07:21:57chucksetnosy: + chuck
messages: + msg92922
2009-09-08 20:06:20georg.brandlsetpriority: high -> release blocker
nosy: + georg.brandl
messages: + msg92428

2009-09-08 16:17:39exarkunsetmessages: + msg92422
2009-07-13 21:05:23loewissetmessages: + msg90502
2009-07-13 07:45:33ivanksetmessages: + msg90470
2009-05-22 21:17:51gregory.p.smithsetassignee: gregory.p.smith
messages: + msg88217
2009-05-22 19:56:17exarkunsetnosy: + gregory.p.smith
messages: + msg88212
2009-05-22 19:45:10exarkunsetmessages: + msg88210
2009-05-22 19:38:39loewissetnosy: + loewis
messages: + msg88209
2009-05-20 17:45:17pitrousetpriority: high

nosy: + pitrou
messages: + msg88123

stage: needs patch
2009-05-20 17:26:51ivanksetnosy: + ivank
2009-05-20 17:17:13exarkuncreate