Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_sha256 et al. encode to UTF-8 by default #47995

Closed
hagen mannequin opened this issue Sep 1, 2008 · 23 comments
Closed

_sha256 et al. encode to UTF-8 by default #47995

hagen mannequin opened this issue Sep 1, 2008 · 23 comments
Assignees
Labels
extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error

Comments

@hagen
Copy link
Mannequin

hagen mannequin commented Sep 1, 2008

BPO 3745
Nosy @malemburg, @gpshead, @pitrou, @vstinner, @kmtracey

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = 'https://github.com/gpshead'
closed_at = <Date 2010-01-03.00:48:11.239>
created_at = <Date 2008-09-01.09:27:05.177>
labels = ['extension-modules', 'type-bug']
title = '_sha256 et al. encode to UTF-8 by default'
updated_at = <Date 2010-01-04.11:49:27.468>
user = 'https://bugs.python.org/hagen'

bugs.python.org fields:

activity = <Date 2010-01-04.11:49:27.468>
actor = 'lemburg'
assignee = 'gregory.p.smith'
closed = True
closed_date = <Date 2010-01-03.00:48:11.239>
closer = 'gregory.p.smith'
components = ['Extension Modules']
creation = <Date 2008-09-01.09:27:05.177>
creator = 'hagen'
dependencies = []
files = []
hgrepos = []
issue_num = 3745
keywords = ['26backport']
message_count = 23.0
messages = ['72220', '73550', '79112', '79115', '81726', '81738', '81739', '81740', '81741', '81820', '81858', '96431', '96501', '96934', '96935', '96936', '96991', '96995', '97151', '97152', '97153', '97202', '97203']
nosy_count = 8.0
nosy_names = ['lemburg', 'gregory.p.smith', 'pitrou', 'vstinner', 'kmtracey', 'hagen', 'rpetrov', 'ebfe']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue3745'
versions = ['Python 2.7']

@hagen
Copy link
Mannequin Author

hagen mannequin commented Sep 1, 2008

Whereas openssl-based _hashlib refuses to accept unencoded strings:

>>> _hashlib.openssl_sha256("\xff")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object supporting the buffer API required

the _sha256 version encodes to UTF-8 by default:

>>> _sha256.sha256("\xff").digest() ==
_sha256.sha256("\xff".encode("utf-8")).digest()
True

I think refusing is better, but at least the behaviour should be
consistent. Same for the other algorithms in hashlib.

@hagen hagen mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Sep 1, 2008
@gpshead
Copy link
Member

gpshead commented Sep 22, 2008

agreed. most platforms should be using the openssl version, i will
update the non-openssl implementations to behave the same.

I don't think this is worth being a release blocker. I'll do it for 3.0.1.

@gpshead gpshead self-assigned this Sep 22, 2008
@hagen
Copy link
Mannequin Author

hagen mannequin commented Jan 5, 2009

Seems that this problem is being taken care of in issue bpo-4751.

@ebfe
Copy link
Mannequin

ebfe mannequin commented Jan 5, 2009

solved in bpo-4818 and bpo-4821

@gpshead
Copy link
Member

gpshead commented Feb 12, 2009

Fixed in py3k branch r69524.

needs porting to release30-maint.

possibly also release26-maint and trunk.

@pitrou
Copy link
Member

pitrou commented Feb 12, 2009

I don't think backporting to 2.6 is fine, people may be relying on the
current behaviour.
As for 3.0.1, you'd better be quick, it's scheduled for tomorrow.

@vstinner
Copy link
Member

Wooops, my mouse clicked on Remove!? I removed Message73550, sorry
gregory. Here was the content of the message:
---
agreed. most platforms should be using the openssl version, i will
update the non-openssl implementations to behave the same.

I don't think this is worth being a release blocker. I'll do it for
3.0.1.
---

@vstinner
Copy link
Member

I agree with pitrou: leave python 2.6 unchanged, but please backport
to 3.0.1 ;-)

@vstinner
Copy link
Member

gpolo gave me the solution to restore a deleted message:
http://bugs.python.org/issueXXXX?@action=edit&@add@messages=MSGNUM

@gpshead
Copy link
Member

gpshead commented Feb 12, 2009

fixed in release30-maint r69555.

sounds like its out of the question for 2.6. i will backport it to
trunk.

@gpshead
Copy link
Member

gpshead commented Feb 13, 2009

fixed in trunk r69561.

@gpshead gpshead added extension-modules C modules in the Modules dir and removed stdlib Python modules in the Lib dir labels Feb 13, 2009
@gpshead gpshead closed this as completed Feb 13, 2009
@malemburg
Copy link
Member

Gregory, this patch should not have been backported to Python 2.7. See issue

Could you please revert the change on trunk ? Thanks.

A much better solution would be to issue a -3 warning in case a Unicode
object is passed to the hash functions. However, this is major work to
get right, since the "s#" parser marker also accepts buffer interfaces.

@malemburg malemburg reopened this Dec 15, 2009
@rpetrov
Copy link
Mannequin

rpetrov mannequin commented Dec 16, 2009

What about inconsistent module build - as is reported some platform
build sha256 module that support unicode but most it is not build if
openssl is version 0.8+. Same for sha512 module.
If unicode for hashlib is not acceptable for trunk than why is not build
always sha{256|512} without to check for openssl version number ?

@gpshead
Copy link
Member

gpshead commented Dec 28, 2009

lemburg - see which issue #?

Anyways perhaps the right thing to do instead of trunk r65961 would have
been to change the s# to an s*.

Undoing it will be more painful now as several changes have gone in since
that require undoing and possibly redoing differently.

@gpshead
Copy link
Member

gpshead commented Dec 28, 2009

rpetrov - I couldn't really understand your message so I'm not sure if I'm
answering the right things: yes both the openssl and non-openssl modules
need to behave identically. the reason openssl is used when possible is
that its optimized hash functions are several times faster than the plain
C versions in the individual modules.

@kmtracey
Copy link
Mannequin

kmtracey mannequin commented Dec 28, 2009

I think the missing issue reference is to this thread on python-dev:

http://mail.python.org/pipermail/python-dev/2009-December/094574.html

@rpetrov
Copy link
Mannequin

rpetrov mannequin commented Dec 29, 2009

gregory - refer to setup.py logic to build modules

@malemburg
Copy link
Member

Gregory P. Smith wrote:

Gregory P. Smith <greg@krypto.org> added the comment:

lemburg - see which issue #?

Sorry, the message got truncated for some reason.

I was referring to http://bugs.python.org/issue3745

This was discussed on python-dev: http://mail.python.org/pipermail/python-dev/2009-December/094593.html

Anyways perhaps the right thing to do instead of trunk r65961 would have
been to change the s# to an s*.

That would have worked as well.

Undoing it will be more painful now as several changes have gone in since
that require undoing and possibly redoing differently.

Using s* should pretty much avoid the need to use GET_BUFFER_VIEW_OR_ERROUT(),
so if you want to keep the other changes, removing the use of the
macro should be fairly straight-forward, unless I'm missing something.

@gpshead
Copy link
Member

gpshead commented Jan 2, 2010

trunk r77252 switches python 2.7 to use 's*' for argument parsing. unicodes can be hashed (encoded to the system default encoding by s*) again.

This change has been blocked from being merged into py3k unless someone decides we actually want this magic unicode encoding behavior to exist there as well.

setup.py has also been updated to compile all versions of the hash algorithm modules when Py_DEBUG is defined. I'll update tests run on all implementations next so that it is easier for developers to maintain identical behavior across all implementations without needing to explicitly remember to reconfigure their setup and test those.

@gpshead
Copy link
Member

gpshead commented Jan 2, 2010

In order to get a -3 PyErr_WarnPy3k warning for unicode being passed to hashlib objects (a nice idea) I suggest creating an additonal 's*' like thing ('s3' perhaps?) in Python/getargs.c for that purpose rather than modifying all of the hashlib modules to accept an O, type check it and warn, and then re-parse it as a s* (that'd be a lot of tedious code duplication).

@gpshead
Copy link
Member

gpshead commented Jan 3, 2010

I believe everything in here has been addressed. Please open new issues with details for anything that doesn't quite right.

@gpshead gpshead closed this as completed Jan 3, 2010
@malemburg
Copy link
Member

Gregory P. Smith wrote:

Gregory P. Smith <greg@krypto.org> added the comment:

trunk r77252 switches python 2.7 to use 's*' for argument parsing. unicodes can be hashed (encoded to the system default encoding by s*) again.

This change has been blocked from being merged into py3k unless someone decides we actually want this magic unicode encoding behavior to exist there as well.

Thanks for updating the implementation.

@malemburg
Copy link
Member

Gregory P. Smith wrote:

Gregory P. Smith <greg@krypto.org> added the comment:

In order to get a -3 PyErr_WarnPy3k warning for unicode being passed to hashlib objects (a nice idea) I suggest creating an additonal 's*' like thing ('s3' perhaps?) in Python/getargs.c for that purpose rather than modifying all of the hashlib modules to accept an O, type check it and warn, and then re-parse it as a s* (that'd be a lot of tedious code duplication).

Good idea. We're likely going to need this in more places, so I'm +1 on
adding an "s3" parser marker.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants