Message67222
On 2008-05-23 05:38, Raymond Hettinger wrote:
> Raymond Hettinger <rhettinger@users.sourceforge.net> added the comment:
>
> I don't think this is the right thing to do. The hash algorithms are
> defined in terms of bytes, but Unicode is abstracted from any
> byte-level encoding. It doesn't make sense to convert using an arbitrary
> encoding (such as UTF-8) because someone else might hash the same text
> using a different encoding.
>
> Marc, do you concur?
Yes.
While we could fix an encoding to use for converting Unicode to
bytes, e.g. UTF-8, you clearly want hash functions to be portable
across platforms, programming languages and implementations.
Other languages or implementations might choose UTF-16 or some
other encoding, so it's not clear which encoding to choose and
there doesn't seem to be a standard for this either.
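
To illustrate the portability concern, here is a minimal sketch using Python 3's hashlib. The sample string and the three encodings are assumptions chosen for this example and are not part of the original discussion:

```python
import hashlib

# Hypothetical sample text containing non-ASCII characters ("Grüße").
text = "Gr\u00fc\u00dfe"

# Hash functions consume bytes, so the text must be encoded first.
# Each encoding produces a different byte sequence for the same text.
digests = {
    enc: hashlib.sha256(text.encode(enc)).hexdigest()
    for enc in ("utf-8", "utf-16-le", "latin-1")
}

for enc, digest in digests.items():
    print(f"{enc:10} {digest}")

# Different bytes in, different digests out.
all_distinct = len(set(digests.values())) == len(digests)
print("all digests differ:", all_distinct)
```

Two parties hashing the same text only get matching digests if they agree on the encoding beforehand, which is why the choice is better left explicit in the caller's code.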
-1 on the idea. Martin already closed and rejected the idea for me.
Thanks,
--
Marc-Andre Lemburg
eGenix.com
Date | User | Action | Args
2008-05-23 08:32:58 | lemburg | set | spambayes_score: 0.000493089 -> 0.00049308926; recipients: + lemburg, loewis, rhettinger, vvro
2008-05-23 08:32:50 | lemburg | link | issue2948 messages
2008-05-23 08:32:44 | lemburg | create |