classification
Title: _winreg.EnumValue fails when the registry data includes multibyte unicode characters
Type: behavior
Components: Windows
Versions: Python 3.0, Python 2.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: loewis, stutzbach
Priority:
Keywords:

Created on 2008-05-10 17:37 by stutzbach, last changed 2008-05-11 19:21 by loewis.

Messages
msg66542 Author: Daniel Stutzbach (stutzbach) Date: 2008-05-10 17:37
_winreg.EnumValue raises a WindowsError ("More data is available") if
the registry data includes multibyte unicode characters.

Inspecting PyEnumValue in _winreg.c, I believe I see the problem.  The
function uses RegQueryInfoKey to determine the maximum data and key name
sizes to pass to RegEnumValue.

Unfortunately, RegQueryInfoKey returns the size in number of unicode
characters, while RegEnumValue expects a size in bytes.  This is OK if
all the values are ASCII, but it fails if there are any multibyte
unicode characters.

I believe it would be sufficient to multiply the sizes by 4, since
that's the maximum width of a unicode character.
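
To make the character-count versus byte-count distinction concrete, here is a
small Python 3 sketch that compares the two for a few strings. The helper name
encoded_sizes is made up for illustration. It only shows how sizes differ by
encoding and says nothing about what RegQueryInfoKey actually returns.

```python
def encoded_sizes(text):
    """Return the length of text in code points and in encoded bytes."""
    return {
        "code_points": len(text),
        "utf16_bytes": len(text.encode("utf-16-le")),
        "utf8_bytes": len(text.encode("utf-8")),
    }


if __name__ == "__main__":
    # ASCII, a Latin-1 letter, and a character outside the Basic Multilingual Plane.
    for sample in ("abc", "\u00e9", "\U0001F600"):
        print(repr(sample), encoded_sizes(sample))
```

A single code point never needs more than 4 bytes in either UTF-8 or UTF-16,
which is where a factor of 4 would come from.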

The bug exists in at least Python 2.5 and Python 3.0 (based on source
code inspection).

References:

RegEnumValue: http://msdn.microsoft.com/en-us/library/ms724865(VS.85).aspx

RegQueryInfoKey:
http://msdn.microsoft.com/en-us/library/ms724902(VS.85).aspx
msg66547 Author: Martin v. Löwis (loewis) Date: 2008-05-10 17:52
Is that for Python 2.5 or 3.0?
msg66553 Author: Daniel Stutzbach (stutzbach) Date: 2008-05-10 18:14
The bug is in both.

On Sat, May 10, 2008 at 12:52 PM, Martin v. Löwis
<report@bugs.python.org> wrote:
>
> Martin v. Löwis <martin@v.loewis.de> added the comment:
>
> Is that for Python 2.5 or 3.0?
>
> ----------
> nosy: +loewis
>
> __________________________________
> Tracker <report@bugs.python.org>
> <http://bugs.python.org/issue2810>
> __________________________________
>
msg66563 Author: Martin v. Löwis (loewis) Date: 2008-05-10 18:57
Can you please provide a test case then? The 3.0 code doesn't use
RegQueryInfoKey, but RegQueryInfoKeyW.
msg66649 Author: Daniel Stutzbach (stutzbach) Date: 2008-05-11 18:39
After several failed attempts at making a test case, and stepping
through C code with a debugger, I see that my initial diagnosis is
quite wrong.  RegQueryInfoKey *does* return the sizes in units of
bytes (even though the Microsoft documentation says otherwise).  My
apologies.

I do still have a stack trace from an end-user of my python2.5-based
product, showing that _winreg.EnumValue raises:
WindowsError: [Error 234] More data is available

The application reliably crashes on start-up for this user, when
trying to read some registry entries written by another program and
hitting the above exception.

Unfortunately, I have been unable to reproduce the problem locally.  I
tried a variety of Unicode characters (including some that encode to 4
bytes), and that didn't raise an exception.  I also tried putting some
very long data strings (more than 64kb) into the registry, and that
worked fine too (even though the Microsoft documentation says the ANSI
version *should* return the above exception!).
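
A reproduction attempt along these lines might look like the following sketch.
It is not the script that was actually used. The scratch key path and the
function names are hypothetical, and it is written for Python 3, where the
module is named winreg rather than _winreg.

```python
import sys

# Hypothetical scratch key used only for this sketch.
TEST_KEY_PATH = r"Software\EnumValueReproSketch"


def build_test_values():
    """Return (name, data) pairs covering the cases described above."""
    return [
        ("ascii", "plain ascii"),
        ("bmp", "caf\u00e9 \u65e5\u672c\u8a9e"),  # non-ASCII, one UTF-16 unit each
        ("astral", "\U0001F600" * 8),             # 4 bytes per character when encoded
        ("long", "x" * (70 * 1024)),              # more than 64 KB of data
    ]


def run_repro():
    """Write the values under HKEY_CURRENT_USER, enumerate them, then clean up."""
    import winreg  # Windows only; named _winreg on Python 2

    key = winreg.CreateKey(winreg.HKEY_CURRENT_USER, TEST_KEY_PATH)
    try:
        for name, data in build_test_values():
            winreg.SetValueEx(key, name, 0, winreg.REG_SZ, data)
        value_count = winreg.QueryInfoKey(key)[1]  # number of values under the key
        return [winreg.EnumValue(key, i) for i in range(value_count)]
    finally:
        winreg.CloseKey(key)
        winreg.DeleteKey(winreg.HKEY_CURRENT_USER, TEST_KEY_PATH)


if __name__ == "__main__" and sys.platform == "win32":
    for name, data, value_type in run_repro():
        print(name, len(data), value_type)
```

If EnumValue had the suspected problem, one of the calls in run_repro would
be expected to raise the "More data is available" error.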

I'm going to try building a custom PyEnumValue that will dynamically
grow the buffer size when that error occurs.  I'll report back on how
that works out for the end user.
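
The grow-and-retry idea can be sketched in Python as follows. This is only an
illustration of the pattern, not the C patch itself. The fetch callable is a
hypothetical stand-in for a RegEnumValue wrapper, make_fake_fetch exists only
to exercise the loop, and the doubling strategy and the max_size cap are
assumptions.

```python
ERROR_SUCCESS = 0
ERROR_MORE_DATA = 234  # the code reported as "More data is available"


def read_with_growing_buffer(fetch, initial_size=256, max_size=1 << 24):
    """Call fetch(size) with a doubling size until it stops reporting ERROR_MORE_DATA.

    fetch is a hypothetical callable that takes a buffer size in bytes and
    returns a (status, data) tuple.
    """
    size = max(1, initial_size)
    while True:
        status, data = fetch(size)
        if status == ERROR_SUCCESS:
            return data
        if status != ERROR_MORE_DATA:
            raise OSError(status, "registry call failed")
        if size >= max_size:
            raise OSError(status, "value is larger than max_size")
        size = min(size * 2, max_size)


def make_fake_fetch(stored):
    """Simulate a value of len(stored) bytes (test helper, not a real API)."""
    def fetch(size):
        if size < len(stored):
            return ERROR_MORE_DATA, None
        return ERROR_SUCCESS, stored
    return fetch
```

Capping the buffer keeps a persistently failing call from looping forever.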

In the meantime, I'm open to other theories on what might cause
RegEnumValue to fail with that error.

The end user is running Vista, if it matters.
msg66653 Author: Martin v. Löwis (loewis) Date: 2008-05-11 19:21
I suggest using regedit /e to dump the failing key into a file. That
should make it possible to reproduce the problem on a different system.
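
A minimal sketch of scripting that export from Python, assuming the usual
"regedit /e <file> <key>" argument order. The function names are made up for
illustration, and the key path and file name in the usage comment are
placeholders, since the failing key is not named in this report.

```python
import subprocess
import sys


def build_export_command(key_path, out_file):
    """Return the argument list for exporting key_path to out_file."""
    return ["regedit", "/e", out_file, key_path]


def export_key(key_path, out_file):
    """Run the export; regedit exists only on Windows."""
    if sys.platform != "win32":
        raise RuntimeError("regedit is only available on Windows")
    subprocess.check_call(build_export_command(key_path, out_file))


# Example with a placeholder key path:
# export_key(r"HKEY_CURRENT_USER\Software\ExampleVendor\ExampleApp", "failing_key.reg")
```

The resulting .reg file can then be imported on another machine to recreate
the key.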
History
Date                 User       Action  Args
2008-05-11 19:21:48  loewis     set     messages: + msg66653
2008-05-11 18:39:25  stutzbach  set     messages: + msg66649
2008-05-10 18:57:01  loewis     set     messages: + msg66563
2008-05-10 18:14:53  stutzbach  set     messages: + msg66553
2008-05-10 17:52:44  loewis     set     nosy: + loewis; messages: + msg66547
2008-05-10 17:37:33  stutzbach  create