
Author Neil.Hodgson
Recipients Neil.Hodgson, ethan.furman, ezio.melotti, georg.brandl, pitrou, serhiy.storchaka, vstinner
Date 2013-04-03.22:54:56
Message-id <1365029696.53.0.503821886831.issue17615@psf.upfronthosting.co.za>
Content
For 32-bit Windows, the code generated for unicode_compare is quite slow.

Each call to PyUnicode_READ performs either 1 or 2 kind checks, and there are 2 calls to PyUnicode_READ inside the loop. A compiler may decide to move the kind checks out of the loop and specialize the loop, but MSVC 2010 appears not to do so. The assembler (32-bit build) for each PyUnicode_READ looks like:

    mov    ecx, DWORD PTR _kind1$[ebp]
    cmp    ecx, 1
    jne    SHORT $LN17@unicode_co@2
    lea    ecx, DWORD PTR [ebx+eax]
    movzx    edx, BYTE PTR [ecx+edx]
    jmp    SHORT $LN16@unicode_co@2
$LN17@unicode_co@2:
    cmp    ecx, 2
    jne    SHORT $LN15@unicode_co@2
    movzx    edx, WORD PTR [ebx+edi]
    jmp    SHORT $LN16@unicode_co@2
$LN15@unicode_co@2:
    mov    edx, DWORD PTR [ebx+esi]
$LN16@unicode_co@2:

The kind1/kind2 variables aren't even going into registers, and at least one test+branch and a jump are executed for every character; the 2 and 4 byte kinds need two tests. len1 and len2 don't get to go into registers either.
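The hoisting that MSVC 2010 does not appear to perform can be sketched by hand. The code below is only an illustration, not the CPython source: read_char is a hypothetical stand-in for PyUnicode_READ, and compare_generic, compare_specialized, KINDS and COMPARE_LOOP are names invented for this sketch. The first function tests the kind for every character; the second switches on the two kinds once and then runs a loop with fixed element types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PyUnicode_READ: kind is the character width
   in bytes (1, 2 or 4) and is tested on every call. */
static uint32_t read_char(int kind, const void *data, size_t i)
{
    assert(kind == 1 || kind == 2 || kind == 4);
    if (kind == 1)
        return ((const uint8_t *)data)[i];
    if (kind == 2)
        return ((const uint16_t *)data)[i];
    return ((const uint32_t *)data)[i];
}

/* Order two strings by length once their common prefix is equal. */
static int compare_lengths(size_t len1, size_t len2)
{
    if (len1 == len2)
        return 0;
    return len1 < len2 ? -1 : 1;
}

/* Generic loop: unless the compiler hoists them, the kind tests stay
   inside the loop body and run for every character. */
int compare_generic(int kind1, const void *data1, size_t len1,
                    int kind2, const void *data2, size_t len2)
{
    size_t n = len1 < len2 ? len1 : len2;
    size_t i;
    for (i = 0; i < n; i++) {
        uint32_t c1 = read_char(kind1, data1, i);
        uint32_t c2 = read_char(kind2, data2, i);
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
    }
    return compare_lengths(len1, len2);
}

/* Combine both kinds into one switch key (kinds are 1, 2 or 4). */
#define KINDS(k1, k2) ((k1) * 8 + (k2))

/* Stamp out a loop whose element types are fixed at compile time. */
#define COMPARE_LOOP(T1, T2)                         \
    do {                                             \
        const T1 *p1 = (const T1 *)data1;            \
        const T2 *p2 = (const T2 *)data2;            \
        size_t i;                                    \
        for (i = 0; i < n; i++) {                    \
            uint32_t c1 = p1[i], c2 = p2[i];         \
            if (c1 != c2)                            \
                return c1 < c2 ? -1 : 1;             \
        }                                            \
    } while (0)

/* Specialized version: the kinds are tested once, outside the loop. */
int compare_specialized(int kind1, const void *data1, size_t len1,
                        int kind2, const void *data2, size_t len2)
{
    size_t n = len1 < len2 ? len1 : len2;
    switch (KINDS(kind1, kind2)) {
    case KINDS(1, 1): COMPARE_LOOP(uint8_t, uint8_t); break;
    case KINDS(1, 2): COMPARE_LOOP(uint8_t, uint16_t); break;
    case KINDS(1, 4): COMPARE_LOOP(uint8_t, uint32_t); break;
    case KINDS(2, 1): COMPARE_LOOP(uint16_t, uint8_t); break;
    case KINDS(2, 2): COMPARE_LOOP(uint16_t, uint16_t); break;
    case KINDS(2, 4): COMPARE_LOOP(uint16_t, uint32_t); break;
    case KINDS(4, 1): COMPARE_LOOP(uint32_t, uint8_t); break;
    case KINDS(4, 2): COMPARE_LOOP(uint32_t, uint16_t); break;
    case KINDS(4, 4): COMPARE_LOOP(uint32_t, uint32_t); break;
    default: assert(0); break;
    }
    return compare_lengths(len1, len2);
}
```

Both functions should return the same result for any valid input; the specialized one trades nine copies of the loop for the removal of the per-character branches. Whether that trade pays off on a given compiler would need measuring.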

My system isn't set up for 64-bit MSVC 2010, but the code from 64-bit MSVC 2012 shows that all the variables have been moved into registers while the kind checking is still inside the loop. This accounts for the better results with 64-bit Python 3.3 on Windows, which are still not as good as on Unix or with Python 3.2.

; 10431:         c1 = PyUnicode_READ(kind1, data1, i);

	cmp	rsi, 1
	jne	SHORT $LN17@unicode_co
	lea	rax, QWORD PTR [r9+rcx]
	movzx	r8d, BYTE PTR [rax+rbx]
	jmp	SHORT $LN16@unicode_co
$LN17@unicode_co:
	cmp	rsi, 2
	jne	SHORT $LN15@unicode_co
	movzx	r8d, WORD PTR [r9+r11]
	jmp	SHORT $LN16@unicode_co
$LN15@unicode_co:
	mov	r8d, DWORD PTR [r9+r10]
$LN16@unicode_co:

I have attached the 32-bit assembler listing.
History
Date                 User          Action  Args
2013-04-03 22:54:56  Neil.Hodgson  set     recipients: + Neil.Hodgson, georg.brandl, pitrou, vstinner, ezio.melotti, ethan.furman, serhiy.storchaka
2013-04-03 22:54:56  Neil.Hodgson  set     messageid: <1365029696.53.0.503821886831.issue17615@psf.upfronthosting.co.za>
2013-04-03 22:54:56  Neil.Hodgson  link    issue17615 messages
2013-04-03 22:54:56  Neil.Hodgson  create