
Author loewis
Recipients loewis, pitrou, valhallasw, vstinner
Date 2010-10-30.21:01:17
Message-id <1288472480.54.0.280490517821.issue10254@psf.upfronthosting.co.za>
In-reply-to
Content
The change from issue1054943 is indeed bogus. As written, the code will happily run over starters, even though a blocked starter means that subsequent characters can't possibly be combinable. That is how the code manages to combine, in 'Li\u030dt-s\u1e73\u0301', the final U+0301 with the i, even though there are several starters in between.
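
To make the starters in that example visible, here is a small sketch (not part of the original report) that lists the canonical combining class of each code point in the decomposed string, using only the standard unicodedata module. A class of 0 marks a starter; the variable names are illustrative.

```python
import unicodedata

# The example string from the report.
s = 'Li\u030dt-s\u1e73\u0301'

# Canonical composition works on the decomposed (NFD) form.
decomposed = unicodedata.normalize('NFD', s)

# Canonical combining class per code point; 0 means "starter".
classes = [unicodedata.combining(ch) for ch in decomposed]

for ch, ccc in zip(decomposed, classes):
    print(f'U+{ord(ch):04X}  ccc={ccc:3d}  {unicodedata.name(ch)}')
```

The starters t, -, s and u all sit between the i and the final U+0301, so the U+0301 must not be allowed to reach back to the i.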

I think the code should work like this:

if comb != 0 and comb1 == 0:
  # starter after character with higher class:
  # not combinable, and all subsequent characters will be blocked
  # as well
  break
if comb != 0 and comb1 == comb:
  # blocked combining character, continue searching
  i1 += 1
  continue
# candidate pair, check whether *i and *i1 are combinable
It's unfortunate that the patch had been backported to 2.6.6; we can't fix it there anymore.
History
Date                 User    Action  Args
2010-10-30 21:01:20  loewis  set     recipients: + loewis, pitrou, vstinner, valhallasw
2010-10-30 21:01:20  loewis  set     messageid: <1288472480.54.0.280490517821.issue10254@psf.upfronthosting.co.za>
2010-10-30 21:01:18  loewis  link    issue10254 messages
2010-10-30 21:01:18  loewis  create