
Author bmispelon
Recipients bmispelon
Date 2012-10-09.10:23:53
Message-id <1349778233.78.0.111302090144.issue16173@psf.upfronthosting.co.za>
In-reply-to
Content
When a syntax error happens, the exception that gets printed has an extra line with a caret that helps locate the error.

If the line also contains an identifier with non-ASCII characters, the caret is misaligned (too far to the right).

I've investigated briefly and it seems that the offset attribute on the SyntaxError has a wrong value:

    for varname in ['a', 'é', '蟒']:  # 1, 2 and 3 bytes in UTF-8
        try:
            exec("%s$" % varname)  # SyntaxError
        except SyntaxError as e:
            print(e.offset)  # should be 2

The example above prints 2, 3, and 4 when it should be printing 2 every time.

It seems that the offset is calculated from the size of the line in bytes (presumably its UTF-8 encoding) instead of its size in characters.
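
To illustrate the suspected mismatch, here is a minimal sketch that converts a byte-based offset back to a character-based one. It assumes the reported offset is a 1-based count of UTF-8 bytes, which is a guess based on the numbers above and not something confirmed in the interpreter source. The helper name `byte_offset_to_char_offset` is hypothetical and not part of CPython.

```python
def byte_offset_to_char_offset(line, byte_offset):
    """Convert a 1-based offset in UTF-8 bytes to a 1-based offset in characters.

    Assumption: byte_offset counts bytes of the UTF-8 encoding of `line`.
    """
    encoded = line.encode("utf-8")
    # Bytes that come before the reported position.
    prefix = encoded[:byte_offset - 1]
    # errors="replace" keeps this from raising if the offset lands mid-character.
    return len(prefix.decode("utf-8", errors="replace")) + 1


# The offsets 2, 3 and 4 printed by the example all map back to column 2.
for line, reported in [("a$", 2), ("é$", 3), ("蟒$", 4)]:
    print(byte_offset_to_char_offset(line, reported))
```

If the assumption holds, this prints 2 three times, matching the column where the caret should be placed.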

I've tested and reproduced the issue on 3.2.2 and on a recent clone of the Mercurial repository (dd5e98ddcd39).
History
Date                 User       Action  Args
2012-10-09 10:23:53  bmispelon  set     recipients: + bmispelon
2012-10-09 10:23:53  bmispelon  set     messageid: <1349778233.78.0.111302090144.issue16173@psf.upfronthosting.co.za>
2012-10-09 10:23:53  bmispelon  link    issue16173 messages
2012-10-09 10:23:53  bmispelon  create