Use correct encoding for printing SyntaxErrors #40933

atsuoishimoto · 2004-09-20T13:37:30Z

BPO	1031213
Nosy	@malemburg, @gvanrossum, @loewis, @atsuoishimoto
Files	parsetok.patch 1031213.patch display_exception.py

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = 'https://github.com/loewis'
closed_at = <Date 2007-09-04.14:23:45.464>
created_at = <Date 2004-09-20.13:37:30.000>
labels = ['interpreter-core']
title = 'Use correct encoding for printing SyntaxErrors'
updated_at = <Date 2007-11-15.20:40:02.938>
user = 'https://github.com/atsuoishimoto'

bugs.python.org fields:

activity = <Date 2007-11-15.20:40:02.938>
actor = 'gvanrossum'
assignee = 'loewis'
closed = True
closed_date = <Date 2007-09-04.14:23:45.464>
closer = 'loewis'
components = ['Interpreter Core']
creation = <Date 2004-09-20.13:37:30.000>
creator = 'ishimoto'
dependencies = []
files = ['6255', '6256', '8508']
hgrepos = []
issue_num = 1031213
keywords = ['patch']
message_count = 20.0
messages = ['46912', '46913', '46914', '46915', '46916', '46917', '46918', '55641', '55642', '56319', '56334', '56335', '56338', '56339', '56340', '56347', '56435', '56445', '57519', '57558']
nosy_count = 5.0
nosy_names = ['lemburg', 'gvanrossum', 'loewis', 'nnorwitz', 'ishimoto']
pr_nums = []
priority = 'high'
resolution = 'accepted'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue1031213'
versions = ['Python 2.6', 'Python 2.5']

atsuoishimoto · 2004-09-20T13:37:30Z

When a SyntaxError occurs and the module contains a
source encoding declaration, the current implementation
prints the error line in UTF-8. This patch converts the line
back to its original encoding for printing.

This patch calls some memory-allocating APIs such as
PyUnicode_DecodeUTF8. I'm not sure whether I can (or should)
call PyErr_Clear() here if an error occurs.
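
As a rough illustration of the idea only (the patch itself is C code in the parser), the recoding step could be sketched in Python as follows. The function name `recode_error_line` and the Shift_JIS sample line are assumptions made up for this sketch, not taken from the patch.

```python
def recode_error_line(utf8_line: bytes, source_encoding: str) -> bytes:
    """Convert an error line held as UTF-8 back to the source file's encoding.

    Hypothetical helper: it mirrors the two steps described above,
    decoding from UTF-8 and re-encoding with the declared source encoding.
    """
    text = utf8_line.decode("utf-8")      # analogous to PyUnicode_DecodeUTF8
    return text.encode(source_encoding)   # analogous to PyUnicode_AsEncodedString


# Usage: a line from a file declared as "# -*- coding: shift_jis -*-".
line = 'print "日本語'
recoded = recode_error_line(line.encode("utf-8"), "shift_jis")
```

This version does no error handling at all, which is exactly the open question raised in the second paragraph above.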

nnorwitz · 2005-10-02T05:45:26Z

Logged In: YES
user_id=33168

I'm hoping that someone more familiar with unicode could
take a look at this. The patch looks ok to me, but I
don't know how to test that it works. I'm inclined to accept
it, unless I hear otherwise.

malemburg · 2005-10-02T18:08:59Z

Logged In: YES
user_id=38388

Please use the "replace" error handler when recoding the
source line to Unicode - this will reduce the probability of the
conversion failing.

If you do get an error, it's likely going to be an unknown
encoding or, less likely, a memory problem. Please add some logic
to deal with these errors as well - currently you don't call
PyErr_Clear() or take some other action, which may lead to
confusing error reports (e.g. an error popping up randomly during
program execution due to the set error).
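
A minimal Python sketch of what this suggestion amounts to, assuming a hypothetical helper named `recode_error_line_safely` (the real change would be made in C): use the "replace" handler in both directions, and fall back to the unmodified UTF-8 line if the encoding is unknown or memory runs out, so that no stray error is left behind.

```python
def recode_error_line_safely(utf8_line: bytes, source_encoding: str) -> bytes:
    """Recode an error line, never letting a codec problem escape.

    Hypothetical helper for illustration; the fallback policy (return the
    original UTF-8 bytes) is an assumption, not something the patch specifies.
    """
    try:
        # "replace" substitutes U+FFFD for undecodable bytes ...
        text = utf8_line.decode("utf-8", "replace")
        # ... and a replacement character such as "?" for unencodable ones.
        return text.encode(source_encoding, "replace")
    except (LookupError, MemoryError):
        # LookupError: unknown encoding name; MemoryError: allocation failure.
        # Swallowing the error here plays the role of PyErr_Clear().
        return utf8_line
```

With this shape, an unknown encoding name simply yields the original bytes, and malformed input is displayed with replacement characters instead of aborting the error report.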

atsuoishimoto · 2005-10-13T06:38:48Z

Logged In: YES
user_id=463672

Thanks for your comments. I'll post a revised patch and test
case later.

atsuoishimoto · 2006-03-18T07:06:07Z

Logged In: YES
user_id=463672

Sorry for my laziness. I have revised the patch for the current trunk:

- Uses "replace" when recoding the source line
- Reports errors with PyErr_Print()
- Adds a test case

nnorwitz · 2006-07-30T17:04:01Z

Logged In: YES
user_id=33168

Note to self (or anyone interested): remember to look into this.

gvanrossum · 2007-07-16T20:35:38Z

I think Martin von Loewis knows more about this.

loewis · 2007-09-04T14:23:45Z

Thanks for the patch. It wouldn't work as-is, because it broke PGEN. I
fixed that, and committed the change as r57961 and r57962.

gvanrossum · 2007-09-04T14:46:11Z

We should make sure this is *not* merged into Py3k; there, things remain
unicode until they're printed, at which point the only encoding that
matters is the output file's encoding.

loewis · 2007-10-10T18:58:06Z

ishimoto: in dec_utf8, there is a PyErr_Print call. What is the purpose
of this call?

atsuoishimoto · 2007-10-11T01:54:49Z

PyErr_Print() is called to report an exception raised by the codec.
If PyUnicode_DecodeUTF8() or PyUnicode_AsEncodedString() returns NULL,
PyErr_Print() is called.

gvanrossum · 2007-10-11T02:20:07Z

> PyErr_Print() is called to report exception raised by codec.
> If PyUnicode_DecodeUTF8() or PyUnicode_AsEncodedString() return NULL,
> PyErr_Print() is called.

This comment is not very helpful; it describes what happens, but not
why, or whether that is a good idea. I believe that if this call is
ever reached, two tracebacks will be printed, confusing the user.

atsuoishimoto · 2007-10-11T06:04:21Z

Sorry for the insufficient comment.

When a codec raises an exception, I think the exception should be
reported. Otherwise, the user cannot know why Python prints a broken
line of code.

Should we silently clear the exception raised by the codec, or print a
message such as "Codec raised an exception while processing compile
error."?

loewis · 2007-10-11T06:10:00Z

> Should we silently clear the exception raised by codecs, or print a
> message such as "Codec raised an exception while processing compile
> error." ?

Can you create a test case that triggers that specific problem?

Regards,
Martin

atsuoishimoto · 2007-10-11T07:44:29Z

Codecs would hardly ever raise an exception here.
Usually, an exception raised here would be a MemoryError. The unicode
string we are trying to encode was just decoded by the same codec. If the
codec raises an exception other than MemoryError, the codec itself likely
has a problem.

I attached a script that prints the exception raised by a codec. I wrote a
"buggy" encoder, which never returns a string but always raises an exception.
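
For readers without the attachment, a deliberately broken codec of this kind could look roughly like the sketch below. This is not the attached script; the codec name `buggycodec`, the helper names, and the choice of RuntimeError are all assumptions made for illustration.

```python
import codecs


def _buggy_encode(text, errors="strict"):
    # Hypothetical broken encoder: never returns bytes, always fails.
    raise RuntimeError("buggy encoder always fails")


def _working_decode(data, errors="strict"):
    # Decoding works normally, so source lines can still be read.
    return bytes(data).decode("utf-8", errors), len(data)


def _search(name):
    # Codec search function: only answers for the made-up name "buggycodec".
    if name == "buggycodec":
        return codecs.CodecInfo(
            name="buggycodec", encode=_buggy_encode, decode=_working_decode
        )
    return None


codecs.register(_search)


def encode_fails(text: str) -> bool:
    """Return True if encoding with the buggy codec raises RuntimeError."""
    try:
        text.encode("buggycodec")
    except RuntimeError:
        return True
    return False
```

Decoding with such a codec succeeds while re-encoding fails, which is the only realistic way (other than MemoryError) to reach the error path under discussion.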

gvanrossum · 2007-10-11T16:53:28Z

There are tons of situations where such an exception will be
suppressed, for better or for worse. I don't think this one deserves
such a radical approach.

atsuoishimoto · 2007-10-15T08:13:01Z

That's fine with me. Please replace PyErr_Print() with PyErr_Clear().

gvanrossum · 2007-10-15T15:54:33Z

> That's fine with me. Please replace PyErr_Print() with PyErr_Clear().

Done.

Committed revision 58471.

atsuoishimoto · 2007-11-15T05:08:42Z

In release25-maint, PyErr_Print() should also be replaced with
PyErr_Clear().

gvanrossum · 2007-11-15T20:40:03Z

> In release25-maint, PyErr_Print() should be replaced with
> PyErr_Clear() also.

Committed revision 58991.

atsuoishimoto mannequin assigned loewis Sep 20, 2004

atsuoishimoto mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Sep 20, 2004

birkenfeld changed the title ~~Patch for bug #780725~~ Use correct encoding for printing SyntaxErrors Aug 23, 2007

loewis mannequin closed this as completed Sep 4, 2007

ezio-melotti transferred this issue from another repository Apr 9, 2022
