classification
Title: Python 3's 2to3 does not handle non-ascii source files
Type: crash
Stage:
Components: 2to3 (2.x to 3.x conversion tool)
Versions: Python 3.1
process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To:
Nosy List: benjamin.peterson, zzzeek
Priority: normal
Keywords:

Created on 2010-02-13 01:56 by zzzeek, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (6)
msg99298 - Author: mike bayer (zzzeek) Date: 2010-02-13 01:56
Given the following Python 2 source file:

    # -*- encoding: utf-8

    print 'bien mangé'

It can be converted to Python 3 using Python 2's 2to3 tool:

    classic$ 2to3 test.py
     ... omitted ...
    --- test.py (original)
    +++ test.py (refactored)
    @@ -1,3 +1,3 @@
     # -*- encoding: utf-8
 
    -print 'bien mangé'
    +print('bien mangé')

However, the 2to3 that ships with Python 3.1.1 fails:

    classic$ 2to3-3.1 test.py
       ... omitted ...
    --- test.py (original)
    +++ test.py (refactored)
    @@ -1,3 +1,3 @@
     # -*- encoding: utf-8
 
    Traceback (most recent call last):
      File "/usr/local/bin/2to3-3.1", line 6, in <module>
        sys.exit(main("lib2to3.fixes"))
      File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/lib2to3/main.py", line 159, in main
        options.processes)
      File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/lib2to3/refactor.py", line 616, in refactor
        items, write, doctests_only)
      File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/lib2to3/refactor.py", line 276, in refactor
        self.refactor_file(dir_or_file, write, doctests_only)
      File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/lib2to3/refactor.py", line 656, in refactor_file
        *args, **kwargs)
      File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/lib2to3/refactor.py", line 332, in refactor_file
        write=write, encoding=encoding)
      File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/lib2to3/refactor.py", line 432, in processed_file
        self.print_output(old_text, new_text, filename, equal)
      File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/lib2to3/main.py", line 64, in print_output
        print(line)
    UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 17: ordinal not in range(128)
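
The position reported in the traceback can be checked without running 2to3 at all. The snippet below is only a sketch: it assumes the first diff line that fails is the removed `print` line, and it encodes that line to ASCII by hand rather than going through lib2to3. The variable names are made up for illustration.

```python
# Assumption: this is the diff line 2to3 was trying to print when it crashed.
line = "-print 'bien mangé'"

try:
    line.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc

if error is not None:
    # ascii() keeps this report printable even on an ASCII-only terminal.
    print(error.start, ascii(error.object[error.start]))
    print(str(error))
```

The added line `+print('bien mangé')` happens to put the accented character at the same offset, so either diff line would produce the same position in the message.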
msg99299 - Author: Benjamin Peterson (benjamin.peterson) (Python committer) Date: 2010-02-13 02:01
Please try 2to3 from 2.7a3 or the 2to3 trunk.
msg99300 - Author: Benjamin Peterson (benjamin.peterson) (Python committer) Date: 2010-02-13 02:02
Sorry, I meant from the py3k branch.
msg99301 - Author: mike bayer (zzzeek) Date: 2010-02-13 02:14
Yes, it's handled:

WARNING: couldn't encode test.py's diff for your terminal

Is that fix specific to 2to3, or is that just how "print" works in 3.2?
msg99302 - Author: Benjamin Peterson (benjamin.peterson) (Python committer) Date: 2010-02-13 02:19
It's just that whatever encoding Python guesses for your terminal (try
print(sys.stdin.encoding)) might not be able to handle the characters
in the source file, so we just don't print the diff. You can change the
default encoding with the PYTHONIOENCODING environment variable.
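
The behaviour described above can be imitated with in-memory streams. This is a rough sketch, not the code in lib2to3: `print_diff` is a hypothetical helper, and the two `io.TextIOWrapper` objects stand in for a terminal that Python guessed to be ASCII and one forced to UTF-8 (roughly what setting PYTHONIOENCODING=utf-8 does for sys.stdout).

```python
import io

def print_diff(lines, stream, filename):
    """Write diff lines to stream; return a warning string instead of crashing.

    Hypothetical helper for illustration only, not part of lib2to3.
    """
    try:
        for line in lines:
            print(line, file=stream)
    except UnicodeEncodeError:
        return "WARNING: couldn't encode %s's diff for your terminal" % filename
    return None

diff = ["-print 'bien mangé'", "+print('bien mangé')"]

# Stand-ins for sys.stdout with two different guessed encodings.
ascii_term = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
utf8_term = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")

ascii_warning = print_diff(diff, ascii_term, "test.py")
utf8_warning = print_diff(diff, utf8_term, "test.py")
```

Since print() writes to sys.stdout, sys.stdout.encoding is the value that decides which of the two cases you hit on a real terminal.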
msg99303 - Author: Benjamin Peterson (benjamin.peterson) (Python committer) Date: 2010-02-13 02:20
You can just use the --no-diffs option, btw, to avoid this problem on earlier versions.
History
Date                 User               Action  Args
2022-04-11 14:56:57  admin              set     github: 52170
2010-02-13 02:20:37  benjamin.peterson  set     messages: + msg99303
2010-02-13 02:19:53  benjamin.peterson  set     status: open -> closed; resolution: out of date
2010-02-13 02:19:08  benjamin.peterson  set     messages: + msg99302
2010-02-13 02:14:14  zzzeek             set     messages: + msg99301
2010-02-13 02:02:01  benjamin.peterson  set     messages: + msg99300
2010-02-13 02:01:29  benjamin.peterson  set     nosy: + benjamin.peterson; messages: + msg99299
2010-02-13 01:56:54  zzzeek             create