Title: python -m imports non-ASCII .py file without encoding declaration
Components: Interpreter Core Versions: Python 3.3, Python 3.4
Assigned To: Nosy List: benjamin.peterson, eryksun, iritkatriel, jwilk, loewis, ncoghlan, r.david.murray, serhiy.storchaka, vstinner
Created on 2013-12-10 11:04 by jwilk, last changed 2022-04-11 14:57 by admin. This issue is now closed.

msg205787 - (view) Author: Jakub Wilk (jwilk) Date: 2013-12-10 11:04
If you have a non-ASCII .py file without an encoding declaration, you normally can't import it:

$ python --version
Python 2.7.6

$ python -c 'import test'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "", line 1
SyntaxError: Non-ASCII character '\xc2' in file on line 1, but no encoding declared; see for details

However, "python -m" happily imports such files:

$ python -m test
¡Hello world!
msg205820 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-12-10 14:56
Well, we aren't going to change 2.7 to have this case start throwing an error, since someone may be depending on it.  So I'm not sure there's anything to do here.
msg205824 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-10 15:16
There are several bugs in the processing of encoding declarations (issue18961, issue18873), and several others were fixed recently. Perhaps this issue or issue19942 is related to them.
msg205865 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-12-10 22:49
Quite plausibly a bug in the pkgutil import system emulation. Does the same inconsistency appear in 3.3 with non-UTF8 bytes?
msg205878 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013-12-11 02:33
3.x is afflicted.
msg407386 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-11-30 17:04
On 3.11 both are working (on a Mac):

cpython-1 % python -m tt  
¡Hello world!
cpython-1 % ./python.exe -c 'import tt'
¡Hello world!
msg407414 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-11-30 23:55
is a UTF-8 file, which is the default source encoding in Python 3. It fails as expected if the test script is encoded differently, such as Latin-1, unless the source encoding is declared.
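
A minimal sketch of the point about declared versus undeclared source encodings, assuming Python 3 and using the built-in compile() on raw bytes instead of a file on disk. The source bytes, the "<latin1-demo>" filename, and the try_compile helper are hypothetical names chosen for illustration; they do not come from the original report.

```python
# One-line module body. Byte 0xA1 is the inverted exclamation mark in
# Latin-1 and is not a valid UTF-8 start byte.
BODY = b"greeting = '\xa1Hello world!'\n"
# PEP 263 style declaration that tells the parser the bytes are Latin-1.
DECLARATION = b"# -*- coding: latin-1 -*-\n"


def try_compile(source: bytes) -> str:
    """Compile and run `source`; return the greeting or a short error text."""
    namespace = {}
    try:
        exec(compile(source, "<latin1-demo>", "exec"), namespace)
    except SyntaxError as exc:
        return f"SyntaxError: {exc.msg}"
    return namespace["greeting"]


undeclared = try_compile(BODY)               # no declaration: parsed as UTF-8
declared = try_compile(DECLARATION + BODY)   # declaration: parsed as Latin-1

# ascii() keeps the output printable regardless of the terminal encoding.
print(ascii(undeclared))
print(ascii(declared))
```

The exact wording of the SyntaxError message differs between Python versions, so the sketch only relies on the exception type.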
msg407419 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-12-01 00:45
I confirm: Python 3.10 works as expected.

Python 3.10 fails with the same SyntaxError using "python" or "python -m script" if the script contains non-ASCII characters but is not encoded as UTF-8.

vstinner@apu$ python3
  File "/home/vstinner/", line 1
    print('�Hello world!')
SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0xa1 in position 0: invalid start byte

vstinner@apu$ python3 -m test
Traceback (most recent call last):
  File "/home/vstinner/", line 1
    print('�Hello world!')
SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0xa1 in position 0: invalid start byte
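
The comparison above can be scripted. The following is a sketch under stated assumptions, not the procedure used in this thread: the module name latin1_demo and the run_both helper are hypothetical, the file is written to a temporary directory, and the interpreter under test is whatever sys.executable points to (Python 3.7 or later for capture_output).

```python
import subprocess
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "latin1_demo"  # hypothetical module name, not from the report
# Latin-1 encoded source with no encoding declaration.
SOURCE = b"print('\xa1Hello world!')\n"


def run_both(source: bytes) -> dict:
    """Run the module via plain import and via -m; report how each ended."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, MODULE_NAME + ".py").write_bytes(source)
        commands = {
            "import": [sys.executable, "-c", f"import {MODULE_NAME}"],
            "-m": [sys.executable, "-m", MODULE_NAME],
        }
        results = {}
        for label, argv in commands.items():
            # cwd=tmp puts the temporary module on sys.path for both commands.
            proc = subprocess.run(argv, cwd=tmp, capture_output=True)
            results[label] = (proc.returncode, b"SyntaxError" in proc.stderr)
        return results


results = run_both(SOURCE)
for label, (returncode, saw_syntax_error) in results.items():
    print(label, "exit code:", returncode, "SyntaxError:", saw_syntax_error)
```

On a Python 3 interpreter both entries are expected to report a non-zero exit code and a SyntaxError, which matches the consistent behaviour confirmed in the last messages.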
