This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: import + coding = failure (3.1.2/win32)
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.1
process
Status: closed Resolution: duplicate
Dependencies: Superseder: Full unicode import system
View: 3080
Assigned To: Nosy List: amaury.forgeotdarc, eric.araujo, gonegown, iritkatriel, vstinner
Priority: normal Keywords:

Created on 2010-06-13 10:50 by gonegown, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
pybug-import-coding.zip gonegown, 2010-06-13 10:50 bug-generating sources 'a.py' & 'b.py'
Messages (21)
msg107731 - (view) Author: gonegown (gonegown) Date: 2010-06-13 10:50
I have python 3.1.2 fetched from the main site.

imagine two source files:

a.py:
-------
# coding: cp1251
import b;
print('A');
-------

b.py:
-------
print('B');
-------

Both reside in the same directory containing at least one non-ascii character (try 0xdb) in the _path_.

import will fail with an empty error!
#coding here works fine with utf-8 and fails using any other one

now tell me how the hell can file system encoding be related to file content encoding?!

I've attached the source
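
For anyone re-running this without the attachment, the layout described above can be rebuilt with a short script. This is only a sketch: the helper names `build_case` and `run_case` are made up for illustration, the directory name assumes that "0xdb" is meant as a cp1251 byte (the Cyrillic letter 'Ы'), and the file contents are copied from the report rather than from the attached zip.

```python
import os
import subprocess
import sys
import tempfile

# Assumption: "0xdb" is read as a cp1251 byte, i.e. the Cyrillic letter U+042B.
NON_ASCII_DIR = b"\xdb".decode("cp1251")

# File contents as quoted in the report.
A_SOURCE = "# coding: cp1251\nimport b;\nprint('A');\n"
B_SOURCE = "print('B');\n"


def build_case(root):
    """Create <root>/<non-ASCII dir>/a.py and b.py and return that directory."""
    case_dir = os.path.join(root, NON_ASCII_DIR)
    os.makedirs(case_dir, exist_ok=True)
    # a.py is saved in the encoding named by its coding line.
    with open(os.path.join(case_dir, "a.py"), "w", encoding="cp1251") as f:
        f.write(A_SOURCE)
    with open(os.path.join(case_dir, "b.py"), "w", encoding="ascii") as f:
        f.write(B_SOURCE)
    return case_dir


def run_case(case_dir):
    """Run a.py with the current interpreter from inside its own directory."""
    return subprocess.run(
        [sys.executable, "a.py"],
        cwd=case_dir,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        result = run_case(build_case(root))
        print("exit code:", result.returncode)
        print(result.stdout.decode("ascii", "replace"))
        print(result.stderr.decode("ascii", "replace"))
```

Changing the coding line in `A_SOURCE` (and the matching `encoding=` argument) is the quickest way to compare cp1251 against utf-8 on a given interpreter.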
msg107783 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-06-14 11:50
This bug is maybe related to #8611?

Can you try with py3k (python 3.2)?
msg108184 - (view) Author: gonegown (gonegown) Date: 2010-06-19 14:47
Is there py3k for win32?
And how do I know if #8611 comes from the same source?
Have no idea how they have organized the python core. I'm new to python (about 2 months) and I don't think I will use it for long. It's just not serious.
msg108371 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-06-22 12:43
> now tell me how the hell can file system encoding be related 
> to file content encoding?!

Why do you say so? I can reproduce your issue, but changing the first line of a.py:
# coding: cp1252
to:
# coding: utf-8
did not change anything.

In the meantime, you should refrain from creating directories with characters not representable in the terminal window.

@haypo: The problem still exists with py3k at r82150.
msg108693 - (view) Author: gonegown (gonegown) Date: 2010-06-26 08:24
@Amaury:
What you're saying about directory naming is right indeed.
But the case has begun from cyrillic letters in the NTFS path, which I do not use, but the users of my soft do. So putting the program into such a directory makes the former unusable; until the sources are in utf anyway.

I just ran this on another computer and it seemed to work with #coding in a.py. I then added this line to b.py and it failed. I played about 15 minutes inserting the line and removing and changing the directory name.
And I can tell the behaviour for me looks just random!
Though I noticed that adding the #coding line to both sources fails more often.

You'll see:
--------
Traceback (most recent call last):
  File "F:\1home\С\u201e\a.py", line 1, in <module>
SyntaxError: None
--------

And what the hell is this u201e? That should have been a letter!
msg108720 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-06-26 12:56
> File "F:\1home\С\u201e\a.py", line 1, in <module>
> And what the hell is this u201e? That should have been a letter!

It's probably this symbol: http://www.eki.ee/letter/chardata.cgi?ucode=201e
but it has no representation in the console windows you are using; try "import sys; print(sys.stderr.encoding)" to print the code page used by your console.
In error messages, Python replaces unprintable characters with their "escaped" form: \uXXXX where XXXX is the character number.
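
The escaped form can be reproduced with the `backslashreplace` error handler, which is what `sys.stderr` uses for characters its encoding cannot represent. The second half of the sketch below is only a guess that nobody in this thread has confirmed: the pair `С\u201e` in the traceback is exactly what the UTF-8 bytes of a single Cyrillic letter (here 'ф', chosen because its bytes match) look like when decoded as cp1251, which would fit "that should have been a letter".

```python
# U+201E (DOUBLE LOW-9 QUOTATION MARK) cannot be encoded in ASCII, so the
# "backslashreplace" error handler substitutes its escaped form.
escaped = "\u201e".encode("ascii", "backslashreplace")
print(escaped)  # b'\\u201e'

# Speculative illustration, not confirmed by the thread: encode one Cyrillic
# letter as UTF-8 (bytes D1 84), then decode those bytes as cp1251.
original = "\u0444"  # Cyrillic small letter ef
garbled = original.encode("utf-8").decode("cp1251")
print(ascii(garbled))  # '\u0421\u201e'
```

If that reading is right, the path bytes were decoded with the source file's coding instead of the file system encoding, which would also explain why utf-8 sources appear to work.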
msg108721 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-06-26 13:00
> But the case has begun from cyrillic letters in the NTFS path, 
> which I do not use, but the users of my soft do. 
> So putting the program into such directory makes the former unusable;
> until the sources are in utf anyway.

I agree that cyrillic letters in the path make the program unusable. This is the bug to fix, and the zip file you attached is a good test case.

But I still don't see how "the sources are in utf" can influence this. Please show me!
msg108801 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-06-27 18:07
Did you read my first comment? "This bug is maybe related to #8611"
msg109167 - (view) Author: gonegown (gonegown) Date: 2010-07-03 09:07
@Amaury:
Removing the #coding lines or replacing them with #coding: utf-8 makes this test case work, at least on the 4 computers I have been able to test this on.

My initial program consisted of roughly ten files and utf-8 made it work.

@haypo:
"maybe"...
msg109168 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-07-03 09:19
Here is what I did, on a machine running Windows XP, with python 3.1.1:
- I used 7-zip to extract the attached zip file, in the c:\temp directory.
- Then I opened a command prompt, here is an exact copy of the session:
C:>cd \temp\█

C:\temp\█>dir
 Le volume dans le lecteur C s'appelle Disque dur
 Le numéro de série du volume est D4BA-260C

 Répertoire de C:\temp\█

03/07/2010  11:10    <REP>          .
03/07/2010  11:10    <REP>          ..
08/06/2010  09:13                44 a.py
08/06/2010  14:21                11 b.py
               2 fichier(s)               55 octets
               2 Rép(s)  58 733 801 472 octets libres

C:\temp\█>c:\Python31\python.exe a.py
Traceback (most recent call last):
  File "a.py", line 3, in <module>
    import b;
ImportError: No module named b

C:\temp\█>notepad a.py
[Replaced encoding with "utf-8", then save and quit]

C:\temp\█>c:\Python31\python.exe a.py
Traceback (most recent call last):
  File "a.py", line 2, in <module>
    import b;
ImportError: No module named b
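
The block character in the directory name above is consistent with the byte 0xdb being interpreted in an OEM code page rather than in cp1251. A small sketch of that difference follows; which code page the archive tool or the console actually used is an assumption, not something stated in this thread.

```python
# Decode the single byte 0xdb under three Windows code pages.
raw = b"\xdb"
decoded = {codec: raw.decode(codec) for codec in ("cp1251", "cp437", "cp850")}

for codec, char in decoded.items():
    # ascii() keeps the output printable on any console.
    print(codec, ascii(char))
```

Under cp1251 the byte is a Cyrillic letter, while under cp437 and cp850 it is the full block, so the same archive can produce differently named directories depending on the tool and locale used to extract it.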
msg109967 - (view) Author: gonegown (gonegown) Date: 2010-07-11 08:35
@Amaury:
Just fine!
It's either another bug in python or 3.1.1 specifics.
msg110047 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-07-11 21:10
> Just fine!
> It's either another bug in python or 3.1.1 specifics.

What do you mean? What is 'it'? The error in the session above shows the bug we described first (strange letters in the path make the program unusable), and shows that the file's encoding doesn't change this.
msg110420 - (view) Author: gonegown (gonegown) Date: 2010-07-16 09:00
@Amaury:
error message for my bug was:
SyntaxError: None
and for your:
ImportError: No module named b

We've got at least two bugs in one testcase
msg110421 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-07-16 09:07
Then please tell us how to reproduce the "SyntaxError" case
msg112029 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-07-30 00:21
I wrote a patch on import machinery to support unicode characters: see #9425.
msg112034 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-07-30 00:49
I tested pybug-import-coding.zip on Windows with my import_unicode branch (see #9425): it works correctly, whereas it fails with py3k (Python 3.2).
msg119105 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-10-19 02:14
I tested the example with Python 3.2 (r85691) and the issue looks to be fixed. Can someone else confirm that?

I decompressed the ZIP archive, moved into the directory containing a.py and b.py, and called \path\to\python.exe a.py: it works.
msg119106 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-10-19 02:14
See also #3080 for the full unicode support in the import machinery.
msg227781 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014-09-28 23:15
Works for me using 3.4.1 and 3.5.0a0.
msg380461 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2020-11-06 17:53
I think this can be closed as out of date.
msg380462 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-06 18:01
> See also #3080 for the full unicode support in the import machinery.

I'm confident that I fixed this issue in bpo-3080. I mark this one as a duplicate.
History
Date User Action Args
2022-04-11 14:57:02 admin set github: 53234
2020-11-06 18:01:17 vstinner set status: pending -> closed
    superseder: Full unicode import system
    messages: + msg380462
    resolution: duplicate
    stage: test needed -> resolved
2020-11-06 17:53:10 iritkatriel set status: open -> pending
    nosy: + iritkatriel
    messages: + msg380461
2019-04-26 20:37:26 BreamoreBoy set nosy: - BreamoreBoy
2014-09-28 23:15:50 BreamoreBoy set nosy: + BreamoreBoy
    messages: + msg227781
2010-10-19 04:49:58 eric.araujo set nosy: + eric.araujo
2010-10-19 02:14:59 vstinner set messages: + msg119106
2010-10-19 02:14:18 vstinner set messages: + msg119105
2010-07-30 00:49:57 vstinner set messages: + msg112034
2010-07-30 00:21:48 vstinner set messages: + msg112029
2010-07-16 09:07:07 amaury.forgeotdarc set messages: + msg110421
2010-07-16 09:00:56 gonegown set messages: + msg110420
2010-07-11 21:10:28 amaury.forgeotdarc set messages: + msg110047
2010-07-11 08:35:51 gonegown set messages: + msg109967
2010-07-03 09:19:04 amaury.forgeotdarc set messages: + msg109168
2010-07-03 09:07:31 gonegown set messages: + msg109167
2010-06-27 18:07:11 vstinner set messages: + msg108801
2010-06-26 13:00:18 amaury.forgeotdarc set messages: + msg108721
2010-06-26 12:56:22 amaury.forgeotdarc set messages: + msg108720
2010-06-26 08:24:37 gonegown set messages: + msg108693
2010-06-22 12:43:51 amaury.forgeotdarc set nosy: + amaury.forgeotdarc
    messages: + msg108371
2010-06-19 14:47:56 gonegown set messages: + msg108184
2010-06-14 11:50:13 vstinner set messages: + msg107783
2010-06-14 05:13:18 r.david.murray set type: crash -> behavior
    components: + Interpreter Core, - None
    stage: test needed
2010-06-14 05:12:43 r.david.murray set nosy: + vstinner
2010-06-13 10:50:33 gonegown create