Author amaury.forgeotdarc
Recipients Drekin, amaury.forgeotdarc, ezio.melotti
Date 2013-04-02.20:27:31
Message-id <1364934452.28.0.0488655949436.issue17588@psf.upfronthosting.co.za>
Content
The issue is actually with compile():
  compile('x=1', '\u222b.py', 'exec')
fails on my Western Windows machine (mbcs = cp1252).
Encoding the filename to the filesystem encoding should not be necessary here, since the filename is only used for error messages (and is decoded again!)
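A minimal sketch of the encoding step that fails, assuming cp1252 as a stand-in for 'mbcs' on a Western Windows machine (the 'mbcs' codec itself only exists on Windows). It does not call compile(), whose behaviour depends on the Python version; it only shows that the filename cannot be represented in that code page:

```python
# U+222B (INTEGRAL) has no mapping in cp1252, so a strict encode fails.
filename = '\u222b.py'

try:
    filename.encode('cp1252')  # assumption: cp1252 stands in for 'mbcs'
    encodable = True
except UnicodeEncodeError as exc:
    encodable = False
    print('cannot encode filename:', exc.reason)
```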

But unfortunately the various API functions used by compile() are documented to take a filename encoded with the filesystem encoding:
http://docs.python.org/dev/c-api/veryhigh.html#Py_CompileStringExFlags
This API is unfortunate; on Windows, Python should never have to convert filenames unless byte strings are explicitly used.

I can see two ways to fix the issue:
- build another set of APIs which take unicode strings for the filename, or at least filenames encoded as UTF-8.
- use some trick for unencodable filenames; filename.encode('mbcs', 'backslashreplace') works, but does not round-trip (and cannot fetch source code in tracebacks). I don't know if there is some variant of surrogateescape that we could use.
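To illustrate the second option, here is a small sketch of why 'backslashreplace' does not round-trip, again assuming cp1252 in place of 'mbcs' so that it runs on any platform. It also checks the stock 'surrogateescape' handler, which only covers lone surrogates when encoding and so should not help with an ordinary character like U+222B:

```python
filename = '\u222b.py'

# backslashreplace turns the character into the six ASCII bytes \u222b.
encoded = filename.encode('cp1252', 'backslashreplace')
decoded = encoded.decode('cp1252')

# Decoding yields the literal escape text, not the original character,
# so the decoded name no longer matches the file on disk.
round_trips = (decoded == filename)

# The standard surrogateescape handler rejects non-surrogate characters.
try:
    filename.encode('cp1252', 'surrogateescape')
    surrogateescape_works = True
except UnicodeEncodeError:
    surrogateescape_works = False
```

Because the decoded name differs from the real one, a traceback built from it could not locate the source file, which matches the limitation noted above.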