
Author terry.reedy
Recipients terry.reedy
Date 2012-01-11.03:46:42
Message-id <1326253604.11.0.305060780894.issue13758@psf.upfronthosting.co.za>
Content
The 3.2.2 doc for compile() says "The filename argument should give the file from which the code was read; pass some recognizable value if it wasn’t read from a file ('<string>' is commonly used)."

I am not sure what 'recognizable' is supposed to mean, but as I understand it, it would be user-specific and any string containing a fake 'filename' should be accepted and attached to the output code object as the .co_filename attribute. (At least on Windows.)

In fact, compile() has a hidden restriction: it encodes 'filename' with the local filesystem encoding. It tosses the bytes result (at least on Windows) but lets a UnicodeEncodeError terminate compilation. The effect is to add an undocumented and spurious dependency to code that has nothing to do with real files or the local machine.
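A minimal sketch for checking this on a given interpreter. The helper name try_compile and the sample names are made up for illustration. The report describes 3.2.2, so other versions and platforms may accept names that 3.2.2 rejects.

```python
def try_compile(filename):
    """Compile a trivial statement under `filename` (a made-up label).

    Returns the code object's co_filename, or None if compile() rejects
    the name with UnicodeEncodeError, as the report describes for 3.2.2.
    """
    try:
        code = compile("x = 1", filename, "exec")
    except UnicodeEncodeError:
        return None
    return code.co_filename


# None of these names refers to a real file; the last one is non-ASCII.
for name in ("<string>", "<generated by my tool>", "<\u4e2d\u6587>"):
    # ascii() keeps the output printable whatever the console encoding is.
    print(ascii(name), "->", ascii(try_compile(name)))
```

A None result for a name means compile() refused it on that interpreter, even though no file by that name is ever opened during compilation.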

In #10114, msg118845, Victor Stinner justified this with 
"co_filename attribute is used to display the traceback: Python opens the related file, read the source code line and display it."
If the filename is fake, it cannot do that. (Perhaps the doc should warn users to make sure that fake filenames do not match any possibly real filenames ;-). The traceback mechanism could ignore UnicodeEncodeErrors just as well as it now ignores IO(?)Errors when open('fakename') does not work.
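To illustrate how an unreadable filename is already tolerated, here is a small sketch. FAKE_NAME is a hypothetical name assumed not to exist on disk or on sys.path. The traceback still reports the name, while the source lookup quietly comes back empty.

```python
import linecache
import traceback

# Hypothetical fake filename; assumed not to match any real file.
FAKE_NAME = "fake_name_not_on_disk_13758.py"

# The code object carries FAKE_NAME as co_filename.
code = compile("1 / 0", FAKE_NAME, "exec")

try:
    exec(code, {})
except ZeroDivisionError:
    # Formatting the traceback tries to read FAKE_NAME and fails quietly.
    report = traceback.format_exc()

print(report)

# linecache, which the traceback module relies on, returns an empty
# string instead of raising when the file cannot be read.
print(repr(linecache.getline(FAKE_NAME, 1)))
```

The same tolerance could in principle cover a filename that cannot be encoded, which is the point made above.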

Victor continues "On Windows, co_filename is directly used because Windows accepts unicode for filenames." This is not true in that, on at least some Windows versions, compile() tries to encode the filename with the mbcs codec, which in turn uses the hidden local codepage. I believe that for most or all codepages, this will even raise errors for some valid Unicode filenames.
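The codepage dependency can be sketched without Windows. The mbcs codec exists only there, so cp1252 and cp936 are used below as stand-ins for a Western and a Simplified Chinese code page; that substitution is an assumption for illustration, as is the helper name encodable.

```python
def encodable(name, encoding):
    """Return True if `name` can be strictly encoded with `encoding`."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# A valid Unicode filename made of two Chinese characters.
CHINESE_NAME = "\u4e2d\u6587.py"

# cp1252 and cp936 stand in for whatever code page mbcs would use.
for enc in ("cp1252", "cp936", "utf-8"):
    print(enc, encodable(CHINESE_NAME, enc))
```

The same filename passes or fails depending only on the code page, which is the machine-specific dependency objected to above.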

I do not know whether the stored .co_filename attribute type for *nix is str, as on Windows, or bytes. If the latter, the doc should say so.
If compile() continues to filter fake filenames, which I oppose, the doc should also say so and say what it does.

This issue came up on python-list when someone used a Chinese filename and mbcs rejected it.
History
Date                 User         Action  Args
2012-01-11 03:46:44  terry.reedy  set     recipients: + terry.reedy
2012-01-11 03:46:44  terry.reedy  set     messageid: <1326253604.11.0.305060780894.issue13758@psf.upfronthosting.co.za>
2012-01-11 03:46:43  terry.reedy  link    issue13758 messages
2012-01-11 03:46:42  terry.reedy  create