
Author vstinner
Recipients vstinner
Date 2010-10-15.12:34:46
Message-id <1287146091.4.0.492257332187.issue10114@psf.upfronthosting.co.za>
Content
Example:

$ ./python
Python 3.2a3+ (py3k, Oct 15 2010, 14:31:59) 
>>> compile('', 'abc\uDC80', 'exec')
...
UnicodeEncodeError: 'utf-8' codec can't encode character '\udc80' in position 3: surrogates not allowed

The attached patch manually encodes the filename to UTF-8 with the surrogateescape error handler.
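For illustration only, here is a minimal Python-level sketch of the difference between strict UTF-8 encoding and encoding with surrogateescape. It is not the patch itself, which is not reproduced in this message; the variable names are hypothetical, and the filename is the one from the compile() example above.

```python
# Filename with a lone surrogate, as in the compile() example above.
filename = 'abc\udc80'

# Strict UTF-8 encoding refuses lone surrogates; this is the reported error.
try:
    filename.encode('utf-8')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# surrogateescape maps U+DC80..U+DCFF back to the raw bytes 0x80..0xFF.
encoded = filename.encode('utf-8', 'surrogateescape')

# Decoding with the same handler restores the original string.
roundtrip = encoded.decode('utf-8', 'surrogateescape')

print(strict_failed, encoded, roundtrip == filename)
```

With this handler the escaped byte survives a round trip, so a filename that came from undecodable bytes can be turned back into those bytes.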

I found this problem while testing Python with an ASCII locale encoding (LANG=C ./python Lib/test/regrtest.py). Example:

  $ LANG=C ./python -m base64 -e setup.py 
  ...
  UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' ...
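As a hedged sketch of how a character such as '\udcc3' can appear under an ASCII locale: a non-ASCII byte decoded with the 'ascii' codec and surrogateescape becomes a lone surrogate, which strict UTF-8 encoding then rejects. The path below is hypothetical, and the exact source of the byte in the base64 run above is not stated in this report.

```python
# Hypothetical path containing 'é', stored on disk as UTF-8 bytes.
raw = 'é.py'.encode('utf-8')  # b'\xc3\xa9.py'

# With an ASCII locale encoding, each undecodable byte 0xNN is mapped
# to the lone surrogate U+DCNN by the surrogateescape handler.
decoded = raw.decode('ascii', 'surrogateescape')

# Strict UTF-8 encoding of the result fails on the first surrogate.
try:
    decoded.encode('utf-8')
    error_char = None
except UnicodeEncodeError as exc:
    error_char = exc.object[exc.start]

# Encoding with the same codec and handler recovers the original bytes.
restored = decoded.encode('ascii', 'surrogateescape')

print(ascii(decoded), ascii(error_char), restored == raw)
```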
History
Date                 User      Action  Args
2010-10-15 12:34:51  vstinner  set     recipients: + vstinner
2010-10-15 12:34:51  vstinner  set     messageid: <1287146091.4.0.492257332187.issue10114@psf.upfronthosting.co.za>
2010-10-15 12:34:50  vstinner  link    issue10114 messages
2010-10-15 12:34:49  vstinner  create