classification
Title: SSL module cannot handle unicode filenames
Type: behavior
Stage:
Components: Library (Lib), Unicode
Versions: Python 2.7
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: ezio.melotti, giampaolo.rodola, hynek, janssen, loewis, pitrou, schlamar, terry.reedy
Priority: normal
Keywords:

Created on 2012-05-25 06:49 by schlamar, last changed 2012-06-05 19:08 by schlamar. This issue is now closed.

Messages (10)
msg161554 - Author: Marc Schlaich (schlamar) * Date: 2012-05-25 06:49
Here is a short example to reproduce the error:

>>> import socket, ssl
>>> sock = socket.socket()
>>> sock = ssl.wrap_socket(sock, cert_reqs=ssl.CERT_REQUIRED, ca_certs=u'ä.crt')
>>> sock.connect((None, None))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\ssl.py", line 322, in connect
    self._real_connect(addr, False)
  File "C:\Python27\lib\ssl.py", line 305, in _real_connect
    self.ca_certs, self.ciphers)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
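
The traceback suggests that the unicode filename is implicitly encoded with the ASCII codec before it reaches the underlying C code. A minimal sketch that reproduces the same kind of failure without any socket, assuming that implicit ASCII encoding is indeed what happens here (the helper name `try_ascii` is illustrative only and not part of any library):

```python
# -*- coding: utf-8 -*-

def try_ascii(filename):
    """Return the ASCII-encoded filename, or the error raised while encoding.

    This mimics (as an assumption) the implicit conversion applied to a
    unicode argument in Python 2.7 when a byte string is expected.
    """
    try:
        return filename.encode('ascii')
    except UnicodeEncodeError as exc:
        return exc


result = try_ascii(u'\xe4.crt')  # the same filename as in the report
print(type(result).__name__)
```

A pure ASCII name such as u'ca.crt' passes through unchanged, which is why the problem is easy to miss in tests.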
msg161560 - Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012-05-25 09:55
Seems to work fine in Python 3.2+.

Two possibilities:

 1. document ca_certs is str only
 2. encode with sys.getfilesystemencoding() if unicode

This would have to be fixed in ssl.get_server_certificate too, and maybe in more places; I only had a quick glance.
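
Option 2 could look roughly like the following sketch, applied on the caller's side. The helper name `encode_path` is hypothetical and not part of the ssl module, and the fallback to UTF-8 when no filesystem encoding is reported is an assumption made for illustration.

```python
# -*- coding: utf-8 -*-
import sys

try:
    text_type = unicode  # Python 2: the separate unicode type
except NameError:
    text_type = str      # Python 3: str is already unicode


def encode_path(path):
    """Encode a unicode path with the filesystem encoding.

    Byte strings are returned unchanged. Falling back to UTF-8 when
    sys.getfilesystemencoding() returns None is an assumption.
    """
    if isinstance(path, text_type):
        encoding = sys.getfilesystemencoding() or 'utf-8'
        return path.encode(encoding)
    return path


# Intended use on Python 2.7 (not executed here):
#   ssl.wrap_socket(sock, cert_reqs=ssl.CERT_REQUIRED,
#                   ca_certs=encode_path(u'\xe4.crt'))
```

Whether the encoded name actually matches the file on disk still depends on the platform and its filesystem encoding.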
msg161642 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2012-05-26 03:12
There are other parameters that take optional 'files'. Whatever change is made should be done uniformly for all.

'File' is unfortunately vague, as it could mean file object or file name or both. If file name, it could be str only or (for 2.7) str and unicode. I am not sure what the 2.7 standard is, if there is one.

Allowing unicode could be seen as an enhancement, but it depends on the original intention and/or default 2.7 interpretation of 'file'.
msg161649 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-26 07:26
Indeed, it would probably be a new feature to accept unicode there, and we don't add new features to 2.7.
As Hynek said, Python 3 is fine.
msg161660 - Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012-05-26 10:41
So are we going to add something to the docs or just close as rejected?
msg161662 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-26 11:37
Honestly, I'm not sure it is worth documenting. How to use the ca_certs argument is clear when reading the examples further in the ssl doc page, and detailing the quirks of each and every argument would make the text much less readable.
msg161663 - Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012-05-26 12:01
I'd just add some general catch-all phrase at the top that all paths are expected to be encoded strings.
msg162295 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-06-04 21:11
I'm closing it as won't fix. I don't think it needs to be documented, but I won't mind if it is.
msg162363 - Author: Marc Schlaich (schlamar) * Date: 2012-06-05 18:49
Well, the Unicode HOWTO states:

When opening a file for reading or writing, you can usually just provide the Unicode string as the filename, and it will be automatically converted to the right encoding for you

This is really unexpected behavior that could easily be missed by a test case, so I would vote for making it clear in the documentation.
msg162365 - Author: Marc Schlaich (schlamar) * Date: 2012-06-05 19:08
For example, it is broken in the well-known requests library:

>>> import requests
>>> requests.get('x', cert=u'öäü.pem')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  ...
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
History
Date                 User         Action  Args
2012-06-05 19:08:01  schlamar     set     messages: + msg162365
2012-06-05 18:49:04  schlamar     set     messages: + msg162363
2012-06-04 21:11:05  loewis       set     status: open -> closed; nosy: + loewis; messages: + msg162295; resolution: wont fix
2012-05-26 12:01:29  hynek        set     messages: + msg161663
2012-05-26 11:37:39  pitrou       set     messages: + msg161662
2012-05-26 10:41:55  hynek        set     messages: + msg161660
2012-05-26 07:26:06  pitrou       set     messages: + msg161649
2012-05-26 03:12:47  terry.reedy  set     nosy: + terry.reedy, janssen, pitrou, giampaolo.rodola; messages: + msg161642
2012-05-25 09:55:11  hynek        set     versions: - Python 2.6; nosy: + hynek; messages: + msg161560; type: crash -> behavior
2012-05-25 06:49:09  schlamar     create