Issue23315
Created on 2015-01-25 11:02 by akira, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Messages (8) | |||
---|---|---|---|
msg234662 - (view) | Author: Akira Li (akira) * | Date: 2015-01-25 11:02 | |
Python 2.7.9 (default, Jan 25 2015, 13:41:30)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, sys, tempfile
>>> d = u'\u20ac'.encode(sys.getfilesystemencoding()) # non-ascii
>>> if not os.path.isdir(d): os.makedirs(d)
...
>>> os.environ['TEMP'] = d
>>> tempfile.mkdtemp(prefix=u'')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../python2.7/tempfile.py", line 331, in mkdtemp
    file = _os.path.join(dir, prefix + name + suffix)
  File ".../python2.7/posixpath.py", line 80, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 13: ordinal not in range(128)

Related: https://bugs.python.org/issue1681974 |
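The traceback above comes from Python 2 implicitly decoding a byte string with the ASCII codec when it is concatenated with a unicode string. A minimal sketch of that decode step, written for Python 3, where it has to be spelled out explicitly. The helper name `ascii_decode_error` is hypothetical, and UTF-8 is assumed as the filesystem encoding:

```python
def ascii_decode_error(raw):
    """Return the UnicodeDecodeError raised by decoding raw as ASCII, or None."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError as exc:
        return exc
    return None


# Assumption: the filesystem encoding is UTF-8, so the euro sign
# is stored as the three bytes E2 82 AC.
euro_dir = "\u20ac".encode("utf-8")
err = ascii_decode_error(euro_dir)
print(err)
```

The first offending byte is the same 0xe2 that the report shows; the position differs because the reported path has a longer leading directory.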
|||
msg234664 - (view) | Author: STINNER Victor (vstinner) * | Date: 2015-01-25 12:05 | |
Why do you use a Unicode prefix? Does it work with a bytes prefix? You should use Python 3 if you want the best Unicode support. |
|||
msg257333 - (view) | Author: Richard PALO (risto3) | Date: 2016-01-02 07:42 | |
I notice similar problems, as found when running the test suite for lxml 3.5.0 on python2.7

======================================================================
ERROR: test_etree_parse_io_error (lxml.tests.test_io.ETreeIOTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/local/lib/python2.7/unittest/case.py", line 329, in run
    testMethod()
  File "/tmp/pkgsrc/textproc/py-lxml/work/lxml-3.5.0/src/lxml/tests/test_io.py", line 276, in test_etree_parse_io_error
    dn = tempfile.mkdtemp(prefix=dirnameRU)
  File "/opt/local/lib/python2.7/tempfile.py", line 339, in mkdtemp
    _os.mkdir(file, 0700)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 40-53: ordinal not in range(128)

======================================================================
ERROR: test_etree_parse_io_error (lxml.tests.test_io.ElementTreeIOTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/local/lib/python2.7/unittest/case.py", line 329, in run
    testMethod()
  File "/tmp/pkgsrc/textproc/py-lxml/work/lxml-3.5.0/src/lxml/tests/test_io.py", line 276, in test_etree_parse_io_error
    dn = tempfile.mkdtemp(prefix=dirnameRU)
  File "/opt/local/lib/python2.7/tempfile.py", line 339, in mkdtemp
    _os.mkdir(file, 0700)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 40-53: ordinal not in range(128)

The code snippet is in test_io.py, line 276:

266     def test_etree_parse_io_error(self):
267         # this is a directory name that contains characters beyond latin-1
268         dirnameEN = _str('Directory')
269         dirnameRU = _str('КÐ\260Ñ\032Ð\260Ð\273Ð\276Ð\263')
270         filename = _str('nosuchfile.xml')
271         dn = tempfile.mkdtemp(prefix=dirnameEN)
272         try:
273             self.assertRaises(IOError, self.etree.parse, os.path.join(dn, filename))
274         finally:
275             os.rmdir(dn)
276         dn = tempfile.mkdtemp(prefix=dirnameRU)
277         try:
278             self.assertRaises(IOError, self.etree.parse, os.path.join(dn, filename))
279         finally:
280             os.rmdir(dn)

Even if I change dirnameRU to a simple French 'Répertoire' I still get errors...

It is not an option to upgrade to 3.0, sorry.

BTW, I tried passing dirnameRU.encode('utf-8') but that just generates a different error:

ERROR: test_etree_parse_io_error (lxml.tests.test_io.ETreeIOTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/local/lib/python2.7/unittest/case.py", line 329, in run
    testMethod()
  File "/tmp/pkgsrc/textproc/py-lxml/work/lxml-3.5.0/src/lxml/tests/test_io.py", line 278, in test_etree_parse_io_error
    self.assertRaises(IOError, self.etree.parse, os.path.join(dn, filename))
  File "/opt/local/lib/python2.7/posixpath.py", line 73, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 40: ordinal not in range(128) |
|||
msg257334 - (view) | Author: Richard PALO (risto3) | Date: 2016-01-02 07:58 | |
If I also add .encode('utf-8') to filename on line 278, that seems to get past the pathname problem. I guess it comes down to the fact that if sys.getfilesystemencoding() is utf-8, which in my case it is (on SunOS), I believe these conversions should be automatic. |
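The workaround described above amounts to making every path component the same type before joining. A minimal sketch under stated assumptions: `to_bytes` and `join_as_bytes` are hypothetical helpers, not part of lxml or the standard library, and UTF-8 is assumed as the target encoding. `posixpath` is used so the separator is the same on every platform.

```python
import posixpath


def to_bytes(part, encoding="utf-8"):
    """Encode text components; pass bytes components through unchanged."""
    if isinstance(part, bytes):
        return part
    return part.encode(encoding)


def join_as_bytes(*parts, encoding="utf-8"):
    """Join path components after converting each one to bytes."""
    return posixpath.join(*(to_bytes(p, encoding) for p in parts))


# A bytes directory name (as returned for an encoded prefix) joined
# with a text file name, mirroring line 278 of the test.
dn = "/tmp/R\u00e9pertoire".encode("utf-8")
path = join_as_bytes(dn, "nosuchfile.xml")
print(path)
```

Converting at the call site keeps the join itself free of implicit conversions, which is where both tracebacks in this thread originate.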
|||
msg257338 - (view) | Author: Richard PALO (risto3) | Date: 2016-01-02 08:59 | |
Curiously enough, I was able to test with Python 3.5. The same errors occur, and the same workaround seems to get past them. |
|||
msg257340 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2016-01-02 09:37 | |
A similar problem in Python 3 was addressed in issue24230, but that was a new feature. As for the lxml tests, I suggest using bytes names compatible with all Windows OEM encodings (consisting of ASCII + b'\xa9\xb0\xb2\xb3\xb4\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc8\xc9\xe6\xf0\xf1\xf3\xf4\xf5\xf6\xf7') and with UTF-8. |
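For reference, a short sketch of the Python 3 behaviour mentioned above: since Python 3.5, `tempfile.mkdtemp` accepts a bytes `prefix` and then returns a bytes path. The helper name `make_bytes_tempdir` and the ASCII-only prefix are assumptions chosen for illustration, not names from the lxml test suite.

```python
import os
import tempfile


def make_bytes_tempdir(prefix=b"tmp"):
    """Create a temporary directory from a bytes prefix (Python 3.5+)."""
    return tempfile.mkdtemp(prefix=prefix)


created = make_bytes_tempdir(b"ascii-only-")
try:
    # With a bytes prefix the returned path is bytes as well.
    is_bytes = isinstance(created, bytes)
    has_prefix = os.path.basename(created).startswith(b"ascii-only-")
finally:
    os.rmdir(created)
```

Keeping every argument as bytes means no codec is involved when the name is built, so the result does not depend on the locale.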
|||
msg257342 - (view) | Author: Richard PALO (risto3) | Date: 2016-01-02 10:28 | |
This turns out to be related to the locale environment set to 'C'. A UTF-8 locale seems to get over the issue. A fellow pkgsrc colleague filed an issue with lxml already relating to that fact for the test suite (https://bugs.launchpad.net/lxml/+bug/1522052) cheers |
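A quick way to check whether a given name can be represented under the current locale is to try encoding it with the filesystem encoding. This is a minimal sketch, with `can_encode` and `report` as hypothetical helper names. The values it prints depend on the Python version, platform, and environment, so none are shown here.

```python
import locale
import sys


def can_encode(name, encoding):
    """Return True if name can be encoded with the given codec."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def report(name):
    """Summarise the encodings in effect and whether name fits them."""
    fs_encoding = sys.getfilesystemencoding()
    return {
        "filesystem_encoding": fs_encoding,
        "preferred_encoding": locale.getpreferredencoding(False),
        "encodable": can_encode(name, fs_encoding),
    }


print(report("R\u00e9pertoire"))
```

When the filesystem encoding is ASCII, `encodable` is False for this name, which corresponds to the UnicodeEncodeError raised by `_os.mkdir` earlier in the thread.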
|||
msg370480 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2020-05-31 15:11 | |
Python 2.7 is no longer supported. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:12 | admin | set | github: 67504 |
2020-05-31 15:11:08 | serhiy.storchaka | set | status: open -> closed resolution: out of date messages: + msg370480 stage: resolved |
2016-01-02 10:28:34 | risto3 | set | messages: + msg257342 |
2016-01-02 09:37:48 | serhiy.storchaka | set | nosy: + scoder, gregory.p.smith, serhiy.storchaka messages: + msg257340 |
2016-01-02 08:59:45 | risto3 | set | messages: + msg257338 |
2016-01-02 07:58:22 | risto3 | set | messages: + msg257334 |
2016-01-02 07:42:24 | risto3 | set | nosy: + risto3 messages: + msg257333 |
2015-01-25 12:05:10 | vstinner | set | messages: + msg234664 |
2015-01-25 11:02:16 | akira | create |