classification
Title: strftime fails in non UTF-8 locale
Type: behavior
Stage: test needed
Components: Library (Lib)
Versions: Python 3.0, Python 3.1
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: time.strftime() always decodes result with UTF-8 (issue 3061)
Assigned To:
Nosy List: barry-scott, ezio.melotti, loewis, pitrou
Priority: high
Keywords:

Created on 2009-05-02 13:19 by barry-scott, last changed 2009-05-29 16:37 by loewis. This issue is now closed.

Messages (8)
msg86944 - (view) Author: Barry Alan Scott (barry-scott) * Date: 2009-05-02 13:19
On Mac OS X 10.5

$ LC_ALL=ru_RU.koi8-r python3.0 -c 'import time;print( time.strftime("%A"))'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-1: invalid data
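
For readers without a ru_RU.koi8-r locale installed, the failure can be imitated in pure Python. This is only a sketch: it assumes the C library handed back the KOI8-R bytes of a day name such as "суббота" (the example word is an assumption, not taken from the report) and that those bytes were then decoded as UTF-8, as the title of the superseding issue suggests.

```python
# Hypothetical stand-in for what a KOI8-R locale would produce for "%A".
# The bytes are built with str.encode() here instead of being read from libc.
weekday = "суббота"  # assumed example day name ("Saturday")
raw = weekday.encode("koi8-r")

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    print("UTF-8 decode failed at byte", exc.start)

# Decoding with the charset the bytes were produced in round-trips cleanly.
roundtrip = raw.decode("koi8-r")
print(roundtrip == weekday)
```

The point is only that KOI8-R output is not valid UTF-8, so any code path that assumes UTF-8 fails before print() is ever reached.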
msg86945 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2009-05-02 13:22
See http://bugs.python.org/issue5398
msg86947 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2009-05-02 13:35
Here the issue might be different though. Does 
$ LC_ALL=ru_RU.koi8-r python3.0 -c 'import time;time.strftime("%A")'
(without the print) work?

I don't have the ru_RU locale, but here time.strftime() returns 'str', not
'bytes', and the UTF-8 codec should be able to encode it:
>>> time.strftime("%A")
'Saturday'
>>> type(_)
<class 'str'>
msg86948 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2009-05-02 14:16
I was able to reproduce this using an Italian locale on Windows:
>>> locale.setlocale(locale.LC_TIME, 'Italian_Italy.1252')
'Italian_Italy.1252'
>>> time.strftime("%A", time.strptime("2009-05-01", "%Y-%m-%d"))
'venerd?'

That should be 'venerdì'.
I also found http://bugs.python.org/issue3061 and
http://bugs.python.org/issue836035 that seem to be related. (#5398
instead doesn't seem to be related.)

Apparently on Py3.x a unicode string ('str') is returned, whereas Py2.x
returns an encoded string:
>>> time.strftime("%A", time.strptime("2009-05-01", "%Y-%m-%d"))
'venerd\xec'
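
As a rough illustration of the Py2.x/Py3.x difference, the encoded string from Py2.x can be turned into the expected text by decoding it with the locale's code page. The sketch assumes cp1252, taken from the 'Italian_Italy.1252' locale name above; it says nothing about where the '?' in the Py3.x result comes from.

```python
# The encoded string Py2.x returned, written as a Python 3 bytes literal.
raw = b"venerd\xec"

# Assumption: the bytes are in code page 1252, as the locale name implies.
text = raw.decode("cp1252")
print(text)
print(type(text))
```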
msg86950 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-05-02 15:08
Same thing here (Linux) with a non-utf8 locale:

>>> locale.setlocale(locale.LC_TIME, "fr_FR.UTF-8")
'fr_FR.UTF-8'
>>> time.strftime("%B", time.strptime("2009-12-01", "%Y-%m-%d"))
'décembre'
>>> locale.setlocale(locale.LC_TIME, "fr_FR.ISO8859-15")
'fr_FR.ISO8859-15'
>>> time.strftime("%B", time.strptime("2009-12-01", "%Y-%m-%d"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/py3k/__svn__/Lib/_strptime.py", line 461, in _strptime_time
    return _strptime(data_string, format)[0]
  File "/home/antoine/py3k/__svn__/Lib/_strptime.py", line 307, in _strptime
    _TimeRE_cache = TimeRE()
  File "/home/antoine/py3k/__svn__/Lib/_strptime.py", line 188, in __init__
    self.locale_time = LocaleTime()
  File "/home/antoine/py3k/__svn__/Lib/_strptime.py", line 72, in __init__
    self.__calc_month()
  File "/home/antoine/py3k/__svn__/Lib/_strptime.py", line 98, in __calc_month
    a_month = [calendar.month_abbr[i].lower() for i in range(13)]
  File "/home/antoine/py3k/__svn__/Lib/_strptime.py", line 98, in <listcomp>
    a_month = [calendar.month_abbr[i].lower() for i in range(13)]
  File "/home/antoine/py3k/__svn__/Lib/calendar.py", line 60, in __getitem__
    return funcs(self.format)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1-3: invalid data
msg86951 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-05-02 15:10
Well, sorry for the message above. There is a problem but it is with
strptime() actually.

>>> time.strptime("2009-12-01", "%Y-%m-%d")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/py3k/__svn__/Lib/_strptime.py", line 461, in _strptime_time
    return _strptime(data_string, format)[0]
  File "/home/antoine/py3k/__svn__/Lib/_strptime.py", line 307, in _strptime
    _TimeRE_cache = TimeRE()
  File "/home/antoine/py3k/__svn__/Lib/_strptime.py", line 188, in __init__
    self.locale_time = LocaleTime()
  File "/home/antoine/py3k/__svn__/Lib/_strptime.py", line 72, in __init__
    self.__calc_month()
  File "/home/antoine/py3k/__svn__/Lib/_strptime.py", line 98, in __calc_month
    a_month = [calendar.month_abbr[i].lower() for i in range(13)]
  File "/home/antoine/py3k/__svn__/Lib/_strptime.py", line 98, in <listcomp>
    a_month = [calendar.month_abbr[i].lower() for i in range(13)]
  File "/home/antoine/py3k/__svn__/Lib/calendar.py", line 60, in __getitem__
    return funcs(self.format)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1-3: invalid data
msg86952 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-05-02 15:13
Well, it turns out that strftime() is buggy as well:

>>> tp = time.strptime("2009-12-01", "%Y-%m-%d")
>>> locale.setlocale(locale.LC_TIME, "fr_FR.ISO8859-15")
'fr_FR.ISO8859-15'
>>> time.strftime("%B", tp)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1-3: invalid data
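
A possible way to look at this from pure Python is to decode the raw bytes with the charset of the LC_TIME locale instead of UTF-8. The helper below is hypothetical (decode_locale_bytes is not part of the time module, and this is not necessarily how the problem was resolved under issue 3061); it just shows that the ISO8859-15 bytes for "décembre" are fine once the matching codec is used.

```python
import locale


def decode_locale_bytes(raw, encoding=None):
    """Hypothetical helper: decode C-library output with the locale charset."""
    if encoding is None:
        # getlocale() may report no encoding (e.g. in the "C" locale),
        # so fall back to the preferred encoding in that case.
        encoding = (locale.getlocale(locale.LC_TIME)[1]
                    or locale.getpreferredencoding(False))
    return raw.decode(encoding)


# Bytes an ISO8859-15 locale would produce for "%B" in December.
raw = "décembre".encode("iso8859-15")

try:
    raw.decode("utf-8")
    error_start = None
except UnicodeDecodeError as exc:
    error_start = exc.start  # offset of the first undecodable byte

month = decode_locale_bytes(raw, "iso8859-15")
print(error_start, month)
```

The encoding is passed explicitly in the example so that it does not depend on which locales are installed; leaving it out makes the helper consult the current LC_TIME setting.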
msg88515 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-05-29 16:37
This is a duplicate of issue 3061.
History
Date User Action Args
2009-05-29 16:37:20 loewis set status: open -> closed; nosy: + loewis; messages: + msg88515; superseder: time.strftime() always decodes result with UTF-8; resolution: duplicate
2009-05-02 15:13:46 pitrou set messages: + msg86952
2009-05-02 15:10:46 pitrou set messages: + msg86951
2009-05-02 15:08:29 pitrou set priority: high; stage: test needed; type: behavior; versions: + Python 3.1
2009-05-02 15:08:14 pitrou set nosy: + pitrou; messages: + msg86950
2009-05-02 14:16:07 ezio.melotti set messages: + msg86948
2009-05-02 13:35:33 ezio.melotti set messages: + msg86947
2009-05-02 13:22:42 ezio.melotti set nosy: + ezio.melotti; messages: + msg86945
2009-05-02 13:19:13 barry-scott create