This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: strptime(.., '%c') fails to parse output of strftime('%c', ..) in some locales
Type: behavior Stage: needs patch
Components: Extension Modules Versions: Python 3.7
process
Status: open Resolution:
Dependencies: 8915 Superseder:
Assigned To: Nosy List: belopolsky, ezio.melotti, rpetrov, vstinner
Priority: normal Keywords: patch

Created on 2010-06-09 20:24 by belopolsky, last changed 2022-04-11 14:57 by admin.

Files
File name Uploaded Description
strptime-locale-bug.c belopolsky, 2010-06-09 20:24 Working C code
strptime-locale-bug.py belopolsky, 2010-06-09 20:25 Failing python code
cfmt.py belopolsky, 2011-01-12 00:08
issue8957.py3k.1.patch eli.bendersky, 2011-01-15 07:20
Messages (18)
msg107413 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-06-09 20:24
The following code:

import locale, time
locale.setlocale(locale.LC_ALL, "fr_FR.UTF-8")
t = time.localtime()
s = time.strftime('%c', t)
time.strptime('%c', s)

Raises

ValueError: time data '%c' does not match format 'Mer  9 jui 16:14:46 2010'

in any locale where month follows day in '%c' format.  Note that attached C code works as expected on my OSX laptop.

I wonder if it would make sense to call the platform strptime where available. Has platform support for strptime improved since 2002, when _strptime.py was introduced?
msg125966 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-01-11 00:18
Adding #8915 as a dependency because deducing D_T_FMT locale setting from strftime output seems impossible:

>>> locale.nl_langinfo(locale.D_T_FMT)
'%a %b %e %H:%M:%S %Y'
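
A minimal sketch of the query above, reading the declared %c format directly from the locale. The helper name declared_c_format is hypothetical, and locale.nl_langinfo is not available on every platform (it is absent on Windows, for example), so the sketch guards for that.

```python
import locale


def declared_c_format():
    """Return the locale's declared %c format, or None if it cannot be queried."""
    # nl_langinfo (and D_T_FMT) only exist on platforms that provide them.
    if not hasattr(locale, 'nl_langinfo'):
        return None
    return locale.nl_langinfo(locale.D_T_FMT)


print(declared_c_format())
```

The value depends on the active LC_TIME locale, so the printed format will differ between systems.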
msg125968 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-01-11 00:24
Victor,

You may be interested because your native language is implicated. :-)
msg126043 - (view) Author: Roumen Petrov (rpetrov) * Date: 2011-01-11 22:40
time.strptime(s, '%c' ) ?
msg126045 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-01-11 22:45
> time.strptime(s, '%c' ) ?

Oh my.  It certainly took a long time to recognize a silly mistake!

Thanks.
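
For reference, a minimal sketch of the round trip with the arguments in the order time.strptime expects (string first, format second). It runs in whatever LC_TIME locale is active, which is the C locale unless setlocale has been called; the variable names are illustrative only.

```python
import time

t = time.localtime()
s = time.strftime('%c', t)

# Correct order: the formatted string first, then the format.
parsed = time.strptime(s, '%c')

# Swapped order, as in the original report: the literal text '%c' is
# treated as the data and the formatted date as the format.
try:
    time.strptime('%c', s)
except ValueError as exc:
    swapped_error = exc
    print('swapped arguments:', exc)
```

With the arguments swapped, the ValueError is raised in every locale, which is why the original test did not demonstrate a locale problem.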
msg126055 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-01-12 00:08
My tests were wrong but the problem does exist.  I am attaching a script that tests strptime(.., '%c') for all locales installed on my system (an unmodified  US Mac OS X 10.6.6).  

The only failing locale that I recognize is Hebrew (he_IL).  Eli, what do you think about this?

 
$ ./python.exe cfmt.py 
am_ET [ማክሰ ጃንዩ 11 18:56:18 2011] %A %B %d %H:%M:%S %Y != %a %b %e %H:%M:%S %Y
et_EE [T, 11. jaan  2011. 18:56:18] %a, %d. %B %Y. %H:%M:%S != %a, %d. %b %Y. %T
he_IL [EST 18:56:18 2011 ינו 11 ג'] %Z %H:%M:%S %Y %B %d %a != %Z %H:%M:%S %Y %b %d %a
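
The contents of cfmt.py are not reproduced in this thread. The following is only a sketch of the kind of comparison its output suggests, assuming the left-hand format is what _strptime guesses (LocaleTime().LC_date_time, a private attribute) and the right-hand one is what the locale declares through nl_langinfo. The function name compare_c_formats and the list of locale names are hypothetical, and the sketch needs a platform that provides locale.nl_langinfo.

```python
import locale
from _strptime import LocaleTime  # private module; may change between versions


def compare_c_formats(locale_name):
    """Return (guessed, declared) %c formats for the given locale name."""
    saved = locale.setlocale(locale.LC_TIME)  # remember the current setting
    try:
        # Raises locale.Error if the locale is not installed.
        locale.setlocale(locale.LC_TIME, locale_name)
        guessed = LocaleTime().LC_date_time
        declared = locale.nl_langinfo(locale.D_T_FMT)
        return guessed, declared
    finally:
        locale.setlocale(locale.LC_TIME, saved)


for name in ('C', 'fr_FR.UTF-8'):
    try:
        guessed, declared = compare_c_formats(name)
    except locale.Error:
        print(name, 'is not installed')
        continue
    marker = '==' if guessed == declared else '!='
    print(name, guessed, marker, declared)
```

A mismatch reported this way is not necessarily a parsing failure: formats such as %T and %H:%M:%S differ as strings but describe the same text.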
msg126057 - (view) Author: Roumen Petrov (rpetrov) * Date: 2011-01-12 00:26
- %T is equivalent to %H:%M:%S
- locales with %A and %B are broken on this platform as %c is "Appropriate date and time representation (%c) with abbreviations"
msg126059 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-01-12 00:34
On Tue, Jan 11, 2011 at 7:26 PM, Roumen Petrov <report@bugs.python.org> wrote:
..
> - locales with %A and %B are broken on this platform as %c is "Appropriate date and time representation (%c) with abbreviations"

According to what standard? POSIX defines it as

%c Replaced by the locale's appropriate date and time representation.

http://pubs.opengroup.org/onlinepubs/009695399/functions/strftime.html

and the manual page on my system agrees:

 %c    is replaced by national representation of time and date.
msg126084 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-12 10:54
On Linux, cfmt.py fails on fr_FR locale (the only valid locale in the list of tested locales):
---
fr_FR [mer. 12 janv. 2011 11:30:35 CET] %a %d %B %Y %H:%M:%S %Z != %a %d %b %Y %T %Z
---

The problem is the month format: locale.nl_langinfo(locale.D_T_FMT) returns '%a %d %b %Y %T %Z', but _strptime (LocaleTime().LC_date_time) uses '%a %d %B %Y %H:%M:%S %Z' => '%b' vs '%B'.

_strptime.LocaleTime.__calc_date_time() calls strftime('%c') and then parses the output to recover the complete format. But it calls strftime('%c') with a date in March, and in French, March is formatted as 'mars' by both month directives (%b *and* %B).

_strptime.LocaleTime.__calc_date_time() should detect that the month is formatted identically by %b and %B, and try other timestamps (other months).
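
The ambiguity can be shown with a simplified sketch. This is not the _strptime code: the helper guess_month_directive is hypothetical, the French month names are hard-coded rather than read from a locale, and the sample strings assume the fr_FR layout shown above.

```python
def guess_month_directive(formatted, full_name, abbr_name):
    """Guess whether a formatted date used %B or %b, trying the full name first."""
    if full_name and full_name in formatted:
        return '%B'
    if abbr_name and abbr_name in formatted:
        return '%b'
    return None


# March: the full and abbreviated names are both 'mars', so the guess
# falls on %B even though the locale's format uses %b.
march_sample = 'mer. 16 mars 2011 11:30:35 CET'
march_guess = guess_month_directive(march_sample, 'mars', 'mars')

# January: the names differ ('janvier' vs 'janv.'), so the guess is unambiguous.
january_sample = 'mer. 12 janv. 2011 11:30:35 CET'
january_guess = guess_month_directive(january_sample, 'janvier', 'janv.')

print(march_guess, january_guess)
```

Sampling a month whose two names differ is enough to remove the ambiguity in this toy version, which is the approach suggested above.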
msg126235 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-01-14 07:18
Alexander, I get the same error for the he_IL locale. I will look into this.
msg126313 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-01-15 06:06
The problem for Hebrew appears to be the same as the one Victor stated for French. March in Hebrew is also a 3-letter word which means it's equal to its abbreviation.
msg126316 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-01-15 07:20
I'm attaching a patch for Lib/_strptime.py that handles the month differently in __calc_date_time. It cycles through all months, looking for one whose full and abbreviated names differ, and matches it against the timestamp created by strftime.

This solution is a hack, but so is the whole __calc_date_time function :-) [IMHO]

All tests pass and I also tried it manually with all the problematic locales reported by Alexander - seems to work correctly.

If this looks OK to you guys I can commit and backport.
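
The patch itself is not shown in this thread. As a rough sketch of the idea described above (not the patch), one can look for a month whose full and abbreviated names differ in the active locale and format a sample date in that month. The sketch assumes that LocaleTime exposes f_month and a_month as lists indexed 1 to 12 with an empty entry at index 0, as in current CPython; distinguishing_month is a hypothetical helper.

```python
import time
from _strptime import LocaleTime  # private module; may change between versions


def distinguishing_month(locale_time):
    """Return a month number (1-12) whose full and short names differ, else None."""
    for number in range(1, 13):
        if locale_time.f_month[number] != locale_time.a_month[number]:
            return number
    return None


lt = LocaleTime()
month = distinguishing_month(lt)
if month is not None:
    # Only the month matters here; the other fields are arbitrary but valid.
    sample = time.strftime('%c', (1999, month, 17, 22, 44, 55, 2, 76, 0))
    print(month, sample)
```

If every month has identical full and abbreviated names, the function returns None and the ambiguity cannot be resolved by this approach.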
msg126339 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-01-15 16:42
On Sat, Jan 15, 2011 at 2:20 AM, Eli Bendersky <report@bugs.python.org> wrote:
..
> This solution is a hack, but so is the whole __calc_date_time function :-) [IMHO]
>

I am not sure how to proceed.   On one hand, I opened this issue to
demonstrate that the current implementation is flawed, on the other
hand, Eli has succeeded in improving the hack so that we can live with
it a bit longer.  Note that I did not have any real life application
that would misbehave because of this bug and I don't think developers
expect %c format to be parseable in the first place.

I made this issue depend on #8915 because I think strptime should
query the locale for format information directly rather than reverse
engineer what strftime does.

I don't think this fix solves all the problems.  For example, in most
locales (including plain C locale), day of the month in %c format uses
%e format, but current implementation guesses it as %d:

>>> locale.nl_langinfo(locale.D_T_FMT)
'%a %b %e %H:%M:%S %Y'
>>> LocaleTime().LC_date_time
'%a %b %d %H:%M:%S %Y'

This does not seem to be an issue because strptime with %d seems to be
able to parse space-filled as well as zero-filled numbers.  However,
there may be platforms that are less forgiving.
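
A small sketch of the forgiving behavior mentioned above, assuming the default C locale so that the English day and month names match. It parses a space-filled and a zero-filled day with the same %d-based format.

```python
import time

FMT = '%a %b %d %H:%M:%S %Y'

# Day of month written the way %e would produce it (space-filled).
space_filled = time.strptime('Sun Jan  9 16:14:46 2011', FMT)

# Day of month written the way %d would produce it (zero-filled).
zero_filled = time.strptime('Sun Jan 09 16:14:46 2011', FMT)

print(space_filled.tm_mday, zero_filled.tm_mday)
```

This only demonstrates Python's own _strptime implementation; it says nothing about how a platform's C strptime treats the same input.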

On the patch itself:

1. Unit tests are needed.

2. Please don't use datetime as a local variable.

3. I am not sure what the purpose of .lower() is.  Are a_month and
f_month lowercased?

4. Please keep lines under 79 characters long.

5. "for m in range(1, 13)" loop is better written as "for am, fm in
zip(self.a_month, self.f_month)"

Eli, what do you think yourself:  should we try to perfect the hack or
is it better to reimplement strptime using locale?  Note that the
latter may be a stepping stone to implementing strftime as well.
msg126340 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-01-15 16:47
6. datetime.find(self.f_month[m]) >= 0 -> self.f_month[m] in datetime

Python is not C!
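
Review comments 5 and 6 can be illustrated with a small sketch. The month tables and sample string are hard-coded French examples rather than values read from a locale, both function names are hypothetical, and the patch under review is not reproduced here.

```python
# Index 0 is an empty placeholder so that month numbers can index the lists.
a_month = ['', 'janv.', 'févr.', 'mars']
f_month = ['', 'janvier', 'février', 'mars']
text = 'mer. 12 janv. 2011 11:30:35 CET'


def differing_months_indexed(text, a_month, f_month):
    """Index-based loop with str.find, in the style the review objects to."""
    found = []
    for m in range(1, len(f_month)):
        if a_month[m] != f_month[m] and (
                text.find(a_month[m]) >= 0 or text.find(f_month[m]) >= 0):
            found.append((a_month[m], f_month[m]))
    return found


def differing_months_zipped(text, a_month, f_month):
    """Same result using zip() and the 'in' operator."""
    return [(am, fm) for am, fm in zip(a_month, f_month)
            if am != fm and (am in text or fm in text)]


print(differing_months_zipped(text, a_month, f_month))
```

The zipped version also visits index 0, but the empty placeholder pair is skipped because its two entries are equal.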
msg126345 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-01-15 18:23
Alexander,

1) Patch comments - thanks for those. Will have them fixed.
2) General strategy for implementing strptime. I must confess I don't fully understand the reason for doing what the _strptime module does. Standard C AFAIK has nothing of the sort - it only has strftime and strptime, both using a given format string. Neither tries to guess it from an actual formatted time! Does it exist just to circumvent platforms where strptime isn't implemented in C or is buggy? Can you please shed some light on this (or point me somewhere)?

With understanding of (2) I will be able to also logically reason about the next steps :-)
msg126347 - (view) Author: Alexander Belopolsky (Alexander.Belopolsky) Date: 2011-01-15 18:48
You pretty much hit the nail on the head. Some platforms don't have strptime, or did not have it at the time this code was written. The locale module is probably more recent than this code as well.
msg126358 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2011-01-16 05:28
Alexander, but still - this isn't just an implementation of strptime. AFAIU, strptime gets the format string as a parameter and uses it to parse a date string into a "tm" struct. So why do we need to parse a date string *without* a format string in Python, resorting to heuristics and pseudo-AI instead?
msg221924 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014-06-30 00:12
Eli,

Given your last comment, are you still proposing your patch for inclusion or should we take the #8915 approach?
History
Date User Action Args
2022-04-11 14:57:02 admin set github: 53203
2016-09-26 21:29:29 belopolsky set versions: + Python 3.7, - Python 3.5
2014-06-30 00:12:46 belopolsky set versions: + Python 3.5, - Python 3.3
    nosy: - Alexander.Belopolsky
    messages: + msg221924
    assignee: belopolsky ->
2013-10-08 13:11:18 eli.bendersky set nosy: - eli.bendersky
2011-01-16 05:28:45 eli.bendersky set nosy: belopolsky, vstinner, ezio.melotti, rpetrov, eli.bendersky, Alexander.Belopolsky
    messages: + msg126358
2011-01-15 18:48:39 Alexander.Belopolsky set nosy: + Alexander.Belopolsky
    messages: + msg126347
2011-01-15 18:23:12 eli.bendersky set nosy: belopolsky, vstinner, ezio.melotti, rpetrov, eli.bendersky
    messages: + msg126345
2011-01-15 16:47:23 belopolsky set nosy: belopolsky, vstinner, ezio.melotti, rpetrov, eli.bendersky
    messages: + msg126340
2011-01-15 16:42:26 belopolsky set nosy: belopolsky, vstinner, ezio.melotti, rpetrov, eli.bendersky
    messages: + msg126339
2011-01-15 07:20:49 eli.bendersky set files: + issue8957.py3k.1.patch
    messages: + msg126316
    keywords: + patch
    nosy: belopolsky, vstinner, ezio.melotti, rpetrov, eli.bendersky
2011-01-15 06:06:09 eli.bendersky set nosy: belopolsky, vstinner, ezio.melotti, rpetrov, eli.bendersky
    messages: + msg126313
2011-01-14 07:18:30 eli.bendersky set nosy: belopolsky, vstinner, ezio.melotti, rpetrov, eli.bendersky
    messages: + msg126235
2011-01-12 10:54:54 vstinner set nosy: belopolsky, vstinner, ezio.melotti, rpetrov, eli.bendersky
    messages: + msg126084
2011-01-12 00:34:42 belopolsky set nosy: belopolsky, vstinner, ezio.melotti, rpetrov, eli.bendersky
    messages: + msg126059
2011-01-12 00:26:39 rpetrov set nosy: belopolsky, vstinner, ezio.melotti, rpetrov, eli.bendersky
    messages: + msg126057
2011-01-12 00:09:29 belopolsky set nosy: belopolsky, vstinner, ezio.melotti, rpetrov, eli.bendersky
    title: strptime('%c', ..) fails to parse output of strftime('%c', ..) in non-English locale -> strptime(.., '%c') fails to parse output of strftime('%c', ..) in some locales
2011-01-12 00:08:40 belopolsky set status: closed -> open
    files: + cfmt.py
    dependencies: + Use locale.nl_langinfo in _strptime
    nosy: + eli.bendersky
    messages: + msg126055
    resolution: not a bug ->
2011-01-11 23:00:27 belopolsky set status: open -> closed
    nosy: belopolsky, vstinner, ezio.melotti, rpetrov
2011-01-11 22:45:49 belopolsky set nosy: belopolsky, vstinner, ezio.melotti, rpetrov
    dependencies: - Use locale.nl_langinfo in _strptime
    messages: + msg126045
    resolution: not a bug
2011-01-11 22:40:34 rpetrov set nosy: + rpetrov
    messages: + msg126043
2011-01-11 00:24:33 belopolsky set nosy: + vstinner
    messages: + msg125968
2011-01-11 00:18:19 belopolsky set nosy: belopolsky, ezio.melotti
    dependencies: + Use locale.nl_langinfo in _strptime
    messages: + msg125966
    versions: + Python 3.3, - Python 3.2
2010-06-13 08:12:04 ezio.melotti set nosy: + ezio.melotti
2010-06-09 20:25:05 belopolsky set files: + strptime-locale-bug.py
2010-06-09 20:24:38 belopolsky create