classification
Title: time zone tests fail on Windows
Type: Stage:
Components: Library (Lib) Versions: Python 2.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: brett.cannon Nosy List: brett.cannon, quiver
Priority: normal Keywords:

Created on 2004-10-03 03:44 by quiver, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
error.txt quiver, 2004-10-03 03:44
escape_re_strptime.diff brett.cannon, 2004-10-03 23:16 Escape all time strings before generating regex
escape_re_strptime23.diff quiver, 2004-10-05 18:56 patch against Python 2.3 branch
Messages (6)
msg22591 - (view) Author: George Yoshida (quiver) (Python committer) Date: 2004-10-03 03:44
The following tests fail on Windows 2000 (Japanese locale):

# test_strptime.py
test_compile (__main__.TimeRETests) ... FAIL
test_bad_timezone (__main__.StrptimeTests) ... ERROR
test_timezone (__main__.StrptimeTests) ... ERROR
test_day_of_week_calculation (__main__.CalculationTests) ... ERROR
test_gregorian_calculation (__main__.CalculationTests) ... ERROR
test_julian_calculation (__main__.CalculationTests) ... ERROR

# test_time.py
test_strptime (test.test_time.TimeTestCase) ... FAIL
===
They all stem from time zone tests and can be divided 
into two groups:

The FAIL in test_compile is basically the same as bug #883604.
 http://www.python.org/sf/883604
The local time zone names contain regular expression
metacharacters, but they are not escaped.
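
The escaping problem can be illustrated with a minimal sketch in current Python 3. This is not the code from _strptime.py; the helper name build_zone_pattern is hypothetical, and the zone name is the decoded cp932 value reported later in this thread.

```python
import re

def build_zone_pattern(names, escape=True):
    """Build an alternation regex from time zone names (hypothetical helper)."""
    parts = [re.escape(name) if escape else name for name in names]
    return re.compile("(?P<Z>" + "|".join(parts) + ")")

# Decoded form of the Windows time zone name, roughly 'Tokyo (standard time)'.
zone = "\u6771\u4eac (\u6a19\u6e96\u6642)"

unescaped = build_zone_pattern([zone], escape=False)
escaped = build_zone_pattern([zone], escape=True)

# Without escaping, the parentheses open a regex group instead of matching
# literal '(' and ')', so the zone name no longer matches itself.
print(unescaped.fullmatch(zone) is None)

# With re.escape() the name is treated as literal text and matches.
print(escaped.fullmatch(zone) is not None)
```

Escaping every locale-derived string before it goes into the pattern is the general idea; which strings the real module escapes is not shown here.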

The rest occur because strptime can't parse the
output of strftime.

>>> import time
>>> time.tzname
('\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)', '\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)')
>>> time.strptime(time.strftime('%Z', time.gmtime()))

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in -toplevel-
    time.strptime(time.strftime('%Z', time.gmtime()))
  File "C:\Python24\lib\_strptime.py", line 291, in strptime
    raise ValueError("time data did not match format:  data=%s  fmt=%s" %
ValueError: time data did not match format:  data=q¬ (–B)  fmt=%a %b %d %H:%M:%S %Y

The output of running test_time.py and test_strptime.py 
is attached.
msg22592 - (view) Author: George Yoshida (quiver) (Python committer) Date: 2004-10-03 15:05
I've found another bug.
Lines 167 and 169 of Lib/_strptime.py contain the expression:
 time.tzname[0].lower()

I guess this is intended to normalize letter case, but for
multibyte characters this is really dangerous.

>>> import time
>>> time.tzname[0]
'\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)'
>>> _.lower()
'\x93\x8c\x8b\x9e (\x95w\x8f\x80\x8e\x9e)'

\x95W and \x95w are not the same character.
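
The same effect can be reproduced in current Python 3, where byte strings and text are separate types. This is only a sketch of the hazard, not the module's code: it assumes the cp932 codec is available and uses the byte string shown above.

```python
# The raw cp932 bytes reported by time.tzname[0] in this thread.
raw = b"\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)"
text = raw.decode("cp932")

# bytes.lower() lowercases every ASCII letter byte, including 0x57 ('W'),
# which here is the trail byte of a two-byte character.
lowered = raw.lower()
print(lowered != raw)

# The lowered bytes no longer decode to the original text.
damaged = lowered.decode("cp932", errors="replace")
print(damaged != text)

# Decoding first and lowercasing the text leaves the name intact,
# because these characters have no case.
print(text.lower() == text)
```

Lowercasing is only safe once the name has been decoded to text; on the encoded bytes it can silently change one character into another.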
msg22593 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2004-10-03 23:16
The .lower() call is intended to normalize since capitalization is not 
standard across OSs.  But if it is a Unicode string it should be fine.  And 
even if it isn't, it is all lowercased for comparison anyway, so as long as 
it is consistent, shouldn't it still work?

As for your example of strptime not being able to parse, you have a bug 
in it; you forgot the format string.  It should have been 
``time.strptime(time.strftime('%Z'), '%Z')``.  Give that a run and let me 
know what the output is.
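
To make the difference concrete, here is a small sketch in current Python 3. It uses the literal string 'UTC' rather than the local zone name so that it does not depend on the machine's locale; that '%Z' accepts 'UTC' is an assumption about the modern implementation, not the 2.3 code discussed here.

```python
import time

# Without a format argument, strptime() falls back to the default
# "%a %b %d %H:%M:%S %Y", so a bare zone name cannot match.
try:
    time.strptime("UTC")
    default_format_failed = False
except ValueError as exc:
    default_format_failed = True
    print("default format rejected it:", exc)

# With an explicit '%Z' format the zone name parses, and the unspecified
# date fields take their defaults.
parsed = time.strptime("UTC", "%Z")
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)
```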

As for this whole multi-byte issue, is it all being returned as Unicode 
strings, or is it just a regular string?  In other words, what is 
``type(time.tzname[0])`` spitting out?  And what character encoding is 
all of this in (i.e., what should I pass to unicode so as to not have it raise 
UnicodeDecodeError)?

And finally, for the regex metacharacter stuff, why the hell are there 
parentheses in a timezone?!?  Whoever decided that was good did it just 
to upset me.  That does need to be fixed.  Apply the patch I just 
uploaded and let me know if it at least deals with that problem.

Have I mentioned I hate timezones?  In case I haven't, I do.  Thanks for 
catching this all, though, George.
msg22594 - (view) Author: George Yoshida (quiver) (Python committer) Date: 2004-10-05 18:56
bcannon wrote:

> The .lower() call is intended to normalize since capitalization
> is not standard across OSs.  But if it is a Unicode string it
> should be fine.  And even if it isn't, it is all lowercased for
> comparison anyway, so as long as it is consistent, shouldn't it
> still work?
Hmm.


> As for your example of strptime not being able to parse, you have
> a bug in it; you forgot the format string.  It should have been
> ``time.strptime(time.strftime('%Z'), '%Z')``.  Give that a run
> and let me know what the output is.

Yeah, it's my fault. I forgot to specify a format. Even so,
strptime couldn't parse the time zone.

> As for this whole multi-byte issue, is it all being returned as
> Unicode strings, or is it just a regular string?  In other
> words, what is ``type(time.tzname[0])`` spitting out?  And what
> character encoding is all of this in (i.e., what should I pass
> to unicode so as to not have it raise UnicodeDecodeError)?

It returns a plain string (not unicode), and the encoding is cp932.
This is the default encoding on Japanese Windows.

  >>> unicode(time.tzname[0], 'cp932')
  u'\u6771\u4eac (\u6a19\u6e96\u6642)'

> And finally, for the regex metacharacter stuff, why the hell
> are there parentheses in a timezone?!?  Whoever decided
> that was good did it just to upset me.

Ask M$ Japan :-;

I don't regard 'Tokyo (standard time)' as an acceptable
representation for time zone at all, but this is what Windows
returns as a time zone on my box.

> That does need to be fixed.  Apply the patch I just uploaded and let
> me know if it at least deals with that problem.

With your patch, all tests pass without any ERROR or FAIL, and
strftime <-> strptime conversions work well. This is a backport
candidate, so I created a new patch against Python 2.3 that uses
list comprehensions instead of generator expressions.

But there is one problem left.

In IDLE, strptime still can't parse the time zone. I haven't looked
into it in detail, but patch #590913 probably has something to do
with it. That patch sets the locale when IDLE starts up, and this can
affect the behavior of string-related functions and constants.

  [PEP 263 support in IDLE]
  http://www.python.org/sf/590913

# patch applied
>>> time.strptime(time.strptime('%Z'), '%Z')

Traceback (most recent call last):
  File "<pyshell#93>", line 1, in -toplevel-
    time.strptime(time.strptime('%Z'), '%Z')
  File "C:\Python24\lib\_strptime.py", line 291, in strptime
    if not found:
ValueError: time data did not match format:  data=%Z  fmt=%a %b %d %H:%M:%S %Y
>>> import locale
>>> locale.getlocale()
['Japanese_Japan', '932']  # culprit?


> Have I mentioned I hate timezones?  In case I haven't, I do.

I agree with you one hundred percent.

--George
msg22595 - (view) Author: George Yoshida (quiver) (Python committer) Date: 2004-10-05 19:12
A correction to my previous post:
There's nothing wrong with strptime on IDLE.

>>> import time
>>> time.strptime(time.strftime('%Z'), '%Z')
(1900, 1, 1, 0, 0, 0, 0, 1, 0)

Please close this bug and apply the patches.
Thanks Brett!
msg22596 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2004-10-06 02:17
rev. 1.33 on HEAD and rev. 1.23.4.5 on 2.3 have the fix.  Thanks for the 
help, George.
History
Date User Action Args
2022-04-11 14:56:07 admin set github: 40980
2004-10-03 03:44:49 quiver create