Issue2736
Created on 2008-05-01 21:03 by tebeka, last changed 2009-01-13 23:42 by haypo.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| add-datetime-totimestamp-method.diff | hodgestar, 2008-05-10 14:55 | Implementation of datetime.datetime.totimestamp and tests. |
| add-datetime-totimestamp-method-docs.diff | hodgestar, 2008-05-10 15:54 | |
| datetime_totimestamp-3.patch | haypo, 2008-12-12 01:15 | |
| Messages (26) | |||
|---|---|---|---|
| msg66045 - (view) | Author: Miki Tebeka (tebeka) | Date: 2008-05-01 21:03 | |
If you try to convert datetime objects to seconds since epoch and back it will not work since the microseconds get lost:
>>> dt = datetime(2008, 5, 1, 13, 35, 41, 567777)
>>> seconds = mktime(dt.timetuple())
>>> datetime.fromtimestamp(seconds) == dt
False
Current fix is to do
>>> seconds += (dt.microsecond / 1000000.0)
>>> datetime.fromtimestamp(seconds) == dt
True |
|||
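The workaround in the report can be wrapped in a small helper. This is only a minimal sketch, not the patch discussed below: the name `to_float_timestamp` is hypothetical, and it assumes a naive datetime expressed in local time, as in the original session.

```python
from datetime import datetime
from time import mktime


def to_float_timestamp(dt):
    """Return seconds since the epoch for a naive local datetime.

    Hypothetical helper: mktime() only sees whole seconds, so the
    microseconds are added back afterwards.
    """
    return mktime(dt.timetuple()) + dt.microsecond / 1000000.0


dt = datetime(2008, 5, 1, 13, 35, 41, 567777)

# Going through timetuple() alone drops the microseconds.
truncated = datetime.fromtimestamp(mktime(dt.timetuple()))

# Adding them back restores the original value for this example.
restored = datetime.fromtimestamp(to_float_timestamp(dt))
```

The result is still a float, so it inherits the precision limits of floating point discussed later in this thread.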
| msg66140 - (view) | Author: Pedro Werneck (werneck) | Date: 2008-05-03 02:18 | |
That's expected, as mktime is just a thin wrapper over libc mktime() and it does not expect microseconds. Changing time.mktime doesn't seem an option, so the best alternative is to implement a method in the datetime type. Is there a real demand for C code implementing this to justify it? |
|||
| msg66532 - (view) | Author: Simon Cross (hodgestar) | Date: 2008-05-10 14:55 | |
Attached a patch which adds a .totimestamp(...) method to datetime.datetime and tests for it. The intention is that the dt.totimestamp(...) method is equivalent to: mktime(dt.timetuple()) + (dt.microsecond / 1000000.0) |
|||
| msg66539 - (view) | Author: Simon Cross (hodgestar) | Date: 2008-05-10 15:54 | |
Patch adding documentation for datetime.totimestamp(...). |
|||
| msg66601 - (view) | Author: Miki Tebeka (tebeka) | Date: 2008-05-11 06:12 | |
I think the name is not good; it should be "toepoch" or something like that. |
|||
| msg66610 - (view) | Author: Neil Muller (Neil Muller) | Date: 2008-05-11 08:25 | |
datetime has fromtimestamp already, so using totimestamp keeps naming consistency (see toordinal and fromordinal). |
|||
| msg75723 - (view) | Author: STINNER Victor (haypo) | Date: 2008-11-11 03:12 | |
See also issue1673409 |
|||
| msg75899 - (view) | Author: STINNER Victor (haypo) | Date: 2008-11-15 00:33 | |
I like the method, but I have some comments about the patch:
- datetime_totimestamp() is not well indented
- "PyObject *time" should be defined before the first instruction
- why not use "if (time == NULL) return NULL;" directly instead of using a block for the case where time is not NULL?
- there are reference leaks: timetuple, timestamp and PyFloat_FromDouble()
I wrote a similar patch before reading add-datetime-totimestamp-method.diff which does exactly the same... I attach my patch but both should be merged. |
|||
| msg75900 - (view) | Author: STINNER Victor (haypo) | Date: 2008-11-15 00:41 | |
Here is a merged patch of the three patches. Except the C implementation of datetime_totimestamp() (written by me), all code is written by hodgestar. |
|||
| msg75902 - (view) | Author: Alexander Belopolsky (belopolsky) | Date: 2008-11-15 01:15 | |
I would like to voice my opposition to the totimestamp method. Representing time as a float is a really bad idea (originated at Microsoft as I have heard). In addition to the usual numeric problems when dealing with floating point, the resolution of a floating point timestamp varies from year to year, making it impossible to represent high resolution historical data. In my opinion time.time() returning a float and datetime.fromtimestamp() taking a float are both design mistakes, and adding a totimestamp that produces a float will further promote a bad practice. I would not mind integer based to/from timestamp methods taking and producing seconds or even (second, microsecond) tuples, but I don't think changing fromtimestamp behavior is an option. |
|||
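The point about resolution varying with the date can be illustrated by measuring the gap between adjacent doubles near a given timestamp. This is a sketch under stated assumptions: `float_spacing` and `seconds_since_epoch` are hypothetical helpers, IEEE 754 doubles are assumed, and naive datetimes are treated as UTC.

```python
import math
from datetime import datetime

EPOCH = datetime(1970, 1, 1)  # assumption: naive datetimes are UTC


def float_spacing(x):
    """Gap between a positive double x and the next larger double."""
    _mantissa, exponent = math.frexp(x)  # x == mantissa * 2**exponent
    return math.ldexp(1.0, exponent - 53)  # doubles carry 53 significant bits


def seconds_since_epoch(dt):
    """Float seconds between EPOCH and the naive datetime dt."""
    return (dt - EPOCH).total_seconds()


# Near the date of this discussion the spacing is below one microsecond...
spacing_2008 = float_spacing(seconds_since_epoch(datetime(2008, 11, 15)))

# ...but near the upper end of the datetime range it is coarser than that.
spacing_9999 = float_spacing(seconds_since_epoch(datetime(9999, 12, 31)))
```

Under these assumptions a float timestamp can distinguish individual microseconds in 2008 but not for dates far enough from the epoch.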
| msg75903 - (view) | Author: STINNER Victor (haypo) | Date: 2008-11-15 01:37 | |
On Saturday 15 November 2008 02:15:30, Alexander Belopolsky wrote:
> I don't think changing fromtimestamp behavior is an option.
It's too late to break the API (Python3 is in RC stage ;-)), but we can create new methods like:
datetime.fromepoch(seconds, microseconds=0) # (int/long, int)
datetime.toepoch() -> (seconds, microseconds) # (int/long, int) |
|||
| msg75904 - (view) | Author: Alexander Belopolsky (belopolsky) | Date: 2008-11-15 03:17 | |
On Fri, Nov 14, 2008 at 8:37 PM, STINNER Victor <report@bugs.python.org> wrote:
> .. but we can create new methods like:
> datetime.fromepoch(seconds, microseconds=0) # (int/long, int)
While 1970 is the most popular epoch, I've seen 1900, 2000 and even 2035 (!) being used as well. Similarly, nanoseconds are used in high resolution time sources at least as often as microseconds. This makes fromepoch() ambiguous and it is really unnecessary because it can be written as epoch + timedelta(0, seconds, microseconds).
> datetime.toepoch() -> (seconds, microseconds) # (int/long, int)
I would much rather have divmod implemented as you suggested in issue2706. Then toepoch is simply
def toepoch(d):
    x, y = divmod(d, timedelta(0, 1))
    return x, y.microseconds |
|||
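Both suggestions in the message above can be sketched with plain datetime arithmetic. The names `from_epoch` and `to_epoch` are hypothetical, the epoch is assumed to be 1970-01-01 as a naive UTC datetime, and `divmod()` on timedelta objects requires Python 3.2 or later.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumption: naive datetimes are UTC


def from_epoch(seconds, microseconds=0):
    """Build a datetime from integer seconds and microseconds since EPOCH."""
    return EPOCH + timedelta(0, seconds, microseconds)


def to_epoch(dt):
    """Split the distance from EPOCH into (seconds, microseconds) integers."""
    whole, rest = divmod(dt - EPOCH, timedelta(seconds=1))
    return whole, rest.microseconds


stamp = to_epoch(datetime(2008, 11, 24, 18, 7, 50, 216762))
before_epoch = to_epoch(datetime(1969, 12, 31, 23, 59, 59, 500000))
```

Because `divmod` floors, a time before the epoch yields a negative seconds part and a non-negative microseconds part, so the pair stays unambiguous.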
| msg75912 - (view) | Author: STINNER Victor (haypo) | Date: 2008-11-15 12:19 | |
On Saturday 15 November 2008 04:17:50, Alexander Belopolsky wrote:
> it is really unnecessary because it can be
> written as epoch + timedelta(0, seconds, microseconds).
I tried yesterday and it doesn't work!
datetime.datetime(1970, 1, 1, 1, 0)
>>> t1 = epoch + timedelta(seconds=-1660000000)
>>> t2 = datetime.fromtimestamp(-1660000000)
>>> t2
datetime.datetime(1917, 5, 26, 1, 53, 20)
>>> t1 - t2
datetime.timedelta(0)
>>> t2 = datetime.fromtimestamp(-1670000000)
>>> t2
datetime.datetime(1917, 1, 30, 7, 6, 40)
>>> t1 = epoch + timedelta(seconds=-1670000000)
>>> t1 - t2
datetime.timedelta(0, 3600)
We lost an hour during the 1st World War :-)
Whereas my implementation using mktime() works:
-1670000000.0 |
|||
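The discrepancy above comes from adding a timedelta to a naive local-time epoch while the local UTC offset differed between the two dates. One way to sidestep it, sketched here with a hypothetical `from_utc_seconds` helper, is to do the arithmetic on aware UTC datetimes and convert to local time only for display.

```python
from datetime import datetime, timedelta, timezone

UTC_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_utc_seconds(seconds):
    """Aware UTC datetime for integer seconds since 1970-01-01 UTC."""
    return UTC_EPOCH + timedelta(seconds=seconds)


t_later = from_utc_seconds(-1660000000)
t_earlier = from_utc_seconds(-1670000000)

# In UTC the two instants are exactly as far apart as their timestamps.
gap = t_later - t_earlier
```

This does not replace a local-time conversion that knows the historical offsets; it only keeps the epoch arithmetic itself free of them.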
| msg76003 - (view) | Author: Anders J. Munch (andersjm) | Date: 2008-11-18 09:05 | |
Any thoughts to time zone/DST handling for naive datetime objects? E.g. suppose the datetime object was created by .utcnow or .utcfromtimestamp. For aware datetime objects, I think the time.mktime(dt.timetuple()) approach doesn't work; the tz info is lost in the conversion to time tuple. |
|||
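For aware datetime objects, one possible approach is to normalise to UTC before converting, so the tzinfo is not silently dropped. This is a sketch, not part of any attached patch: `aware_to_timestamp` is a hypothetical name, and the fixed +01:00 offset is chosen only for illustration.

```python
import calendar
from datetime import datetime, timedelta, timezone


def aware_to_timestamp(dt):
    """(seconds, microseconds) since 1970-01-01 UTC for an aware datetime."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("expected an aware datetime")
    # utctimetuple() applies the UTC offset; timegm() interprets the
    # resulting tuple as UTC, unlike mktime(), which assumes local time.
    return calendar.timegm(dt.utctimetuple()), dt.microsecond


plus_one = timezone(timedelta(hours=1))  # fixed offset, for illustration only
local = datetime(2008, 11, 24, 18, 7, 50, 216762, tzinfo=plus_one)
same_instant = local.astimezone(timezone.utc)
```

Two aware datetimes that denote the same instant then map to the same pair, whatever their tzinfo.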
| msg76324 - (view) | Author: David Fraser (davidfraser) | Date: 2008-11-24 14:04 | |
----- "Alexander Belopolsky" <report@bugs.python.org> wrote:
> I would like to voice my opposition the totimestamp method.
>
> Representing time as a float is a really bad idea (originated at
> Microsoft as I have heard). In addition to the usual numeric problems
> when dealing with the floating point, the resolution of the floating
> point timestamp varies from year to year making it impossible to
> represent high resolution historical data.
>
> In my opinion both time.time() returning float and
> datetime.fromtimestamp() taking a float are both design mistakes and
> adding totimestamp that produces a float will further promote a bad
> practice.
The point for me is that having to interact with Microsoft systems that require times means that the conversions have to be done. Is it better to have everybody re-implement this, with their own bugs, or to have a standard implementation? I think it's clearly better to have it as a method on the object. Of course, we should put docs in describing the pitfalls of this approach... |
|||
| msg76327 - (view) | Author: Alexander Belopolsky (belopolsky) | Date: 2008-11-24 15:17 | |
On Mon, Nov 24, 2008 at 9:04 AM, David Fraser <report@bugs.python.org> wrote:
...
> The point for me is that having to interact with Microsoft systems that require times means that the conversions have to be done.
I did not see the "epoch" proposal as a feature for interoperability with Microsoft systems. If this is the goal, a deeper analysis of the Microsoft standards is in order. For example, what is the valid range of the floating point timestamp? What is the range for which the fromepoch (float to datetime) translation is valid? For example, if all floats are valid timestamps, then fromepoch can be limited to +/- 2**31 or to a smaller range where a float has enough precision to roundtrip microseconds.
> Is it better to have everybody re-implement this, with their own bugs, or to have a standard implementation?
As far as I know, interoperability with Microsoft systems requires re-implementation of their bugs, many of which are not documented. For example, OOXML requires that 1900 be treated as a leap year at least in some cases. When you write your own implementation, at least you have the source code to your own bugs.
> I think it's clearly better to have it as a method on the object. Of course, we should put docs in describing the pitfalls of this approach...
Yes, having a well documented high resolution "time since epoch" to "local datetime" method in the datetime module is helpful if non-trivial timezones (such as the one Victor lives in) are supported. However, introducing floating point pitfalls into the already overcomplicated realm of calendar calculations would be a mistake. I believe the correct approach would be to extend fromtimestamp (and utcfromtimestamp) to accept a (seconds, microseconds) tuple as an alternative (and in addition) to the float timestamp. Then totimestamp can be implemented to return such a tuple, so that fromtimestamp(totimestamp(dt)) == dt for any datetime dt and totimestamp(fromtimestamp((s, us))) == (s, us) for any s and us within the datetime valid range (note that s will have to be a long integer to achieve that). In addition, exposing the system gettimeofday in the time module to produce (s, us) tuples may be helpful to move away from float timestamps produced by time.time(), but with totimestamp as proposed above that would be equivalent to datetime.now().totimestamp(). |
|||
| msg76329 - (view) | Author: STINNER Victor (haypo) | Date: 2008-11-24 15:46 | |
About the timestamp, there are many formats:
(a) UNIX: 32 bits signed integer, number of seconds since the 1st January 1970.
- file format: gzip header, Portable Executable (PE, Windows), compiled python script header (.pyc/.pyo)
- file system: ext2 and ext3
(b) UNIX64: 64 bits signed integer, number of seconds since the 1st January 1970
- file format: Gnome keyring
(c) UNIX: 32 bits unsigned integer, number of seconds since the 1st January 1904
- file format: True Type Font (.ttf), iTunes database, AIFF, .MOV
(d) UUID60: 60 bits unsigned integer, number of 1/10 microseconds since the 15th October 1582
- all formats using UUID version 1 (also known as "GUID" in the Microsoft world)
(e) Win64: 64 bits unsigned integer, number of 1/10 microseconds since the 1st January 1601
- file format: Microsoft Office documents (.doc, .xls, etc), ASF video (.asf), Windows link (.lnk)
- file system: NTFS
(f) MSDOS DateTime or TimeDate: bitfield with 16 bits for the date and 16 bits for the time. Time precision is 2 seconds, year is in range [1980; 2107]
- file format: Windows link (.lnk), CAB archive (.cab), Word document (.doc), ACE archive (.ace), ZIP archive (.zip), RAR archive (.rar) |
|||
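Format (f) is the only one in the list that is a bitfield rather than a counter. A minimal sketch of packing and unpacking it follows; the function names are hypothetical, and the bit layout used (7 bits year offset, 4 bits month, 5 bits day; 5 bits hour, 6 bits minute, 5 bits for seconds divided by two) is the commonly documented MS-DOS layout, which the message above does not spell out.

```python
from datetime import datetime


def pack_msdos(dt):
    """Return (date_word, time_word), two 16-bit integers, for dt."""
    if not 1980 <= dt.year <= 2107:
        raise ValueError("year out of range for an MS-DOS timestamp")
    date_word = ((dt.year - 1980) << 9) | (dt.month << 5) | dt.day
    # Seconds are stored divided by two, hence the 2-second precision.
    time_word = (dt.hour << 11) | (dt.minute << 5) | (dt.second // 2)
    return date_word, time_word


def unpack_msdos(date_word, time_word):
    """Inverse of pack_msdos(); odd seconds cannot be recovered."""
    return datetime(
        1980 + (date_word >> 9),
        (date_word >> 5) & 0x0F,
        date_word & 0x1F,
        time_word >> 11,
        (time_word >> 5) & 0x3F,
        (time_word & 0x1F) * 2,
    )


words = pack_msdos(datetime(2008, 11, 24, 18, 7, 51))
roundtrip = unpack_msdos(*words)
```

The 7-bit year offset is what gives the [1980; 2107] range quoted in the list.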
| msg76331 - (view) | Author: STINNER Victor (haypo) | Date: 2008-11-24 16:00 | |
Timedelta formats:
(a) Win64: 64 bits unsigned integer, number of 1/10 microsecond
- file format: Microsoft Word document (.doc), ASF video (.asf)
(b) 64 bits float, number of seconds
- file format: AMF metadata used in Flash video (.flv)
Other file formats use multiple numbers to store a duration:
[AVI video]
- 3 integers (32 bits unsigned): length, rate, scale
- seconds = length / (rate / scale)
- (seconds = length * scale / rate)
[WAV audio]
- 2 integers (32 bits unsigned): us_per_frame, total_frame
- seconds = total_frame * (1000000 / us_per_frame)
[Ogg Vorbis]
- 2 integers: sample_rate (32 bits unsigned), position (64 bits unsigned)
- seconds = position / sample_rate |
|||
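Some of the duration formulas above can be turned into timedelta objects without going through floats. This is a sketch only: the function names are hypothetical, exact rational arithmetic via `fractions.Fraction` is a design choice made here, and results are rounded to the microsecond resolution of timedelta.

```python
from datetime import timedelta
from fractions import Fraction


def _to_timedelta(seconds):
    """Round an exact Fraction of seconds to the nearest microsecond."""
    return timedelta(microseconds=round(seconds * 1000000))


def avi_duration(length, rate, scale):
    """AVI: seconds = length * scale / rate."""
    return _to_timedelta(Fraction(length * scale, rate))


def ogg_duration(position, sample_rate):
    """Ogg Vorbis: seconds = position / sample_rate."""
    return _to_timedelta(Fraction(position, sample_rate))


def win64_duration(ticks):
    """Win64: one tick is 1/10 microsecond, i.e. 10**7 ticks per second."""
    return _to_timedelta(Fraction(ticks, 10000000))


clip = avi_duration(length=3000, rate=30000, scale=1001)
track = ogg_duration(position=441000, sample_rate=44100)
```

Win64 durations that are not a multiple of ten ticks lose their last digit, since timedelta cannot represent anything finer than a microsecond.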
| msg76332 - (view) | Author: Alexander Belopolsky (belopolsky) | Date: 2008-11-24 16:14 | |
That's an impressive summary, but what is your conclusion? I don't see any format that will benefit from a subsecond timedelta.totimestamp(). Your examples have either multisecond or submicrosecond resolution.
On Mon, Nov 24, 2008 at 11:00 AM, STINNER Victor <report@bugs.python.org> wrote:
> Timedelta formats:
>
> (a) Win64: 64 bits unsigned integer, number of 1/10 microsecond
> - file format: Microsoft Word document (.doc), ASF video (.asf)
>
> (b) 64 bits float, number of seconds
> - file format: AMF metadata used in Flash video (.flv)
>
> Other file formats use multiple numbers to store a duration:
>
> [AVI video]
> - 3 integers (32 bits unsigned): length, rate, scale
> - seconds = length / (rate / scale)
> - (seconds = length * scale / rate)
>
> [WAV audio]
> - 2 integers (32 bits unsigned): us_per_frame, total_frame
> - seconds = total_frame * (1000000 / us_per_frame)
>
> [Ogg Vorbis]
> - 2 integers: sample_rate (32 bits unsigned), position (64 bits unsigned)
> - seconds = position / sample_rate |
|||
| msg76340 - (view) | Author: STINNER Victor (haypo) | Date: 2008-11-24 17:13 | |
Ooops, timestamp (c) is the *Mac* timestamp: seconds since the 1st January 1904.
> what is your conclusion?
Hum, it's maybe not possible to choose between integer and float. Why not support both? Example:
- totimestamp()->int: truncate microseconds
- totimestamp(microseconds=True)->float: with microseconds
Attached file (timestamp.py) is a module to import/export timestamps in all listed timestamp formats. It's written in pure Python.
----------------
>>> import timestamp
>>> from datetime import datetime
>>> now = datetime.now()
>>> now
datetime.datetime(2008, 11, 24, 18, 7, 50, 216762)
>>> timestamp.exportUnix(now)
1227550070
>>> timestamp.exportUnix(now, True)
1227550070.2167621
>>> timestamp.exportMac(now)
3310394870L
>>> timestamp.exportWin64(now)
128720236702167620L
>>> timestamp.exportUUID(now)
134468428702167620L
>>> timestamp.importMac(3310394870)
datetime.datetime(2008, 11, 24, 18, 7, 50)
>>> timestamp.importUnix(1227550070)
datetime.datetime(2008, 11, 24, 18, 7, 50)
>>> timestamp.importUnix(1227550070.2167621)
datetime.datetime(2008, 11, 24, 18, 7, 50, 216762)
----------------
It supports int and float types for import and export. |
|||
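The timestamp.py attachment was later removed from this issue (see History), so its code is not available here. The following is an independent sketch of comparable export helpers, not a reconstruction of that module: the function names are hypothetical, and naive datetimes are treated as UTC, an assumption that happens to be consistent with the integers shown in the session above.

```python
from datetime import datetime, timedelta

# Epochs of the formats listed earlier in this thread.
UNIX_EPOCH = datetime(1970, 1, 1)
MAC_EPOCH = datetime(1904, 1, 1)
WIN64_EPOCH = datetime(1601, 1, 1)
UUID_EPOCH = datetime(1582, 10, 15)

SECOND = timedelta(seconds=1)
MICROSECOND = timedelta(microseconds=1)


def export_unix(dt):
    """Whole seconds since 1970-01-01 (microseconds truncated)."""
    return (dt - UNIX_EPOCH) // SECOND


def export_mac(dt):
    """Whole seconds since 1904-01-01."""
    return (dt - MAC_EPOCH) // SECOND


def export_win64(dt):
    """Units of 1/10 microsecond since 1601-01-01."""
    return ((dt - WIN64_EPOCH) // MICROSECOND) * 10


def export_uuid(dt):
    """Units of 1/10 microsecond since 1582-10-15."""
    return ((dt - UUID_EPOCH) // MICROSECOND) * 10


def import_unix(seconds, microseconds=0):
    """Inverse of export_unix(), with an optional microseconds part."""
    return UNIX_EPOCH + timedelta(seconds=seconds, microseconds=microseconds)


now = datetime(2008, 11, 24, 18, 7, 50, 216762)
```

All arithmetic stays in integers, so nothing is lost except the sub-microsecond digit that datetime cannot hold in the first place.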
| msg76344 - (view) | Author: Alexander Belopolsky (belopolsky) | Date: 2008-11-24 17:33 | |
On Mon, Nov 24, 2008 at 12:13 PM, STINNER Victor <report@bugs.python.org> wrote:
..
> Hum, it's maybe not possible to choose between integer and float. Why
> not supporting both? Example:
> - totimestamp()->int: truncate microseconds
> - totimestamp(microseconds=True)->float: with microseconds
I would still prefer totimestamp()->(int, int) returning a (sec, usec) tuple. The important benefit is that such a totimestamp() will not lose information and will support more formats than either of your ->int or ->float variants. The ->int can then be spelt simply as totimestamp()[0], and on systems with numpy (which is likely for users that deal with floats a lot), totimestamp(microseconds=True) is simply dot([1, 1e-6], totimestamp()). (And s, us = totimestamp(); return s + us * 1e-6 is not that hard either.) |
|||
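The two derived forms mentioned above need no numpy. A minimal sketch, assuming a (seconds, microseconds) tuple such as the proposed totimestamp() would return; the helper names are hypothetical.

```python
def whole_seconds(stamp):
    """Integer variant: drop the microseconds part of the tuple."""
    return stamp[0]


def as_float(stamp):
    """Float variant: combine both parts, accepting float rounding."""
    seconds, microseconds = stamp
    return seconds + microseconds * 1e-6


stamp = (1227550070, 216762)  # value taken from the session quoted earlier
```

Going from the tuple to either form is trivial, while going back from a float to the exact tuple is not always possible, which is the argument for making the tuple the primary representation.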
| msg76345 - (view) | Author: STINNER Victor (haypo) | Date: 2008-11-24 17:34 | |
> > Hum, it's maybe not possible to choose between integer and float. Why
> > not supporting both? Example:
> > - totimestamp()->int: truncate microseconds
> > - totimestamp(microseconds=True)->float: with microseconds
>
> I would still prefer totimestamp()->(int, int) returning (sec, usec)
> tuple. The important benefit is that such totimestamp() will not
> lose information
Right, I prefer your solution ;-) |
|||
| msg76351 - (view) | Author: Alexander Belopolsky (belopolsky) | Date: 2008-11-24 18:07 | |
On Mon, Nov 24, 2008 at 12:34 PM, STINNER Victor <report@bugs.python.org> wrote:
..
>> I would still prefer totimestamp()->(int, int) returning (sec, usec)
>> tuple. The important benefit is that such totimestamp() will not
>> lose information
>
> Right, I prefer your solution ;-)
>
Great! What do you think about extending fromtimestamp(timestamp[, tz]) and utcfromtimestamp(timestamp) to accept a tuple for the timestamp? Also, are you motivated enough to bring this up on python-dev to get the community's and the BDFL's blessing? I think this has a chance to be approved. |
|||
| msg76352 - (view) | Author: David Fraser (davidfraser) | Date: 2008-11-24 18:09 | |
----- "STINNER Victor" <report@bugs.python.org> wrote:
> Timedelta formats:
>
> (a) Win64: 64 bits unsigned integer, number of 1/10 microsecond
> - file format: Microsoft Word document (.doc), ASF video (.asf)
>
> (b) 64 bits float, number of seconds
> - file format: AMF metadata used in Flash video (.flv)
There are also the PyWinTime objects returned by PythonWin COM calls, which are basically FILETIMEs. I don't have time to get the details now, but I recently submitted a patch to make them work with milliseconds - see http://sourceforge.net/tracker/index.php?func=detail&aid=2209864&group_id=78018&atid=551954 (yes I know this is a bit off-topic here) |
|||
| msg77650 - (view) | Author: STINNER Victor (haypo) | Date: 2008-12-12 01:15 | |
belopolsky will be happy to see this new version of my patch:
- datetime.totimestamp() => (seconds, microseconds): two integers
- the datetime.totimestamp() implementation doesn't use Python's time.mktime() but calls the C version of mktime() directly, because time.mktime() creates a float value
- fix time.mktime() to support the timestamp -1 (first second before the epoch) to make it consistent with datetime.totimestamp(), which also supports this value
- fix documentation: it's microseconds (10^-6) and not milliseconds (10^-3) |
|||
| msg77651 - (view) | Author: STINNER Victor (haypo) | Date: 2008-12-12 01:19 | |
About mktime() -> -1: see Issue1726687 (I found the fix in this issue). The next job will be to patch datetime.(utc)fromtimestamp() to support (int, int). I tried to write such a patch but it's not easy because fromtimestamp() will have to support: int, long, float, (int, int), (int, long), (long, int) and (long, long). And I don't know if a "long" value can be converted to "time_t". |
|||
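In pure Python, accepting several input types is less painful than in C, because Python 3 has a single arbitrary-precision int type. The following is a hypothetical sketch of such a dispatcher, not the patch described above: the name `from_timestamp` is made up, and naive datetimes are treated as UTC for simplicity.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumption: naive datetimes are UTC


def from_timestamp(value):
    """Accept int, float or a (seconds, microseconds) tuple of integers."""
    if isinstance(value, tuple):
        seconds, microseconds = value
        return EPOCH + timedelta(seconds=seconds, microseconds=microseconds)
    if isinstance(value, (int, float)):
        # A float is rounded to the nearest microsecond by timedelta.
        return EPOCH + timedelta(seconds=value)
    raise TypeError("expected int, float or (int, int) tuple")


from_tuple = from_timestamp((1227550070, 216762))
from_int = from_timestamp(1227550070)
from_float = from_timestamp(1227550070.5)
```

Because the arithmetic goes through timedelta rather than a C time_t, the question of whether a large integer fits into time_t does not arise in this sketch.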
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2009-01-13 23:42:58 | haypo | set | files: - timestamp.py |
| 2009-01-13 23:42:46 | haypo | set | files: - datetime_totimestamp-2.patch |
| 2009-01-13 23:42:41 | haypo | set | files: - datetime_totimestamp.patch |
| 2008-12-12 01:19:58 | haypo | set | messages: + msg77651 |
| 2008-12-12 01:15:50 | haypo | set | files: + datetime_totimestamp-3.patch messages: + msg77650 |
| 2008-11-24 18:09:25 | davidfraser | set | messages: + msg76352 |
| 2008-11-24 18:07:44 | belopolsky | set | messages: + msg76351 |
| 2008-11-24 17:34:56 | haypo | set | messages: + msg76345 |
| 2008-11-24 17:33:05 | belopolsky | set | messages: + msg76344 |
| 2008-11-24 17:13:43 | haypo | set | files: + timestamp.py messages: + msg76340 |
| 2008-11-24 16:14:29 | belopolsky | set | messages: + msg76332 |
| 2008-11-24 16:00:40 | haypo | set | messages: + msg76331 |
| 2008-11-24 15:46:11 | haypo | set | messages: + msg76329 |
| 2008-11-24 15:17:37 | belopolsky | set | messages: + msg76327 |
| 2008-11-24 14:04:48 | davidfraser | set | messages: + msg76324 |
| 2008-11-18 09:05:29 | andersjm | set | nosy: + andersjm messages: + msg76003 |
| 2008-11-15 12:19:37 | haypo | set | messages: + msg75912 |
| 2008-11-15 03:17:48 | belopolsky | set | messages: + msg75904 |
| 2008-11-15 01:37:40 | haypo | set | messages: + msg75903 |
| 2008-11-15 01:15:28 | belopolsky | set | nosy: + belopolsky messages: + msg75902 |
| 2008-11-15 00:41:07 | haypo | set | files: + datetime_totimestamp-2.patch messages: + msg75900 |
| 2008-11-15 00:33:12 | haypo | set | files: + datetime_totimestamp.patch messages: + msg75899 |
| 2008-11-11 03:12:38 | haypo | set | nosy: + haypo messages: + msg75723 |
| 2008-05-11 08:25:30 | Neil Muller | set | nosy: + Neil Muller messages: + msg66610 |
| 2008-05-11 06:12:39 | tebeka | set | messages: + msg66601 |
| 2008-05-10 20:48:17 | davidfraser | set | nosy: + davidfraser |
| 2008-05-10 15:54:41 | hodgestar | set | files: + add-datetime-totimestamp-method-docs.diff messages: + msg66539 |
| 2008-05-10 14:55:39 | hodgestar | set | files: + add-datetime-totimestamp-method.diff keywords: + patch messages: + msg66532 nosy: + hodgestar |
| 2008-05-03 02:18:59 | werneck | set | nosy: + werneck messages: + msg66140 |
| 2008-05-01 21:03:25 | tebeka | create | |