Skip to content

Drill Date Time Types

Paul Rogers edited this page Mar 20, 2017 · 11 revisions

Drill provides a number of Date/Time types, each in various stages of completion.

The date/time types include:

Drill Type Minor Type SQL Type Value Vector
DATE DATE DATE DateVector
TIME TIME TIME TimeVector
N/A TIMETZ N/A
N/A TIMESTAMPTZ N/A
N/A TIMESTAMP TIMESTAMP TimeStampVector
INTERVAL FROM ... TO ... INTERVAL N/A IntervalVector
N/A INTERVALYEAR N/A IntervalYearVector
N/A INTERVALDAY N/A IntervalDayVector

Documentation information:

Date

Documentation: Drill stores values in Coordinated Universal Time (UTC). Drill supports time functions in the range 1971 to 2037.

Minor Type: DATE
Vector class: DateVector
Width: 8 bytes
Java type: long
Drill SQL constant: date '2017-03-14'
Encoding: Number of ms since 1970-01-01T00:00:00 in the server time zone

Dates are assumed to be in the server time zone. For example CURRENT_DATE returns today, midnight as the number of ms since 1970-01-01T00:00:00 in the server time zone. Presumably, if time conversion functions convert date strings to DATE columns, the values are assumed to be in server local time.

As long as code does not compare values to CURRENT_DATE, the values can be assumed to be in any time zone, allowing Drill to represent relative dates: 2017-03-17 as a concept rather than as a point in time.

Although the documentation says that DATE is stored in UTC, that is only true if the server time zone is UTC, else DATE is stored in the server local time zone. Also, as with TIMESTAMP, the actual range is 1903-2037.

Time

Documentation: 24-hour based time before or after January 1, 2001 in hours, minutes, seconds. Format: HH:mm:ss Drill stores values in Coordinated Universal Time (UTC). Drill supports time functions in the range 1971 to 2037.

Minor Type: TIME
Vector class: TimeVector
Width: 4 bytes
Java type: long

Drill time is an absolute time: it is measured in UTC. This means Drill cannot represent the idea of "let's meet at 2 PM" independent of "2 PM, PST" or "2 PM PDT". Further, since Time include seconds since 2001, it is also a date.

TimeTZ

Minor Type: TIMETZ

Defined in MinorTypes but never implemented.

TimeStamp

Documentation: JDBC timestamp in year, month, date hour, minute, second, and optional milliseconds format: yyyy-MM-dd HH:mm:ss.SSS Drill stores values in Coordinated Universal Time (UTC). Drill supports time functions in the range 1971 to 2037. Drill does not support TIMESTAMP with time zone

Minor Type: TIMESTAMP
Vector class: TimeStampVector
Width: 8 bytes
Java type: long
Representation: Number of milliseconds after the Unix epoch, 1970-01-01T00:00:00 server local time.
JDBC: getTimestamp() with a time zone that depends on the difference between the client and server time zones.

Although the documentation implies that TIMESTAMP is UTC, it is, in fact, in server local time.

Although the documentation says the range is 1971-2037, the range is actually 1903-2037 (the full range of a signed long.)

The NOW() function returns the current time in server local time,

The JDBC driver tries to convert the value to UTC but it does so by converting the server local time to UTC using the client time zone. This works only if the client and server are in the same time zone. Otherwise the returned time is not UTC, but UTC + (client offset - server offset). This is deemed to be a feature.

It seems that JDBC users can learn the server time zone (via a means other than Drill), then pass that to an alternative JDBC method:

   Timestamp realUtc = resultSet.getTimestamp(serverTimeZone);

Common advice is to run both the client and servers in the UTC time zone. Then, a TIMESTAMP is UTC (because the server time zone is UTC), and the client will convert from server time to UTC correctly.

See [this note|http://drill.apache.org/docs/data-type-conversion/#time-zone-limitation] and [this blog entry|http://www.openkb.info/2015/05/understanding-drills-timestamp-and.html#.VUzhotpVhHw] explain the time zone issues, and their workarounds, in detail.

TimeStampTZ

Minor Type: TIMESTAMPTZ

Defined in MinorTypes but never implemented.

Interval

Documentation: A day-time or year-month interval and Internally, INTERVAL is represented as INTERVALDAY or INTERVALYEAR. Also: The INTERVALYEAR and INTERVALDAY internal types represent a period of time. The INTERVALYEAR type specifies values from a year to a month. The INTERVALDAY type specifies values from a day to seconds.

Minor Type: INTERVAL
Vector class: IntervalVector
Width: 16 bytes (Int32 + Int32 + Int32 : Month + Days + Milliseconds)
Java Type: Joda Period

SQL Syntax: INTERVAL '1' YEAR, INTERVAL '2' DAY

IntervalYear

Minor Type: INTERVALYEAR
Vector class: IntervalYearVector
Width: 4 bytes

IntervalDay

Minor Type: INTERVALDAY
Vector class: IntervalDayVector
Width: 12 bytes

Comparison to Other DBs

From this source:

Type SQL MySQL SQL Server
DATE Stores year, month, and day values Format: YYYY-MM-DD. Supported range is from '1000-01-01' to '9999-12-31' From January 1, 0001 to December 31, 9999
TIME Stores hour, minute, and second values A time. Format: HH:MI:SS Time only
DATETIME A date and time combination. Format: YYYY-MM-DD HH:MI:SS From January 1, 1753 to December 31, 9999
DATETIMEOFFSET The same as DATETIME with the addition of a time zone offset
TIMESTAMP Stores year, month, day, hour, minute, and second values Stored as the number of seconds since the Unix epoch ('1970-01-01 00:00:00' UTC). Internal version.
INTERVAL Composed of a number of integer fields, representing a period of time, depending on the type of interval

Implied, Local and Absolute Times

The historical format for dates and times was a simple value: 1/2/1950 or 3 PM. These are "implied" times: values that can be understood only to a local (or implied) time zone. Knowing if "3 PM" is one time zone or another is specified outside of the date value itself: "3 PM my time" or "3 PM in New York."

Later, with increasing international communication, it became important to establish an absolute time, given by UTC: 2001-03-04T10:20:30Z is a date and time relative to Zulu (UTC). In Unix, all times are milliseconds since a specific point in UTC, 1970-01-01T00:00:00. This is called a timestamp and is identical across all Unix/Linux systems regardless of their physical location.

UTC can then be converted to a local time: a date and time with a numeric or named time-zone offset: 2017-03-15T16:47:30PDT.

Local and UTC times can be readily converted, often with the use of a time library that is aware of the politically-defined time zones, daylight savings times and so on.

On the other hand, implied dates and times cannot be converted: we don't know if "3 PM" is in one time zone or another. Further, we don't know if it was when daylight savings time was in effect or not. Similarly, "3/4/15" means a particular date wherever you happen to be: it is not tied to a time zone.

Clone this wiki locally