TIMESTAMP Function Returns Different Results When Sorted By Different Direction

Recently, I was working on an issue for customer who faced a weird behaviour that TIMESTAMP conversion function returned different results if different sorting direction was used, meaning, results are different if using ASC vs DESC.

To illustrate the issue, please see my test case below:

  1. Create a table with testing data:
CREATE OR REPLACE TABLE TS_CONVERT_ORDER AS
SELECT 1 AS ID, '0' AS MY_DATE
UNION
SELECT 2, '2021-02-15';

2. Run below query using Ascending order, the timestamp conversion will fail for ‘2021-02-15’:

SELECT 
  TRY_TO_TIMESTAMP_LTZ(MY_DATE), * 
FROM TS_CONVERT_ORDER 
ORDER BY 1 ASC;

+-------------------------------+----+------------+
| TRY_TO_TIMESTAMP_TZ(MY_DATE)  | ID | MY_DATE    |
|-------------------------------+----+------------|
| 1970-01-01 00:00:00.000 +0000 |  1 | 0          |
| NULL                          |  2 | 2021-02-15 |
+-------------------------------+----+------------+

3. If sort by Descending order, result returned successfully for ‘2021-02-15’, but will be NULL for my_date column with “0” value:

SELECT 
  TRY_TO_TIMESTAMP_TZ(MY_DATE), * 
FROM TS_CONVERT_ORDER 
ORDER BY 1 DESC;

+-------------------------------+----+------------+
| TRY_TO_TIMESTAMP_TZ(MY_DATE)  | ID | MY_DATE    |
|-------------------------------+----+------------|
| 2021-02-15 00:00:00.000 +0000 |  2 | 2021-02-15 |
| NULL                          |  1 | 0          |
+-------------------------------+----+------------+

It puzzled me for a while, but after researching this Snowflake behaviour, it turned out that it was all expected and documented.

This behaviour is controlled by a setting called TIMESTAMP_INPUT_FORMAT. By default, the value is “AUTO“, and based on documentation, its behaviour can change depending on the data it encountered during processing. Please refer to documentation for TRY_TO_TIMESTAMP. Particularly below section:

If more than one row is evaluated (for example, if the input is the column name of a table that contains more than two rows), the first processed value determines whether all subsequent values are treated as seconds, milliseconds, microseconds, or nanoseconds.

So in our example, when we sorted by ASC order, the first value for my_date column was 0, so Snowflake would consider the values in this column was Integer value for seconds. And when the second row came with ‘2021-02-15’, it was considered as invalid and NULL was returned.

In the case of using DESC order, the first value was ‘2021-02-15’, so Snowflake would expect the following rows to be in the Date format, and when second row returned “0”, it was considered invalid and NULL value was returned.

If we change our test case as below, when values for MY_DATE column are consistent, it would have worked:

CREATE OR REPLACE TABLE TS_CONVERT_ORDER AS
SELECT 1 AS ID, '2021-02-14' AS MY_DATE
UNION
SELECT 2, '2021-02-15';

SELECT 
  TRY_TO_TIMESTAMP_TZ(MY_DATE), * 
FROM TS_CONVERT_ORDER
ORDER BY 1 ASC;

+-------------------------------+----+------------+
| TRY_TO_TIMESTAMP_TZ(MY_DATE)  | ID | MY_DATE    |
|-------------------------------+----+------------|
| 2021-02-14 00:00:00.000 +0000 |  1 | 2021-02-14 |
| 2021-02-15 00:00:00.000 +0000 |  2 | 2021-02-15 |
+-------------------------------+----+------------+

SELECT 
  TRY_TO_TIMESTAMP_TZ(MY_DATE), * 
FROM TS_CONVERT_ORDER 
ORDER BY 1 DESC;

+-------------------------------+----+------------+
| TRY_TO_TIMESTAMP_TZ(MY_DATE)  | ID | MY_DATE    |
|-------------------------------+----+------------|
| 2021-02-15 00:00:00.000 +0000 |  2 | 2021-02-15 |
| 2021-02-14 00:00:00.000 +0000 |  1 | 2021-02-14 |
+-------------------------------+----+------------+

In the case that the date/time value passed into TIMESTAMP function is not consistent, I recommend to set TIMESTAMP_INPUT_FORMAT to the expected format, to force Snowflake to expect correct values. In our case, the expected format should be “YYYY-MM-DD”.

See below outputs:

ALTER SESSION SET TIMESTAMP_INPUT_FORMAT = 'YYYY-MM-DD';

SELECT 
  TRY_TO_TIMESTAMP_TZ(MY_DATE), * 
FROM TS_CONVERT_ORDER 
ORDER BY 1 ASC;

+-------------------------------+----+------------+
| TRY_TO_TIMESTAMP_TZ(MY_DATE)  | ID | MY_DATE    |
|-------------------------------+----+------------|
| 2021-02-15 00:00:00.000 +0000 |  2 | 2021-02-15 |
| NULL                          |  1 | 0          |
+-------------------------------+----+------------+

SELECT 
  TRY_TO_TIMESTAMP_TZ(MY_DATE), * 
FROM TS_CONVERT_ORDER 
ORDER BY 1 DESC;

+-------------------------------+----+------------+
| TRY_TO_TIMESTAMP_TZ(MY_DATE)  | ID | MY_DATE    |
|-------------------------------+----+------------|
| NULL                          |  1 | 0          |
| 2021-02-15 00:00:00.000 +0000 |  2 | 2021-02-15 |
+-------------------------------+----+------------+

I am sure that lots of other Snowflake user would have faced the same issue, and I hope that this article can be useful.

Loading

Leave a Reply

Your email address will not be published.