How to deal with Snowflake string length reporting #395

jonashaag · 2023-09-20T16:34:29Z

Snowflake's ODBC driver reports all CHAR columns as having length 16 MiB (the maximum length of a string in Snowflake). Not sure if this is a problem with the ODBC driver or with Turbodbc.

Currently you can't really read batches of a reasonable number of rows because Turbodbc thinks all rows require a n_string_cols * 16 MiB buffer.

The text was updated successfully, but these errors were encountered:

jonashaag · 2023-10-09T16:34:18Z

arrow-odbc-py solution: pacman82/arrow-odbc-py#56

pacman82 · 2023-10-09T16:47:20Z

I fear it is a problem with the ODBC standard 😅

ODBC 4.0 is going to fix this since it allows to define exception behavior for truncation then fetching data. Yet this has been a making in the while and is not rolled out. Currently this has to be fixed on the application level. Use domain knowledge to specify more sensible upper bounds. If you do not know them run queries to find them.

I could imagine supporting running an query twice to solve this in arrow-odbc. Once to figure out the maximum length of the values per column and once more to actually produce the result. Yet I always feel applications which know the domains they are operating in could come up with cleaner solutions, or at least use persistence to remember the previous maximum columns.

Only alternative in the current ODBC standard is not using bulk fetches, yet this is the very thing, which makes arrow-odbc or turbodbc fast.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to deal with Snowflake string length reporting #395

How to deal with Snowflake string length reporting #395

jonashaag commented Sep 20, 2023

jonashaag commented Oct 9, 2023

pacman82 commented Oct 9, 2023

How to deal with Snowflake string length reporting #395

How to deal with Snowflake string length reporting #395

Comments

jonashaag commented Sep 20, 2023

jonashaag commented Oct 9, 2023

pacman82 commented Oct 9, 2023