[Wranglers] Turbo-charge Snowflake Query Performance by Avoiding Spilling to Storage | by John Ryan | Snowflake | Feb, 2024 | Medium

Ali, Saqib docbook.xml at gmail.com
Sun Mar 17 13:58:41 UTC 2024


A nice blog post by John Ryan on how to turbo-charge Query Performance in
Snowflake by avoiding spilling to storage:
https://medium.com/snowflake/turbo-charge-snowflake-query-performance-by-avoiding-spilling-to-storage-d693def762c2

However this strategy must be used with care, lest you consume a lot of
Snowflake Credits with nominal query performance improvement. One case I
recently ran into was JOINing to a history table which did not have a
Cluster Key defined on the as_of_date field that was being used for the
JOIN. There was spillage to local storage. Moving to a large WH avoided
Spillage to Local Storage, but the query performance did not improve that
much. After analyzing the query plan, we noticed the JOIN to the history
table was taking 60% of the execution time.  We added a Cluster Key on the
field that was used for the JOIN, and the query time was cut in less than
half. So just avoiding the Storage Spillage is not enough, you have to
analyze the query plan to ensure that the JOINs are not causing the
slowness.

Note that Automatic Clustering was enabled on the history table. But this
doesn't alway work. If you know the column that is used in the JOINs,
define that as the Cluster Key on the table.

Saqib
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://analyticsengineering.net/pipermail/wranglers/attachments/20240317/aa5de9b2/attachment.htm>


More information about the Wranglers mailing list