[Wranglers] Redshift's generic_query_hash and Snowflake's query_parameterized_hash

Ali, Saqib docbook.xml at gmail.com
Sun Jan 26 00:23:07 UTC 2025


Both Snowflake and Redshift assign unique identifiers to SQL queries that
can be used to track and monitor the performance of queries over time.
Redshift calls it generic_query_hash whereas Snowflake calls it
query_parameterized_hash, but the idea is the same. The hashes are
generated based on the query's textual representation, excluding any
literal values. The hash is usually created by normalizing the query
structure and removing specific parameter values.

For e.g., the following two queries will produce the same
query_parameterized_hash and generic_query_hash

select * from customers where full_name = 'Saqib';
select * from customers where full_name = 'Ali';


This allows for effective trend analysis and comparison of query
performance across different periods. Here are two articles on how to
use query_parameterized_hash or generic_query_hash to identify queries that
are getting worse over time:
https://qosf.com/whats-eating-up-your-snowflake-warehouse-part-2.html
https://qosf.com/query_hash-and-query_parameterized_hash-in-snowflake.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://analyticsengineering.net/pipermail/wranglers/attachments/20250125/fcd461a9/attachment.htm>


More information about the Wranglers mailing list