Обсуждение: Query Discrepancy in Postgres HLL Test
Hi PostgreSQL Community,
I'm currently delving into Postgres HLL (HyperLogLog) functionality and have encountered an unexpected behavior while executing queries from the "cumulative_add_sparse_edge.sql" regress test. This particular test data file involves three columns, with the last column representing an HLL (HyperLogLog) value derived from the previous HLL value and the current raw value.
I'm currently delving into Postgres HLL (HyperLogLog) functionality and have encountered an unexpected behavior while executing queries from the "cumulative_add_sparse_edge.sql" regress test. This particular test data file involves three columns, with the last column representing an HLL (HyperLogLog) value derived from the previous HLL value and the current raw value.
Upon manual inspection of the query responsible for deriving the last row's HLL value, I noticed a discrepancy. When executing the query:
"""
-- '\x148B481002....' is second last rows hll value
SELECT hll_add('\x148B481002.....', hll_hashval(2561));
"""
instead of obtaining the expected value (''\x148B481002....''), I received a different output which is ('\x138b48000200410061008100a1 ........').
I am using
postgres=> select version();
version
-------------------------------------------------------------------------------------------------------------
PostgreSQL 16.1 on aarch64-unknown-linux-gnu, compiled by aarch64-unknown-linux-gnu-gcc (GCC) 9.5.0, 64-bit
version
-------------------------------------------------------------------------------------------------------------
PostgreSQL 16.1 on aarch64-unknown-linux-gnu, compiled by aarch64-unknown-linux-gnu-gcc (GCC) 9.5.0, 64-bit
My initial assumption is that this could potentially be attributed to a precision error. However, I'm reaching out to seek clarity on why this disparity is occurring and to explore potential strategies for mitigating it (as I want the behaviour to be consistent to regress test file).
Regards
Ayush Vatsa
On Wed, May 1, 2024 at 1:10 PM Ayush Vatsa <ayushvatsa1810@gmail.com> wrote: > I'm currently delving into Postgres HLL (HyperLogLog) functionality and have encountered an unexpected behavior while executingqueries from the "cumulative_add_sparse_edge.sql" regress test. This particular test data file involves three columns,with the last column representing an HLL (HyperLogLog) value derived from the previous HLL value and the currentraw value. > > Upon manual inspection of the query responsible for deriving the last row's HLL value, I noticed a discrepancy. When executingthe query: > """ > -- '\x148B481002....' is second last rows hll value > SELECT hll_add('\x148B481002.....', hll_hashval(2561)); > """ > instead of obtaining the expected value (''\x148B481002....''), I received a different output which is ('\x138b48000200410061008100a1........'). PostgreSQL has no function called hll_add or hll_hashval, and no regression test file called cumulative_add_sparse_edge.sql. A quick Google search suggests that these things are part of citusdata's fork of PostgreSQL, so you might want to contact them. -- Robert Haas EDB: http://www.enterprisedb.com