Re: BUG #17084: Wrong results of distinct node.

Поиск
Список
Период
Сортировка
От Pantelis Theodosiou
Тема Re: BUG #17084: Wrong results of distinct node.
Дата
Msg-id CAE3TBxwq8qxLRXZhjaV-WeZ=xmi9JEsEV0h8EUH2YLDzg_UYyA@mail.gmail.com
обсуждение исходный текст
Ответ на BUG #17084: Wrong results of distinct node.  (PG Bug reporting form <noreply@postgresql.org>)
Список pgsql-bugs


On Tue, Jul 6, 2021 at 4:09 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      17084
Logged by:          Maxim Boguk
Email address:      maxim.boguk@gmail.com
PostgreSQL version: 13.3
Operating system:   Linux
Description:       

During investigation of weird app behaviour I found very strange results
with sorting/dising of some dataset.
Issue was reproduced on 12.7 server and 13.3 my local laptop.
Dataset table dump (25MB file) available per request.

QUERY:
WITH t1 AS (
         SELECT
            ad_id,
            advertiser_id,
            campaign_id,
            campaign_type_id,
            ad_group_id,
            target_id,
            device_type_id,
            country_code,
            block_format_type_id,
            category_group_id,
            array_agg(geo_id) AS geo_ids
          FROM test
          GROUP BY
            ad_id,
            advertiser_id,
            campaign_id,
            campaign_type_id,
            ad_group_id,
            target_id,
            device_type_id,
            country_code,
            block_format_type_id,
            category_group_id
),
t2 AS (
SELECT
DISTINCT
  ad_group_id,
  country_code,
  device_type_id,
  target_id,
  block_format_type_id,
  category_group_id,
  geo_ids
FROM  t1
)

SELECT COUNT(*) FROM t2;


Test:
set jit to off;
set max_parallel_workers_per_gather to 0;
set work_mem to 4MB;
 count
-------
 83921

set work_mem to '64MB';
 count
-------
 55634

...


Correnct answers provided with larger work_mem (55634)

What make situatio even more curious that disable hash_agg doesn't make
issue gone:
set enable_hashagg to 0;
set work_mem to '4MB';
...
 count
-------
 83700

...

Do you get different results if you provide an ORDER BY in the aggregate array_agg()  function?
Without an ORDER BY there, I don't think there is any guarantee for deterministic results.

Best regards,

Pantelis Theodosiou

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: BUG #17083: [PATCH] PostgreSQL fails to build with OpenLDAP 2.5.x
Следующее
От: Tom Lane
Дата:
Сообщение: Re: BUG #17084: Wrong results of distinct node.