During investigation of weird app behaviour I found very strange results with sorting/dising of some dataset. Issue was reproduced on 12.7 server and 13.3 my local laptop. Dataset table dump (25MB file) available per request.
QUERY: WITH t1 AS ( SELECT ad_id, advertiser_id, campaign_id, campaign_type_id, ad_group_id, target_id, device_type_id, country_code, block_format_type_id, category_group_id, array_agg(geo_id) AS geo_ids FROM test GROUP BY ad_id, advertiser_id, campaign_id, campaign_type_id, ad_group_id, target_id, device_type_id, country_code, block_format_type_id, category_group_id ), t2 AS ( SELECT DISTINCT ad_group_id, country_code, device_type_id, target_id, block_format_type_id, category_group_id, geo_ids FROM t1 )
SELECT COUNT(*) FROM t2;
Test: set jit to off; set max_parallel_workers_per_gather to 0; set work_mem to 4MB; count ------- 83921
set work_mem to '64MB'; count ------- 55634
...
Correnct answers provided with larger work_mem (55634)
What make situatio even more curious that disable hash_agg doesn't make issue gone: set enable_hashagg to 0; set work_mem to '4MB'; ... count ------- 83700
...
Do you get different results if you provide an ORDER BY in the aggregate array_agg() function?
Without an ORDER BY there, I don't think there is any guarantee for deterministic results.