RE: BUG #15922: Simple select with multiple exists filters returnsduplicates from a primary key field

Поиск
Список
Период
Сортировка
От David Raymond
Тема RE: BUG #15922: Simple select with multiple exists filters returnsduplicates from a primary key field
Дата
Msg-id VI1PR07MB5792A233AE1440BF45BBAAF487C60@VI1PR07MB5792.eurprd07.prod.outlook.com
обсуждение исходный текст
Ответ на Re: BUG #15922: Simple select with multiple exists filters returns duplicates from a primary key field  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: BUG #15922: Simple select with multiple exists filters returns duplicates from a primary key field  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
The cost tweaks let me slash the size right down to where the dump is 150 kB, which I've attached. Don't know if the
mailinglist will strip it, but you're also directly in the To: list.
 

After the load here're the basic commands I ran to show the issue:


set search_path to bug_test;
select 'n' as table_name, count(*) as record_count from n union all select 'n2a', count(*) from n2a union all select
'n2h',count(*) from n2h;
 
select count(*), count(distinct id) from n where exists(select 1 from n2h where n_id = n.id) and exists(select 1 from
n2awhere n_id = n.id and a_id = '23ce1ae4-eb99-486a-924e-2fe74180a884');
 
analyze n, n2a, n2h;
select count(*), count(distinct id) from n where exists(select 1 from n2h where n_id = n.id) and exists(select 1 from
n2awhere n_id = n.id and a_id = '23ce1ae4-eb99-486a-924e-2fe74180a884');
 
set min_parallel_table_scan_size to 0;
set min_parallel_index_scan_size to 0;
set parallel_setup_cost to 1;
select count(*), count(distinct id) from n where exists(select 1 from n2h where n_id = n.id) and exists(select 1 from
n2awhere n_id = n.id and a_id = '23ce1ae4-eb99-486a-924e-2fe74180a884');
 
explain analyze verbose select count(*), count(distinct id) from n where exists(select 1 from n2h where n_id = n.id)
andexists(select 1 from n2a where n_id = n.id and a_id = '23ce1ae4-eb99-486a-924e-2fe74180a884');
 



mnr=> set search_path to bug_test;
SET
Time: 0.299 ms
mnr=> select 'n' as table_name, count(*) as record_count from n union all select 'n2a', count(*) from n2a union all
select'n2h', count(*) from n2h;
 
 table_name | record_count
------------+--------------
 n          |          413
 n2a        |        1,738
 n2h        |          109
(3 rows)

Time: 5.241 ms
mnr=> select count(*), count(distinct id) from n where exists(select 1 from n2h where n_id = n.id) and exists(select 1
fromn2a where n_id = n.id and a_id = '23ce1ae4-eb99-486a-924e-2fe74180a884');
 
 count | count
-------+-------
    10 |    10
(1 row)

Time: 2.651 ms
mnr=> analyze n, n2a, n2h;
ANALYZE
Time: 4.590 ms
mnr=> select count(*), count(distinct id) from n where exists(select 1 from n2h where n_id = n.id) and exists(select 1
fromn2a where n_id = n.id and a_id = '23ce1ae4-eb99-486a-924e-2fe74180a884');
 
 count | count
-------+-------
    10 |    10
(1 row)

Time: 1.909 ms
mnr=> set min_parallel_table_scan_size to 0;
SET
Time: 0.181 ms
mnr=> set min_parallel_index_scan_size to 0;
SET
Time: 0.131 ms
mnr=> set parallel_setup_cost to 1;
SET
Time: 0.195 ms
mnr=> select count(*), count(distinct id) from n where exists(select 1 from n2h where n_id = n.id) and exists(select 1
fromn2a where n_id = n.id and a_id = '23ce1ae4-eb99-486a-924e-2fe74180a884');
 
 count | count
-------+-------
    15 |    10
(1 row)

Time: 154.525 ms
mnr=> explain analyze verbose select count(*), count(distinct id) from n where exists(select 1 from n2h where n_id =
n.id)and exists(select 1 from n2a where n_id = n.id and a_id = '23ce1ae4-eb99-486a-924e-2fe74180a884');
 
                                                                     QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=24.09..24.10 rows=1 width=16) (actual time=186.389..186.389 rows=1 loops=1)
   Output: count(*), count(DISTINCT n.id)
   ->  Gather  (cost=8.30..24.06 rows=6 width=16) (actual time=2.254..196.932 rows=15 loops=1)
         Output: n.id
         Workers Planned: 2
         Workers Launched: 2
         ->  Nested Loop  (cost=7.30..22.46 rows=2 width=16) (actual time=0.277..0.310 rows=5 loops=3)
               Output: n.id
               Inner Unique: true
               Worker 0: actual time=0.180..0.180 rows=0 loops=1
               Worker 1: actual time=0.412..0.412 rows=0 loops=1
               ->  Parallel Hash Join  (cost=7.03..20.33 rows=3 width=32) (actual time=0.268..0.284 rows=5 loops=3)
                     Output: n2a.n_id, n2h.n_id
                     Hash Cond: (n2a.n_id = n2h.n_id)
                     Worker 0: actual time=0.179..0.179 rows=0 loops=1
                     Worker 1: actual time=0.412..0.413 rows=0 loops=1
                     ->  Parallel Bitmap Heap Scan on bug_test.n2a  (cost=4.59..17.80 rows=17 width=16) (actual
time=0.037..0.061rows=40 loops=1)
 
                           Output: n2a.n_id, n2a.a_id
                           Recheck Cond: (n2a.a_id = '23ce1ae4-eb99-486a-924e-2fe74180a884'::uuid)
                           Heap Blocks: exact=13
                           ->  Bitmap Index Scan on n2a_a_id_n_id_idx  (cost=0.00..4.58 rows=40 width=0) (actual
time=0.025..0.025rows=40 loops=1)
 
                                 Index Cond: (n2a.a_id = '23ce1ae4-eb99-486a-924e-2fe74180a884'::uuid)
                     ->  Parallel Hash  (cost=1.64..1.64 rows=64 width=16) (actual time=0.069..0.069 rows=36 loops=3)
                           Output: n2h.n_id
                           Buckets: 1024  Batches: 1  Memory Usage: 40kB
                           Worker 0: actual time=0.052..0.053 rows=0 loops=1
                           Worker 1: actual time=0.092..0.092 rows=0 loops=1
                           ->  Parallel Seq Scan on bug_test.n2h  (cost=0.00..1.64 rows=64 width=16) (actual
time=0.014..0.030rows=109 loops=1)
 
                                 Output: n2h.n_id
               ->  Index Only Scan using n_pkey on bug_test.n  (cost=0.27..0.71 rows=1 width=16) (actual
time=0.004..0.004rows=1 loops=15)
 
                     Output: n.id
                     Index Cond: (n.id = n2h.n_id)
                     Heap Fetches: 15
 Planning Time: 1.201 ms
 Execution Time: 197.087 ms
(35 rows)

Time: 199.122 ms
mnr=>



David Raymond | Associate Quality Analyst - MPU Addressing | TomTom | Lebanon, NH, United States
e-mail: david.raymond@tomtom.com  | office +1 603 306 8498 | www.tomtom.com






-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us> 
Sent: Tuesday, July 23, 2019 5:50 PM
To: David Raymond <David.Raymond@tomtom.com>
Cc: pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #15922: Simple select with multiple exists filters returns duplicates from a primary key field

David Raymond <David.Raymond@tomtom.com> writes:
> Update so far: I did manage to go and replace all the UUIDs with random ones and it's still doing it, so I do have a
sanitizedversion now. No real luck with trimming down the record count though. When deleting too many records it would
changethe query plan to something not broken. Even after replacing the UUIDs and not deleting anything I ran analyze
andit came up clean, and I had to vacuum analyze for it to pick the broken plan again. (That example pasted below) The
dumpfile is at least consistently doing the same thing where immediately after load the plan chosen gives a consistent
answer,but once analyzed it gives the bad duplicates. As it stands the dump file is 130 MB (30MB zipped), is that too
bigto send in to you?
 

Given that the problem seems to be specific to parallel query, likely
the reason is that reducing the number of rows brings it below the
threshold where the planner wants to use parallel query.  So you could
probably reduce the parallel-query cost parameters to get a failure
with a smaller test case.  However, if you don't feel like doing that,
that's fine.

Please *don't* send a 30MB message to the whole list, but you can
send it to me privately.

            regards, tom lane



Вложения

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Manuel Rigger
Дата:
Сообщение: ADD CHECK fails for parent table if column used in CHECK is fully-qualified
Следующее
От: Christopher Browne
Дата:
Сообщение: Re: Request for resolution || Support