Обсуждение: [GENERAL] Very suspicious plan difference for select and corresponding deletequeries PostgreSQL 9.6.2

Поиск
Список
Период
Сортировка
Hi,

I found a case with very curious plan difference between:

explain select * from jobs_to_delete WHERE EXISTS(SELECT 1 FROM job_stats_new_201411  jsm WHERE jsm.job_reference = jobs_to_delete.job_reference);
                                                                           QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=239386.25..376872.49 rows=111372 width=41)
   Hash Cond: ((jobs_to_delete.job_reference)::text = (jsm.job_reference)::text)
   ->  Seq Scan on jobs_to_delete  (cost=0.00..101547.10 rows=9286780 width=41)
   ->  Hash  (cost=237994.10..237994.10 rows=111372 width=18)
         ->  HashAggregate  (cost=236880.38..237994.10 rows=111372 width=18)
               Group Key: (jsm.job_reference)::text
               ->  Index Only Scan using job_stats_master_201411_job_reference_idx_ebs on job_stats_new_201411 jsm  (cost=0.56..214784.97 rows=8838161 width=18)

and corresponding delete (which I suspect should have the same plan)

explain delete from jobs_to_delete WHERE EXISTS(SELECT 1 FROM job_stats_new_201411  jsm WHERE jsm.job_reference = jobs_to_delete.job_reference);
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Delete on jobs_to_delete  (cost=266351.88..403838.13 rows=111372 width=12)
   ->  Hash Join  (cost=266351.88..403838.13 rows=111372 width=12)
         Hash Cond: ((jobs_to_delete.job_reference)::text = (jsm.job_reference)::text)
         ->  Seq Scan on jobs_to_delete  (cost=0.00..101547.10 rows=9286780 width=43)
         ->  Hash  (cost=264959.73..264959.73 rows=111372 width=24)
               ->  HashAggregate  (cost=263846.01..264959.73 rows=111372 width=24)
                     Group Key: (jsm.job_reference)::text
                     ->  Seq Scan on job_stats_new_201411 jsm  (cost=0.00..241750.61 rows=8838161 width=24)

Manual analyze of the both tables didn't change a result.
 
I can not see any possible/logical/realistic reason for the database to switch between INDEX ONLY SCAN and SEQ SCAN for EXIST part of query in this two cases.

I not sure that it's a but, so I better post in -general first.

--
Maxim Boguk
Senior Postgresql DBA
http://dataegret.com/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."


Maxim Boguk <maxim.boguk@gmail.com> writes:
> I can not see any possible/logical/realistic reason for the database to
> switch between INDEX ONLY SCAN and SEQ SCAN for EXIST part of query in this
> two cases.

If you'd done an EXPLAIN VERBOSE, you'd have noted that the scan was
returning ctid, which it could not get out of the index.  This is
necessary for possible EPQ rechecks.

            regards, tom lane