NOT IN subquery optimization

Поиск
Список
Период
Сортировка
От Jim Finnerty
Тема NOT IN subquery optimization
Дата
Msg-id 1550706289606-0.post@n3.nabble.com
обсуждение исходный текст
Ответы Re: NOT IN subquery optimization  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
The semantics of NOT IN (SELECT ...) are subtly different from the semantics
of NOT EXISTS (SELECT ...).  These differences center on how NULLs are
treated, and in general can result in statements that are harder to optimize
and slower to execute than the apparently similar NOT EXISTS statement.

A little over a year ago, Christian Antognini authored the blog "/How Well a
Query Optimizer Handles Subqueries?/" summarizing his findings about the
performance of PostgreSQL, MySQL, and Oracle on various subqueries:

   
https://antognini.ch/2017/12/how-well-a-query-optimizer-handles-subqueries/

His position was that you can classify the optimizations as correct or
incorrect, and based on that he provided the following comparison summary
(see below).  In short, PostgreSQL was the worst of the three systems:

    "Summary

        The number of queries that the query optimizers handle correctly are
the following:

        Oracle Database 12.2: 72 out of 80
        MySQL 8.0.3: 67 out of 80
        PostgreSQL 10.0: 60 out of 80

    Since not all queries are handled correctly, for best performance it is
sometimes necessary to rewrite them."

The subqueries that were found to be optimized "incorrectly" were almost
entirely due to poor or absent NOT IN subquery optimization.

The PostgreSQL community has been aware of the deficiencies in NOT IN
optimization for quite some time.  Based on an analysis of
psgsql-performance posts between 2013 and 2015, Robert Haas identified NOT
IN optimization as one of the common root causes of performance problems.

We have been working on improved optimization of NOT IN, and we would like
to share this optimizaton with the community.  With respect to the test
cases mentioned in the blog post mentioned above, it will elevate PostgreSQL
from "worst" to "first".  Generally the performance gains are large when the
optimization applies, though we have found one test case where performance
is worse.  We are investigating this now to see if we can disable the
optimization in that case.

We would like to include a patch for this change in the current commitfest. 
This thread can be used to track comments about this optimization.




-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: WAL insert delay settings
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: WAL insert delay settings