Re: Proposal: add a debug message about using geqo

Поиск
Список
Период
Сортировка
От David G. Johnston
Тема Re: Proposal: add a debug message about using geqo
Дата
Msg-id CAKFQuwafSDvN-U3uZwLS8tMCBHoXyWTRqtevWO8637e3kihTFA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Proposal: add a debug message about using geqo  (Jacob Champion <jchampion@timescale.com>)
Список pgsql-hackers
On Fri, Jul 22, 2022 at 1:20 PM Jacob Champion <jchampion@timescale.com> wrote:
On Wed, Jun 1, 2022 at 11:09 PM KAWAMOTO Masaya <kawamoto@sraoss.co.jp> wrote:
> That sounds a nice idea. But I don't think that postgres shows in the
> EXPLAIN output why the plan is selected. Would it be appropriate to
> show that GEQO is used in EXPLAIN output?

I'm reminded of Greenplum's "Optimizer" line in its EXPLAIN output
[1], so from that perspective I think it's intuitive.

> As a test, I created a patch that add information about GEQO to
> EXPLAIN output by the GEQO option. The output example is as follows.
> What do you think about the location and content of information about GEQO? 

I am a little surprised to see GeqoDetails being printed for a plan
that didn't use GEQO, but again that's probably because I'm used to
GPDB's Optimizer output. And I don't have a lot of personal experience
using alternative optimizers.

I agree this should be part of explain output.

I would not print the current value of geqo_threshold and leave setting display the exclusive purview of the settings option.

The presentation of only a single geqo result seems incorrect given that multiple trees can exist.  In the first example below the full outer join causes 3 relations to be seen as a single relation at the top level (hence max join nodes = 4) while in the inner join case we see all 6 join nodes.  There should be two outputs of GEQO in the first explain, one with join nodes of 3 and the existing one with 4.

I also don't see the point of labelling them "max"; "join nodes" seems sufficient.

While it can probably be figured out from the rest of the plan, listing the names of the join nodes may be useful (and give join nodes some company).

David J.

postgres=# explain (verbose, geqo) with gs2 (v2) as materialized ( select * from generate_series(1,1) ) select * from gs2 as gs4 full outer join (select gs2a.v2 from gs2 as gs2a, gs2 as gs2b) as gs5 using (v2), generate_series(1, 1) as gs (v1) cross join gs2 as gs3 where v1 IN (select v2 from gs2);
                                             QUERY PLAN                                            
----------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.07..0.21 rows=1 width=12)
   Output: COALESCE(gs4.v2, gs2a.v2), gs.v1, gs3.v2
   CTE gs2
     ->  Function Scan on pg_catalog.generate_series  (cost=0.00..0.01 rows=1 width=4)
           Output: generate_series.generate_series
           Function Call: generate_series(1, 1)
   ->  Nested Loop  (cost=0.06..0.16 rows=1 width=12)
         Output: gs.v1, gs4.v2, gs2a.v2
         ->  Nested Loop  (cost=0.02..0.06 rows=1 width=4)
               Output: gs.v1
               Join Filter: (gs.v1 = gs2.v2)
               ->  Function Scan on pg_catalog.generate_series gs  (cost=0.00..0.01 rows=1 width=4)
                     Output: gs.v1
                     Function Call: generate_series(1, 1)
               ->  HashAggregate  (cost=0.02..0.03 rows=1 width=4)
                     Output: gs2.v2
                     Group Key: gs2.v2
                     ->  CTE Scan on gs2  (cost=0.00..0.02 rows=1 width=4)
                           Output: gs2.v2
         ->  Hash Full Join  (cost=0.03..0.10 rows=1 width=8)
               Output: gs4.v2, gs2a.v2
               Hash Cond: (gs2a.v2 = gs4.v2)
               ->  Nested Loop  (cost=0.00..0.05 rows=1 width=4)
                     Output: gs2a.v2
                     ->  CTE Scan on gs2 gs2b  (cost=0.00..0.02 rows=1 width=0)
                           Output: gs2b.v2
                     ->  CTE Scan on gs2 gs2a  (cost=0.00..0.02 rows=1 width=4)
                           Output: gs2a.v2
               ->  Hash  (cost=0.02..0.02 rows=1 width=4)
                     Output: gs4.v2
                     ->  CTE Scan on gs2 gs4  (cost=0.00..0.02 rows=1 width=4)
                           Output: gs4.v2
   ->  CTE Scan on gs2 gs3  (cost=0.00..0.02 rows=1 width=4)
         Output: gs3.v2
 GeqoDetails: GEQO: used, geqo_threshold: 2, Max join nodes: 4
(35 rows)

postgres=# explain (verbose, geqo) with gs2 (v2) as materialized ( select * from generate_series(1,1) ) select * from gs2 as gs4 join (select gs2a.v2 from gs2 as gs2a, gs2 as gs2b) as gs5 using (v2), generate_series(1, 1) as gs (v1) cross join gs2 as gs3 where v1 IN (select v2 from gs2);
                                                QUERY PLAN                                                
----------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.02..0.18 rows=1 width=12)
   Output: gs4.v2, gs.v1, gs3.v2
   CTE gs2
     ->  Function Scan on pg_catalog.generate_series  (cost=0.00..0.01 rows=1 width=4)
           Output: generate_series.generate_series
           Function Call: generate_series(1, 1)
   ->  Nested Loop  (cost=0.00..0.14 rows=1 width=12)
         Output: gs.v1, gs4.v2, gs3.v2
         ->  Nested Loop  (cost=0.00..0.11 rows=1 width=8)
               Output: gs.v1, gs4.v2
               ->  Nested Loop Semi Join  (cost=0.00..0.04 rows=1 width=4)
                     Output: gs.v1
                     Join Filter: (gs.v1 = gs2.v2)
                     ->  Function Scan on pg_catalog.generate_series gs  (cost=0.00..0.01 rows=1 width=4)
                           Output: gs.v1
                           Function Call: generate_series(1, 1)
                     ->  CTE Scan on gs2  (cost=0.00..0.02 rows=1 width=4)
                           Output: gs2.v2
               ->  Nested Loop  (cost=0.00..0.05 rows=1 width=4)
                     Output: gs4.v2
                     Join Filter: (gs2a.v2 = gs4.v2)
                     ->  CTE Scan on gs2 gs2a  (cost=0.00..0.02 rows=1 width=4)
                           Output: gs2a.v2
                     ->  CTE Scan on gs2 gs4  (cost=0.00..0.02 rows=1 width=4)
                           Output: gs4.v2
         ->  CTE Scan on gs2 gs3  (cost=0.00..0.02 rows=1 width=4)
               Output: gs3.v2
   ->  CTE Scan on gs2 gs2b  (cost=0.00..0.02 rows=1 width=0)
         Output: gs2b.v2
 GeqoDetails: GEQO: used, geqo_threshold: 2, Max join nodes: 6
(30 rows)


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: fairywren hung in pg_basebackup tests