auto_explain vs. parallel query

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема auto_explain vs. parallel query
Дата
Msg-id 3f62f24e-51b3-175c-9222-95f25fd2a9d6@2ndquadrant.com
обсуждение исходный текст
Ответы Re: auto_explain vs. parallel query
Список pgsql-hackers
Hi,

While debugging something on 9.6, I've noticed that auto_explain handles 
parallel queries in a slightly strange way - both the leader and all the 
workers log their chunk of the query (i.e. the leader logs explain for 
the whole query, while workers only log the parts they've been executing).

So for example for a query with 2 workers, the log look like this:
   2016-10-31 23:10:23.481 CET [12278] LOG:  duration: 32.562 ms  pl ...       Query Text: <parallel query>
                    ...       Parallel Seq Scan on tables  (cost=0.00..15158.25 rows=220 wi ...         Filter:
((table_datoid< '1000000000'::oid) AND (table_nspo ...         Rows Removed by Filter: 140614
  ...   2016-10-31 23:10:23.481 CET [12277] LOG:  duration: 32.325 ms  pl ...       Query Text: <parallel query>
                         ...       Parallel Seq Scan on tables  (cost=0.00..15158.25 rows=220 wi ...         Filter:
((table_datoid< '1000000000'::oid) AND (table_nspo ...         Rows Removed by Filter: 80948
  ...   2016-10-31 23:10:23.483 CET [12259] LOG:  duration: 38.997 ms  pl ...       Query Text: explain analyze select
*from tables where table_ ...       Gather  (cost=1000.00..16211.15 rows=529 width=356) (actual t ...         Workers
Planned:2                                          ...         Workers Launched: 2
  ...         ->  Parallel Seq Scan on tables  (cost=0.00..15158.25 rows= ...               Filter: ((table_datoid <
'1000000000'::oid)AND (tabl ...               Rows Removed by Filter: 105570                        ...
 

I'd say that's fairly surprising, and I haven't found any discussion 
about auto_explain vs. parallel queries in the archives (within the last 
year), so I assume this may not be entirely intentional.

Not only this does not match the output when running EXPLAIN ANALYZE 
manually, it also provides no information about which messages from 
workers belong to which "leader" message.

Another thing is that this interacts with sampling in a rather 
unfortunate way, because each process evaluates the sampling condition 
independently. So for example with auto_explain.sample_rate=0.5 a random 
subset of worker/leader explain plans will be logged.

But this also applies to the conditions in ExecutorStart, which enables 
instrumentation only when the query gets sampled - so when the leader 
gets sampled but not all the workers, the counters in the query logged 
by the leader are incomplete.

I do believe those are bugs that make auto_explain rather unusable with 
parallel queries. Adding IsParallelWorker() to the condition in 
ExecutorEnd should fix the extra logging messages (and log only from the 
leader).

Not sure how to fix the sampling - simply adding IsParallelWorker() to 
ExecutorStart won't do the trick, because we don't want to enable 
instrumentation only for sample queries. So I guess this needs to be 
decided in the leader, and communicated to the workers somehow.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: WAL consistency check facility
Следующее
От: Rahila Syed
Дата:
Сообщение: Re: Improvements in psql hooks for variables