Re: huge discrepancy between EXPLAIN cost and actual time (but the table has just been ANALYZED)

Поиск
Список
Период
Сортировка
От Kirk Wolak
Тема Re: huge discrepancy between EXPLAIN cost and actual time (but the table has just been ANALYZED)
Дата
Msg-id CACLU5mT37ZE4S9AJYwHs5aZ6rmRPobNERpAgzuHX1u1otJWqLQ@mail.gmail.com
обсуждение исходный текст
Ответ на huge discrepancy between EXPLAIN cost and actual time (but the table has just been ANALYZED)  (Kent Tong <kent.tong.mo@gmail.com>)
Список pgsql-general
On Mon, May 8, 2023 at 8:29 AM Kent Tong <kent.tong.mo@gmail.com> wrote:
Hi,

I have a complex query involving over 15 joins and a CTE query and it takes over 17s to complete. The output of EXPLAIN ANALYZE includes (somewhere deep inside):

Index Scan using document_pkey on document document0_  (cost=0.29..8.31 rows=1 width=3027) (actual time=16243.959..16243.961 rows=1 loops=1)

This shows an index scan with a very small cost but a very large actual time. The strange thing is, all the tables have just been analyzed with the ANALYZE command (it is not a foreign table). Furthermore, if I run a simple query using that index, both the cost and the actual time are small.

Another snippet is:
                                                                                                                                                        -> CTE Scan on all_related_document p  (cost=1815513.32..3030511.77 rows=241785 width=16) (actual time=203.969..203.976 rows=0 loops=1) 

I think the cost-actual time discrepancy is fine as it is a recursive CTE so postgresql can't estimate the cost well. It is materialized and a full table scan is performed. However, the actual time is not that bad.  Also, the estimated rows and the actual rows are also vastly different, but I guess this is fine, isn't it? 

Any idea how I should check further?

Many thanks in advance

--
Kent Tong
IT author and consultant, child education coach

Kent,
  I had a really slow CTE based query a while back.  The problem went away when I materialized the query.
In our case, we had carried this query from another big DB, and used the CTE query to force it to be evaluated once.
It is known to be slow, and about 1% of the size of the other tables we were operating on.

  PG, by default, optimized this, and inlined it as a subquery.  Causing a massive slowdown.  Materialize fixed it right up.
That would be a quick test. 

  Outside of that, I agree with Tom... it's really hard to help without full details.

>>> FWIW, if I had my way, THIS would work:
EXPLAIN (ANALYZE, BUFFERS, SETTINGS, RELATIONS_DDL, SHOW_QUERY) <query>;

Would Dump the table structures of all involved tables/views/indexes.  It would repeat the query.
And in a perfect world, it would show you the "rewritten" query (now I am dreaming).
All with the plan...

But it will take a while to get the level of detail to come out...  But the AIs/ML will go crazy with it!

This way you just copy that output and share it.  (Eventually...)

Anyways, in the meantime, a query and the table structure/row counts would be nice.

Kirk...

В списке pgsql-general по дате отправления:

Предыдущее
От: Kirk Wolak
Дата:
Сообщение: Re: "PANIC: could not open critical system index 2662" - twice
Следующее
От: FOUTE K. Jaurès
Дата:
Сообщение: Re: Materialized Views - Way to refresh automatically (Incrementaly)