Re: pb with join plan

Поиск

Список

Период

Сортировка

От	Kirk Wolak
Тема	Re: pb with join plan
Дата	27 июня 2023 г. 21:11:57
Msg-id	CACLU5mSRW1K+4Z_rfOkzaas5iTuXNvLELsTRXFfac7ut02wTRQ@mail.gmail.com обсуждение исходный текст
Ответ на	Re: pb with join plan (Marc Millas <marc.millas@mokadb.com>)
Ответы	Re: pb with join plan
Список	pgsql-general

Дерево обсуждения

On Wed, Jun 21, 2023 at 12:10 PM Marc Millas <marc.millas@mokadb.com> wrote:

Marc MILLAS
On Wed, Jun 21, 2023 at 12:43 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 6/21/23 00:26, Marc Millas wrote:
>
>
> On Tue, Jun 20, 2023 at 11:19 PM David Rowley <dgrowleyml@gmail.com
> <mailto:dgrowleyml@gmail.com>> wrote:
>
> On Wed, 21 Jun 2023 at 08:34, Marc Millas <marc.millas@mokadb.com
> <mailto:marc.millas@mokadb.com>> wrote:
> >
> > On Tue, Jun 20, 2023 at 10:14 PM David Rowley
> <dgrowleyml@gmail.com <mailto:dgrowleyml@gmail.com>> wrote:
> >>
> >> On Wed, 21 Jun 2023 at 07:42, Marc Millas <marc.millas@mokadb.com
> <mailto:marc.millas@mokadb.com>> wrote:
> >> > But if I do the same with clause one OR clause 2, I have to
> kill the request after an hour, seeing the filesystem showing more
> than 140 Mb of increased usage.

It's a bit weird the "victor" table is joined seemingly without any join
conditions, leading to a cross join (which massively inflates the cost
for joins above it). Maybe the anonymized plan mangles it somehow.

So I did try to simplify my pb.
I create a table with the result of the first 3 joins.
That table do have 15M lines. all tables have been vacuum analyze

Now if I do an explain analyze of a simple join between that table and my original table 4
using a simple = clause, I get a result in one second (around). and the planner guesses for rows seems in line with the observed values .
if I use a substr(table1.a)= table2.b, the explain analyze get a result in 21 seconds and the planner estimates a 65M rows result set while the observed is 330 k rows
so here its 20 times slower and the discrepency between planner rows guess and reality is a 200 ratio.

Now, if I try an explain analyze with join on a=b or substr(c)=d or e=f
then... I kill the query after a quarter an hour without any answer.
if I try to just explain the query, the planner rows guess becomes more than 2 Billions....
the extremely simple query and plan are here, without automatic obfuscation
https://explain.depesz.com/s/b8Ll

First, I am not sure why you cannot send us the explain analyze. But moving on...

substr() is a function that mutilates a value such that the index becomes useless...

If you are looking for the LEFT() of the value, then an INDEX can be used.

I have COLLATION "C" and when I query:

WHERE fld like fld_b||"%"

The optimizer constructs a query that uses the index on "fld"...

But when I try:

WHERE fld like CONCAT_WS("", fld_b,"%")
It doesn't use the index version. (because the function call is too complicated to see through)

When using functions in where clauses, indexes either have to be made on those functions, or often times the index cannot be used.

BTW, I noted the COLLATION. That turned out to be important, because my first DB test did NOT use that collation, and the result

of the LIKE was the non-indexed version...

I hope you find something useful in here.

Also, WHERE fld <> 72... (unless you have a heavily skewed set of statistics, I read that as. SCAN everything, and check later,

because this should filter very few rows), whereas fld = 72 will be blazingly fast.

Kirk

В списке pgsql-general по дате отправления:

Предыдущее

От: Ben Chobot
Дата: 27 июня 2023 г., 19:32:24
Сообщение: Re: typical active table count?

Следующее

От: Kirk Wolak
Дата: 27 июня 2023 г., 21:30:46
Сообщение: Re: bug or lacking doc hint

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: pb with join plan

Предыдущее

Следующее