Re: estimation problems for DISTINCT ON with FDW

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: estimation problems for DISTINCT ON with FDW
Дата
Msg-id 491032.1593537223@sss.pgh.pa.us
обсуждение исходный текст
Ответ на estimation problems for DISTINCT ON with FDW  (Jeff Janes <jeff.janes@gmail.com>)
Ответы Re: estimation problems for DISTINCT ON with FDW  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Jeff Janes <jeff.janes@gmail.com> writes:
> It doesn't cost out the plan of pushing the DISTINCT ON down to the foreign
> side, which is probably the best way to run the query.  I guess it makes
> sense that FDW machinery in general doesn't want to try to push a
> PostgreSQL specific construct.

Well, that's an unimplemented feature anyway.  But people hared off after
that without addressing your actual bug report:

> But much worse than that, it horribly misestmates the number of unique rows
> it will get back, having never asked the remote side for an estimate of
> that.

I poked into that and found that the problem is in estimate_num_groups,
which effectively just disregards any relation that has rel->tuples = 0.
That is the case for a postgres_fdw foreign table if use_remote_estimate
is true, because postgres_fdw never bothers to set any other value.
(On the other hand, if use_remote_estimate is false, it does fill in a
pretty-bogus value, mainly so it can use set_baserel_size_estimates.
See postgresGetForeignRelSize.)

It seems like we could make estimate_num_groups a bit more robust here;
it could just skip its attempts to clamp based on total size or
restriction selectivity, but still include the reldistinct value for the
rel into the total numdistinct.  I wonder though if this is the only
problem caused by failing to fill in any value for rel->tuples ...
should we make postgres_fdw install some value for that?

(Note that the question of whether we should ask the remote server for
an estimate of ndistinct is kind of orthogonal to any of these points.
Even if we had obtained one that way, estimate_num_groups would not pay
any attention to it without a fix for the point at hand.)

            regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: POC: postgres_fdw insert batching
Следующее
От: Peter Eisentraut
Дата:
Сообщение: SQL-standard function body