Re: estimation problems for DISTINCT ON with FDW

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: estimation problems for DISTINCT ON with FDW
Дата
Msg-id 583486.1593614409@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: estimation problems for DISTINCT ON with FDW  (Etsuro Fujita <etsuro.fujita@gmail.com>)
Ответы Re: estimation problems for DISTINCT ON with FDW  (Etsuro Fujita <etsuro.fujita@gmail.com>)
Список pgsql-hackers
Etsuro Fujita <etsuro.fujita@gmail.com> writes:
> On Wed, Jul 1, 2020 at 7:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> +    baserel->tuples = Max(baserel->tuples, baserel->rows);

> for consistency, this should be
>   baserel->tuples = clamp_row_est(baserel->rows / sel);
> where sel is the selectivity of the baserestrictinfo clauses?

If we had the selectivity available, maybe so, but we don't.
(And even less so if we put this logic in the core code.)

Short of sending a whole second query to the remote server, it's
not clear to me how we could get the full table size (or equivalently
the target query's selectivity for that table).  The best we realistically
can do is to adopt pg_class.reltuples if there's been an ANALYZE of
the foreign table.  That case already works (and this proposal doesn't
break it).  The problem is what to do when pg_class.reltuples is zero
or otherwise badly out-of-date.

>> What I'm sort of inclined to do is neither of these exactly, but
>> instead put the
>>     baserel->tuples = Max(baserel->tuples, baserel->rows);
>> clamping behavior into the core code, immediately after the call to
>> GetForeignRelSize.  This'd still let the FDW set baserel->tuples if
>> it has a mind to, while not requiring that; and it prevents the
>> situation where the rows and tuples estimates are inconsistent.

> I'm not sure this would address the inconsistency.  Consider the
> postgres_fdw case where use_remote_estimate is true, and the stats are
> out of date, eg, baserel->tuples copied from pg_class is much larger
> than the actual tuples and hence baserel->rows (I assume here that
> postgres_fdw doesn't do anything about baserel->tuples).  In such a
> case the inconsistency would make the estimate_num_groups() estimate
> more inaccurate.  I think the consistency is the responsibility of the
> FDW rather than the core, so I would vote for the 0002 patch.  Maybe
> I'm missing something.

Nothing about this proposal is stopping the FDW from inserting a better
value for rel->tuples if it's got one.  But it's not necessarily easy
or cheap to get that info.  In any case I think that upgrading the
requirements for what GetForeignRelSize must set is a hard sell.
We certainly could not back-patch a fix that required that, and even
going forward, it seems likely that many FDWs would never get the word.
(Well, maybe we could force the issue by throwing an error if
rel->tuples < rel->rows after GetForeignRelSize, but it's not hard
to imagine that routine testing could fail to trigger such a check.)

            regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: SQL-standard function body
Следующее
От: Laurenz Albe
Дата:
Сообщение: Re: Remove Deprecated Exclusive Backup Mode