Re: pg9.6 segfault using simple query (related to use fk for join estimates)

From Tomas Vondra
Subject Re: pg9.6 segfault using simple query (related to use fk for join estimates)
Date
Msg-id 48919afc-1993-8ca8-5b42-3949e9166e92@2ndquadrant.com
In response to Re: pg9.6 segfault using simple query (related to use fk for join estimates)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 05/04/2016 11:02 PM, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, May 4, 2016 at 2:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> My other design-level complaint is that basing this on foreign keys is
>>> fundamentally the wrong thing.  What actually matters is the unique index
>>> underlying the FK; that is, if we have "a.x = b.y" and there's a
>>> compatible unique index on b.y, we can conclude that no A row will match
>>> more than one B row, whether or not an explicit FK relationship has been
>>> declared.  So we should drive this off unique indexes instead of FKs,
>>> first because we will find more cases, and second because the planner
>>> already examines indexes and doesn't need any additional catalog lookups
>>> to get the required data.  (IOW, the relcache additions that were made in
>>> this patch series should go away too.)
>
>> Without prejudice to anything else in this useful and detailed review,
>> I have a question about this.  A unique index proves that no A row
>> will match more than one B row, and I agree that deriving that from
>> unique indexes is sensible.  However, ISTM that an FK provides
>> additional information: we know that, modulo filter conditions on B,
>> every A row will match *exactly* one B row, which can prevent us
>> from *underestimating* the size of the join product.  A unique index
>> can't do that.
>
> Very good point, but unless I'm missing something, that is not what the
> current patch does.  I'm not sure offhand whether that's an important
> estimation failure mode currently, or if it is whether it would be
> sensible to try to implement that rule entirely separately from the "at
> most one" aspect, or if it isn't sensible, whether that's a sufficiently
> strong reason to confine the "at most one" logic to working only with FKs
> and not with bare unique indexes.

FWIW it's a real-world problem with multi-column FKs. As David pointed
out upthread, a nice example of this issue is Q9 in the TPC-H benchmark,
where the underestimate makes the planner pick HashAggregate, which then
fails with OOM.
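
To illustrate why multi-column FKs are the painful case, here is a rough
sketch in Python. It is not the planner's code: the function names and the
row counts are made up for illustration, and the per-clause selectivity of
1/ndistinct is a simplification of what the planner does without MCV
statistics. It only shows how multiplying per-column selectivities as if
they were independent shrinks the estimate, while the "exactly one match"
rule from the FK does not.

```python
def independent_estimate(rows_a, rows_b, ndistinct_per_column):
    """Join size when each equality clause is treated as independent.

    Simplified model: each clause contributes a selectivity of
    1/ndistinct, and the selectivities are multiplied together.
    """
    selectivity = 1.0
    for ndistinct in ndistinct_per_column:
        selectivity *= 1.0 / ndistinct
    return rows_a * rows_b * selectivity


def fk_estimate(rows_referencing, rows_referenced):
    """Join size when a foreign key covers all the join clauses.

    Every referencing row matches exactly one referenced row, so the
    combined selectivity is 1/rows_referenced (ignoring any filter
    conditions on the referenced table).
    """
    selectivity = 1.0 / rows_referenced
    return rows_referencing * rows_referenced * selectivity


# Hypothetical numbers: a referenced table with 800,000 rows and a
# two-column key (200,000 and 10,000 distinct values per column),
# joined to a referencing table with 6,000,000 rows.
naive = independent_estimate(6_000_000, 800_000, [200_000, 10_000])
with_fk = fk_estimate(6_000_000, 800_000)

print(f"independent clauses: ~{naive:,.0f} rows")
print(f"using the FK:        ~{with_fk:,.0f} rows")
```

With these made-up numbers the independence assumption gives an estimate
in the low thousands, while the FK rule gives the full size of the
referencing table. An aggregate planned on top of the smaller number can
easily end up as a hash table that does not fit in memory.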

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


