Re: Use zero for nullness estimates of system attributes

Поиск
Список
Период
Сортировка
От Jim Finnerty
Тема Re: Use zero for nullness estimates of system attributes
Дата
Msg-id 1548516385152-0.post@n3.nabble.com
обсуждение исходный текст
Ответ на Re: Use zero for nullness estimates of system attributes  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Use zero for nullness estimates of system attributes  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
It's related, but what I was referring to applies even to the uncorrelated
case: suppose you have something like:

select x, sum(z) 
from t
where
    x > 5 and y in ('a', 'b', 'c')
group by x;

let's say that 'a', 'b', and 'c' are not frequent values of y, so the
estimated selectivity is based on the n_distinct of y and the 3 values.  Now
imagine that x > 5 is applied first.  That reduces the number of qualifying
rows by the selectivity of (x > 5), but it may also reduce the number of
distinct values of y.  If it reduces the n_distinct of y, then the IN
predicate selectivity should be adjusted also.

The negative n_distinct representation (for n_distinct a large fraction of
numrows) already accounts for an automatic scaling of n_distinct when the
table size grows or shrinks.  Adjusting the n_distinct of y after applying
predicate (x > 5) is conceptually the same thing, except that the scaling is
not linear, but would be based on a (fairly cheap) probabilistic formula of
how many distinct y's can be expected to survive after applying predicate x,
rather than assuming that all of them do.

If you have correlation information for columns (x, y) you'd want to use
that information instead, but in the absence of that information you could
make this adjustment.

It might make a good intern project if somebody wants to investigate it.



-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Thread-unsafe coding in ecpg
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [Patch] Log10 and hyperbolic functions for SQL:2016 compliance