Re: Multitenancy optimization

From Hadi Moshayedi
Subject Re: Multitenancy optimization
Date
Msg-id CAK=1=WpT2OuoD7nNF=m436WQOE-62XfSFw4GUnGUOb++KJFAyg@mail.gmail.com
In response to Multitenancy optimization  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: Multitenancy optimization
List pgsql-hackers
On Thu, Mar 28, 2019 at 5:40 AM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> Certainly it is possible to create multicolumn statistics to notify
> Postgres about columns correlation.
> But unfortunately it is not good and working solution.
>
> First of all we have to create multicolumn statistic for all possible
> combinations of table's attributes including "tenant_id".
> It is very inconvenient and inefficient.
 
On the inconvenient part: doesn't Postgres itself automatically compute functional dependencies for the combinations? I.e., it seems to me that if we create statistics on (a, b, c), then we don't need to create statistics on (a, b), (a, c), or (b, c), because the pg_statistic_ext entry for (a, b, c) already includes enough information.
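
To make the question concrete, here is a minimal Python sketch of the dependencies one would expect a single statistics object on (a, b, c) to cover. The enumeration rule (every non-empty proper subset of the columns as the determinant, each remaining column as the dependent) is an assumption for illustration, not something taken from the Postgres source, and candidate_dependencies is a hypothetical helper.

```python
from itertools import combinations


def candidate_dependencies(columns):
    """Enumerate (determinant columns) -> dependent column candidates.

    Assumption: every non-empty proper subset of the columns may act as
    the determinant, and every column outside it may be the dependent.
    """
    deps = []
    for k in range(1, len(columns)):
        for determinant in combinations(columns, k):
            for dependent in columns:
                if dependent not in determinant:
                    deps.append((determinant, dependent))
    return deps


deps = candidate_dependencies(["a", "b", "c"])
for determinant, dependent in deps:
    print(", ".join(determinant), "->", dependent)
```

Under that assumption, the two-column dependencies such as a -> b are already part of the (a, b, c) object, which is why separate statistics on (a, b), (a, c), or (b, c) would look redundant.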

On the inefficient part, I think there are some areas for improvement here. For example, if the (product_id) -> seller_id correlation is 1.0, then the (product_id, product_name) -> seller_id correlation is definitely 1.0 as well, and we don't need to store it. So we can reduce the amount of information stored in pg_statistic_ext -> stxdependencies without losing any data points.
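
A rough Python sketch of that pruning idea follows. The dictionary layout and the prune_implied helper are hypothetical and only illustrate the rule; they do not reflect how stxdependencies is actually serialized.

```python
def prune_implied(dependencies, threshold=1.0):
    """Drop dependencies implied by a smaller determinant.

    dependencies maps (frozenset of determinant columns, dependent column)
    to a degree in [0, 1]. An entry is dropped when a proper subset of its
    determinant already reaches `threshold` for the same dependent column.
    """
    kept = {}
    for (determinant, dependent), degree in dependencies.items():
        implied = any(
            other_dependent == dependent
            and other_determinant < determinant  # proper subset
            and other_degree >= threshold
            for (other_determinant, other_dependent), other_degree
            in dependencies.items()
        )
        if not implied:
            kept[(determinant, dependent)] = degree
    return kept


stored = {
    (frozenset({"product_id"}), "seller_id"): 1.0,
    (frozenset({"product_id", "product_name"}), "seller_id"): 1.0,
    (frozenset({"product_name"}), "seller_id"): 0.4,
}
pruned = prune_implied(stored)
print(len(stored), "->", len(pruned), "entries")
```

With threshold=1.0 nothing is lost, because a dropped entry can only have degree 1.0. A lower threshold would make the pruning lossy: a dropped entry is then only known to have a degree somewhere between the threshold and 1.0.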

More generally, if the (a) -> b correlation is X, then the (a, c) -> b correlation is >= X, because adding c to the left-hand side can only split the groups of rows further. Maybe we can use a threshold to reduce the number of entries in pg_statistic_ext -> stxdependencies.
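
The inequality can be checked on toy data with a small Python sketch. It assumes a simplified definition of the degree: the fraction of rows that fall into groups (keyed by the determinant columns) where the dependent column takes a single value. The estimator Postgres actually uses works on a sample and may differ in detail; dependency_degree and the rows below are made up for illustration.

```python
from collections import defaultdict


def dependency_degree(rows, determinant, dependent):
    """Fraction of rows in groups where `dependent` has exactly one value."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in determinant)
        groups[key].append(row[dependent])
    supporting = sum(
        len(values) for values in groups.values() if len(set(values)) == 1
    )
    return supporting / len(rows)


rows = [
    {"a": 1, "b": 10, "c": 0},
    {"a": 1, "b": 10, "c": 1},
    {"a": 2, "b": 20, "c": 0},
    {"a": 2, "b": 21, "c": 1},  # a=2 maps to two different b values
    {"a": 3, "b": 30, "c": 0},
    {"a": 3, "b": 30, "c": 0},
]

x = dependency_degree(rows, ["a"], "b")
y = dependency_degree(rows, ["a", "c"], "b")
print(x, y)
```

Here the a=2 group is inconsistent for (a) -> b, but splitting it by c makes each subgroup consistent, so the degree for (a, c) -> b can only stay the same or go up.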

-- Hadi
