Re: PATCH: add support for IN and @> in functional-dependencystatistics use

Поиск

Список

Период

Сортировка

От	Dean Rasheed
Тема	Re: PATCH: add support for IN and @> in functional-dependencystatistics use
Дата	12 марта 2020 г. 13:25:41
Msg-id	CAEZATCXnQtw-ZytE+rp1Uby9KMs0hWim-unMa-nMdmS5MMgtbQ@mail.gmail.com обсуждение исходный текст
Ответ на	Re: PATCH: add support for IN and @> in functional-dependencystatistics use (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы	Re: PATCH: add support for IN and @> in functional-dependencystatistics use
Список	pgsql-hackers

Дерево обсуждения

[ For the sake of the archives, some of the discussion on the other
thread [1-3] should really have been on this thread. ]

On Sun, 2 Feb 2020 at 18:41, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> I think the challenge here is in applying the functional dependency
> computed for the whole array to individual elements. I'm not sure we can
> do that.
>
> For example, with a table like this:
>
>    CREATE TABLE t (a int, b int[]);
>    CREATE STATISTICS s (dependencies) ON a, b FROM t;
>
> Let's say the functional dependency is "perfect" i.e. has strength 1.0.
> But that only tells us dependency for complete array values, we don't
> know how much information we gain by knowledge of subset of the values.
>

The more I think about this example, the more I think this is really
just a special case of the more general problem of compatibility of
clauses. Once you add support for IN (...) clauses, any query of the
form

  SELECT ... WHERE (any clauses on col a) AND (any clauses on col b)

can be recast as

  SELECT ... WHERE a IN (...) AND b IN (...)

so any counter-example with bad estimates produced with a query in the
first form can also be written in the second form.

I think we should really be thinking in terms of making a strong
functional dependency (a => b) applicable generally to queries in the
first form, which will work well if the clauses on b are compatible
with those on b, but not if they're incompatible. However, that's not
so very different from the current state without extended stats, which
assumes independence, and will return poor estimates if the
columns/clauses aren't independent.

So I'd be tempted to apply a tidied up version of the patch from [3],
and then lift all restrictions from dependency_is_compatible_clause(),
other than the requirement that the clause refer to a single variable.

Regards,
Dean

[1] https://www.postgresql.org/message-id/CAEZATCXaNFZyOhR4XXAfkvj1tibRBEjje6ZbXwqWUB_tqbH%3Drw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/20200309181915.5lxhuw2qxoihfoqo%40development
[3] https://www.postgresql.org/message-id/CAEZATCUic8PwhTnexC%2BUx-Z_e5MhWD-8jk%3DJ1MtnVW8TJD%2BVHw%40mail.gmail.com

В списке pgsql-hackers по дате отправления:

Предыдущее

От: "曾文旌(义从)"
Дата: 12 марта 2020 г., 13:06:19
Сообщение: Re: [Proposal] Global temporary tables

Следующее

От: Julien Rouhaud
Дата: 12 марта 2020 г., 13:31:15
Сообщение: Re: Planning counters in pg_stat_statements (using pgss_store)

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: PATCH: add support for IN and @> in functional-dependencystatistics use

Предыдущее

Следующее