PATCH: add support for IN and @> in functional-dependency statistics use
От | Pierre Ducroquet |
---|---|
Тема | PATCH: add support for IN and @> in functional-dependency statistics use |
Дата | |
Msg-id | 13902317.Eha0YfKkKy@pierred-pdoc обсуждение исходный текст |
Ответы |
Re: PATCH: add support for IN and @> in functional-dependencystatistics use
|
Список | pgsql-hackers |
Hello At my current job, we have a lot of multi-tenant databases, thus with tables containing a tenant_id column. Such a column introduces a severe bias in statistics estimation since any other FK in the next columns is very likely to have a functional dependency on the tenant id. We found several queries where this functional dependency messed up the estimations so much the optimizer chose wrong plans. When we tried to use extended statistics with CREATE STATISTIC on tenant_id, other_id, we noticed that the current implementation for detecting functional dependency lacks two features (at least in our use case): - support for IN clauses - support for the array contains operator (that could be considered as a special case of IN) After digging in the source code, I think the lack of support for IN clauses is an oversight and due to the fact that IN clauses are ScalarArrayOpExpr instead of OpExpr. The attached patch fixes this by simply copying the code- path for OpExpr and changing the type name. It compiles and the results are correct, with a dependency being correctly taken into consideration when estimating rows. If you think such a copy paste is bad and should be factored in another static bool function, please say so and I will happily provide an updated patch. The lack of support for @> operator, on the other hand, seems to be a decision taken when writing the initial code, but I can not find any mathematical nor technical reason for it. The current code restricts dependency calculation to the = operator, obviously because inequality operators are not going to work... but array contains is just several = operators grouped, thus the same for the dependency calculation. The second patch refactors the operator check in order to also include array contains. I tested the patches on current HEAD, but I can test and provide back-ported versions of the patch for other versions if needed (this code path hardly changed since its introduction in 10). Best regards Pierre Ducroquet
Вложения
В списке pgsql-hackers по дате отправления: