Re: Propose a new hook for mutating the query bounds

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: Propose a new hook for mutating the query bounds
Дата
Msg-id 25145150-2e94-8eac-70f4-6bd4496064ed@enterprisedb.com
обсуждение исходный текст
Ответ на Propose a new hook for mutating the query bounds  (Xiaozhe Yao <askxzyao@gmail.com>)
Ответы Re: Propose a new hook for mutating the query bounds  (Xiaozhe Yao <askxzyao@gmail.com>)
Список pgsql-hackers

On 11/17/21 2:24 PM, Xiaozhe Yao wrote:
> Hi hackers,
> 
> I am currently working on improving the cardinality estimation component
> in PostgreSQL with machine learning. I came up with a solution that
> mutates the bounds for different columns. For example, assume that we
> have a query
> 
> ```
> select * from test where X<10 and Y<20;
> ```
> 
> Our approach tries to learn the relation between X and Y. For example,
> if we have a linear relation, Y=X+10. Then Y<20 is essentially
> equivalent to X<10. Therefore we can mutate the Y<20 to Y<INT_MAX so
> that the selectivity will be 1, and we will have a more accurate estimation.
> 

OK. FWIW the extended statistics patch originally included a patch for
multi-dimensional histograms, and that would have worked for this
example just fine, I guess. But yeah, there are various other
dependencies for which a histogram would not help. And ML might discover
that and help ...

> It seems to me that we can achieve something similar by mutating the
> pg_statistics, however, mutating the bounds is something more
> straightforward to me and less expensive.
> 

I don't understand how you could achieve this by mutating pg_statistic,
without also breaking estimation for queries that only have Y<20.

> I am wondering if it is possible to have such an extension? Or if there
> is a better solution to this? I have already implemented this stuff in a
> private repository, and if this is something you like, I can further
> propose the patch to the list.
> 

Maybe, but it's really hard to comment on this without seeing any PoC
patches. We don't know where you you'd like the hook called, what info
would it have access to, how would it tweak the selectivities etc.

If you think this would work, write a PoC patch and we'll see.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ekaterina Sokolova
Дата:
Сообщение: Re: RFC: Logging plan of the running query
Следующее
От: Daniel Gustafsson
Дата:
Сообщение: Re: Patch: Range Merge Join