Re: pg_stat_statements and "IN" conditions

Поиск
Список
Период
Сортировка
От Dmitry Dolgov
Тема Re: pg_stat_statements and "IN" conditions
Дата
Msg-id 20230217154643.2n3atmjnenchtgud@erthalion.local
обсуждение исходный текст
Ответ на Re: pg_stat_statements and "IN" conditions  (Dmitry Dolgov <9erthalion6@gmail.com>)
Список pgsql-hackers
> On Thu, Feb 09, 2023 at 08:43:29PM +0100, Dmitry Dolgov wrote:
> > On Thu, Feb 09, 2023 at 06:26:51PM +0100, Alvaro Herrera wrote:
> > On 2023-Feb-09, Dmitry Dolgov wrote:
> >
> > > > On Thu, Feb 09, 2023 at 02:30:34PM +0100, Peter Eisentraut wrote:
> >
> > > > What is the point of making this a numeric setting?  Either you want
> > > > to merge all values or you don't want to merge any values.
> > >
> > > At least in theory the definition of "too many constants" is different
> > > for different use cases and I see allowing to configure it as a way of
> > > reducing the level of surprise here.
> >
> > I was thinking about this a few days ago and I agree that we don't
> > necessarily want to make it just a boolean thing; we may want to make it
> > more complex.  One trivial idea is to make it group entries in powers of
> > 10: for 0-9 elements, you get one entry, and 10-99 you get a different
> > one, and so on:
> >
> > # group everything in a single bucket
> > const_merge_threshold = true / yes / on
> >
> > # group 0-9, 10-99, 100-999, 1000-9999
> > const_merge_treshold = powers
> >
> > Ideally the value would be represented somehow in the query text. For
> > example
> >
> >                          query                            | calls
> > ----------------------------------------------------------+-------
> >  select * from test where i in ({... 0-9 entries ...})    |     2
> >  select * from test where i in ({... 10-99 entries ...})  |     1
> >
> > What do you think?  The jumble would have to know how to reduce all
> > values within each power-of-ten group to one specific value, but I don't
> > think that should be particularly difficult.
>
> Yeah, it sounds appealing and conveniently addresses the question of
> losing the information about how many constants originally were there.
> Not sure if the example above would be the most natural way to represent
> it in the query text, but otherwise I'm going to try implementing this.
> Stay tuned.

It took me couple of evenings, here is what I've got:

* The representation is not that far away from your proposal, I've
  settled on:

    SELECT * FROM test_merge WHERE id IN (... [10-99 entries])

* To not reinvent the wheel, I've reused decimalLenght function from
  numutils, hence one more patch to make it available to reuse.

* This approach resolves my concerns about letting people tuning
  the behaviour of merging, as now it's possible to distinguish between
  calls with different number of constants up to the power of 10. So
  I've decided to simplify the configuration and make the guc boolean to
  turn it off or on.

* To separate queries with constants falling into different ranges
  (10-99, 100-999, etc), the order of magnitude is added into the jumble
  hash.

* I've incorporated feedback from Sergei and David, as well as tried to
  make comments and documentation more clear.

Any feedback is welcomed, thanks!

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dmitry Dolgov
Дата:
Сообщение: Re: pg_stat_statements and "IN" conditions
Следующее
От: Karina Litskevich
Дата:
Сообщение: Re: Possible false valgrind error reports