Re: pg_stat_statements: calls under-estimation propagation

From: samthakur74
Subject: Re: pg_stat_statements: calls under-estimation propagation
Msg-id: CABzZFEuj+fWpwJ8Gah7zciwysyQTdGHYbBMo0m=6uN1mmLwZUw@mail.gmail.com
In reply to: Re: pg_stat_statements: calls under-estimation propagation  (Fujii Masao <masao.fujii@gmail.com>)
List: pgsql-hackers



On Thu, Sep 19, 2013 at 11:32 AM, Fujii Masao-2 [via PostgreSQL] <[hidden email]> wrote:
On Thu, Sep 19, 2013 at 2:25 PM, samthakur74 <[hidden email]> wrote:

>>I got the segmentation fault when I tested the case where the
>> least-executed
>>query statistics is discarded, i.e., when I executed different queries more
>> than
>>pg_stat_statements.max times. I guess that the patch might have a bug.
> Thanks, will try to fix it.
>
>> >pg_stat_statements--1.1.sql should be removed.
>> Yes will do that
>
>
>>
>> >+      <entry><structfield>queryid</structfield></entry>
>> >+      <entry><type>bigint</type></entry>
>> >+      <entry></entry>
>> >+      <entry>Unique value of each representative statement for the
>> >current statistics session.
>> >+       This value will change for each new statistics session.</entry>
>>
>> >What does "statistics session" mean?
>> The time period when statistics are gathered by statistics collector
>> without being reset. So the statistics session continues across normal
>> shutdowns, but in case of abnormal situations like crashes, format upgrades
>> or statistics being reset for any other reason, a new time period of
>> statistics collection starts i.e. a new statistics session. The queryid
>> value generation is linked to statistics session so emphasize the fact that
>> in case of crashes,format upgrades or any situation of statistics reset, the
>> queryid for the same queries will also change.
>I'm afraid that this behavior narrows down the use case of queryid very much.
>For example, since the queryid of the same query would not be the same in
>the master and the standby servers, we cannot associate those two statistics
>by using the queryid. The queryid changes through the crash recovery, so
>we cannot associate the query statistics generated before the crash with that
>generated after the crash recovery even if the query is the same.

Yes, these are limitations of this approach. The other approaches suggested were:
1. Expose the query id hash value as is in the view, but document that it will be unstable between releases.
2. Expose the query id hash value via an undocumented function and let more expert users decide whether they want to use it.

Using the statistics session id to generate the queryid is a compromise between not exposing it at all and exposing it without warning users that the hash value derived from the query tree is unstable between releases.
 
>This is not directly related to the patch itself, but why does the queryid
>need to be calculated based on also the "statistics session"?
If we expose the hash value of the query tree without using the statistics session, users might wrongly assume that this value remains stable across version upgrades, when in reality it depends on whether the new version has made changes to query tree internals. So, to explicitly ensure that users do not make this assumption, hash value generation uses the statistics session id, which is newly created in the situations described above.
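
To make the reasoning above concrete, here is a rough sketch in Python (not the patch's C code). It assumes the exposed queryid is produced by mixing the internal query-tree hash with a per-session value. The function names, the use of SHA-256, and the session id values are all hypothetical and chosen only for illustration; the real module hashes the parse tree, not query text.

```python
import hashlib
import struct


def query_tree_hash(normalized_query: str) -> int:
    """Hypothetical stand-in for the internal query-tree hash.

    Assumption: we hash normalized text here only to get a
    deterministic 32-bit value for the demonstration.
    """
    digest = hashlib.sha256(normalized_query.encode("utf-8")).digest()
    return struct.unpack("<I", digest[:4])[0]


def exposed_queryid(tree_hash: int, session_id: int) -> int:
    """Mix the internal hash with the statistics session id.

    Assumption: any mixing function would do; SHA-256 is used
    only for convenience. The result fits a signed bigint.
    """
    payload = struct.pack("<IQ", tree_hash, session_id)
    digest = hashlib.sha256(payload).digest()
    return struct.unpack("<q", digest[:8])[0]


h = query_tree_hash("SELECT * FROM t WHERE id = ?")

# Same query, same statistics session: the queryid is stable.
before_crash = exposed_queryid(h, session_id=1001)
same_session = exposed_queryid(h, session_id=1001)

# Same query after a crash or reset (new session id): the queryid changes.
after_crash = exposed_queryid(h, session_id=1002)
```

In this sketch the queryid is usable for grouping within one statistics session, but values taken before and after a reset (or on a master and a standby with different session ids) cannot be matched, which is the limitation discussed above.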

>> Will update documentation
>> clearly explain the term statistics session in this context

>Yep, that's helpful!

Regards,
Sameer


