Re: Add min and max execute statement time in pg_stat_statement

From: Mitsumasa KONDO
Subject: Re: Add min and max execute statement time in pg_stat_statement
Date:
Msg-id: CADupcHUKHUua4Tbqi6QcAjsDrns4V6b2cw3HAepgwvB1o4e1JQ@mail.gmail.com
In response to: Re: Add min and max execute statement time in pg_stat_statement (Peter Geoghegan <pg@heroku.com>)
Responses: Re: Add min and max execute statement time in pg_stat_statement (Peter Geoghegan <pg@heroku.com>)
List: pgsql-hackers
2014-01-31 Peter Geoghegan <pg@heroku.com>
> On Thu, Jan 30, 2014 at 12:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In reality, actual applications
>> could hardly be further from the perfectly uniform distribution of
>> distinct queries presented here.
>
> Yeah, I made the same point in different words.  I think any realistic
> comparison of this code to what we had before needs to measure a workload
> with a more plausible query frequency distribution.

> Even though that distribution just doesn't square with anybody's
> reality, you can still increase the pg_stat_statements.max setting to
> 10k and the problem goes away at little cost (a lower setting is
> better, but a setting high enough to cache everything is best). But
> you're not going to have terribly much use for pg_stat_statements
> anyway....if you really do experience churn at that rate with 5,000
> possible entries, the module is ipso facto useless, and should be
> disabled.
I ran an extra test of your patch and mine with pg_stat_statements.max = 10k,
all other settings and servers being the same. Both are faster than the past results.

method   |  try1  |  try2  |  try3
---------+--------+--------+-------
peter 3  |  6.769 |  6.784 |  6.785
method 5 |  6.770 |  6.774 |  6.761


I think the most significant overhead in pg_stat_statements is the delete-and-insert
cost of updating the hash table, not the LWLocks. If LWLocks were the biggest
overhead, we would see it with pgbench -S, because that workload runs a single
SELECT pattern, which is the worst lock-conflict case. But we see no such result.
I'm not sure about dynahash.c, but we can get hash-collision cases in this code.
IMHO it might be heavy, because on a collision it has to search the bucket's list
and compare entries one by one until it reaches an entry that does not conflict.

And the past results show that your patch's weakest point is the very high cost of
deleting the oldest statement and inserting a new one, as you know. It amplifies the
update (delete and insert) cost in the pg_stat_statements hash table. That is why you
proposed a new default of max = 10k. But it is not an essential solution, because the
larger setting also gives good performance for the old pg_stat_statements. And when we
set max = 10k with your patch but want only the 1,000 most-used queries from
pg_stat_statements, we have to use an ORDER BY query with LIMIT 1000. The sort cost is
relatively high, so the monitoring query will be slow and expensive. With the old code
we can simply set pg_stat_statements.max = 1000, and its performance is not relatively
bad. That will be the best setting for getting information about the 1,000 most-used
queries.


That's all my assumption.

Sorry for running only a few extra tests; I had no time in my office today.
If we want, I can run a 1/N-distribution pgbench test next week; I will modify my
perl script a little to create multiple SQL files with various sleep times.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center 
