Re: [rfc] overhauling pgstat.stat
From:         Tomas Vondra
Subject:      Re: [rfc] overhauling pgstat.stat
Date:
Msg-id:       522D0603.4040407@fuzzy.cz
In reply to:  Re: [rfc] overhauling pgstat.stat  (Jeff Janes <jeff.janes@gmail.com>)
Responses:    Re: [rfc] overhauling pgstat.stat  (Satoshi Nagayasu <snaga@uptime.jp>)
List:         pgsql-hackers
On 8.9.2013 23:04, Jeff Janes wrote:
> On Tue, Sep 3, 2013 at 10:09 PM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
>> Hi,
>>
>> (2013/09/04 13:07), Alvaro Herrera wrote:
>>> Satoshi Nagayasu wrote:
>>>
>>>> As you may know, this file could be hundreds of MB in size,
>>>> because pgstat.stat holds all access statistics in each
>>>> database, and it needs to read/write an entire pgstat.stat
>>>> frequently.
>>>>
>>>> As a result, pgstat.stat often generates massive I/O operations,
>>>> particularly when having a large number of tables in the
>>>> database.
>>>
>>> We already changed it:
>>>
>>> commit 187492b6c2e8cafc5b39063ca3b67846e8155d24
>>> Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
>>> Date:   Mon Feb 18 17:56:08 2013 -0300
>>>
>>>     Split pgstat file in smaller pieces
>>
>> Thanks for the comments. I forgot to mention that.
>>
>> Yes, we have already split the single pgstat.stat file into several
>> pieces.
>>
>> However, we still need to read/write a large amount of statistics
>> data when we have a large number of tables in a single database, or
>> multiple databases being accessed. Right?
>
> Do you have a test case for measuring this? I vaguely remember from
> when I was testing the split patch, that I thought that after that
> improvement the load that was left was so low that there was little
> point in optimizing it further.

This is actually a pretty good point. Creating a synthetic test case is
quite simple - just create 1,000,000 tables in a single database (see
the sketch at the end of this mail) - but I'm wondering whether it's
actually realistic. Do we have a real-world example where the current
"one stat file per db" approach is not enough?

The reason I worked on the split patch is that our application is
slightly crazy and creates a lot of tables (+ indexes) on the fly, and
as we have up to a thousand databases on each host, we often ended up
with a huge stat file. Splitting the stat file improved that
considerably, although that's partly because we keep the stats on a
tmpfs (setup sketched below), so I/O is not a problem, and the CPU
overhead is negligible thanks to splitting the stats per database.

But AFAIK there are operating systems where creating a filesystem in
RAM is not that simple - e.g. Windows. In such cases even a moderate
number of objects may be a significant issue I/O-wise.

But then again, I can't really think of a reasonable system creating
that many objects in a single database (except perhaps a shared
database using schemas instead of databases).

Tomas
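
Below is a minimal sketch of the synthetic test case mentioned above,
in plain SQL. The table count, naming scheme, and batch size are
illustrative assumptions, not taken from the original mail.

    -- Create many tables and touch each one, so the statistics
    -- collector has to track an entry per table. Each CREATE TABLE
    -- holds its lock until commit, so run this in batches rather than
    -- as one million-table transaction, or raise
    -- max_locks_per_transaction first.
    DO $$
    BEGIN
        FOR i IN 1..10000 LOOP   -- one batch; repeat for further ranges
            EXECUTE format('CREATE TABLE stat_test_%s (id int)', i);
            EXECUTE format('SELECT count(*) FROM stat_test_%s', i);
        END LOOP;
    END $$;

    -- Then watch the per-database stats file grow (superuser only;
    -- assumes the default stats_temp_directory, pg_stat_tmp):
    SELECT pg_stat_file('pg_stat_tmp/db_' || oid || '.stat')
      FROM pg_database
     WHERE datname = current_database();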
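And since the tmpfs setup came up: a sketch of what that configuration
might look like on Linux. The mount point, size, and mount options are
assumptions for illustration, not the actual production setup from this
thread.

    # /etc/fstab - a RAM-backed filesystem for the statistics files
    tmpfs  /var/run/pg_stats_tmp  tmpfs  size=512M,uid=postgres,gid=postgres,mode=0700  0 0

    # postgresql.conf - point the statistics collector at it
    # (takes effect on reload)
    stats_temp_directory = '/var/run/pg_stats_tmp'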