Re: Performance query about large tables, lots of concurrent access

Поиск
Список
Период
Сортировка
От Bill Moran
Тема Re: Performance query about large tables, lots of concurrent access
Дата
Msg-id 20070619101504.e456ebef.wmoran@collaborativefusion.com
обсуждение исходный текст
Ответ на Re: Performance query about large tables, lots of concurrent access  (Karl Wright <kwright@metacarta.com>)
Ответы Re: Performance query about large tables, lots of concurrent access  (Karl Wright <kwright@metacarta.com>)
Список pgsql-performance
In response to Karl Wright <kwright@metacarta.com>:

> Alvaro Herrera wrote:
> > Karl Wright wrote:
> >> Alvaro Herrera wrote:
> >>> Karl Wright wrote:
> >>>
> >>>> This particular run lasted four days before a VACUUM became essential.
> >>>> The symptom that indicates that VACUUM is needed seems to be that the
> >>>> CPU usage of any given postgresql query skyrockets.  Is this essentially
> >>>> correct?
> >>> Are you saying you weren't used to run VACUUM all the time?  If so,
> >>> that's where the problem lies.
> >> Postgresql 7.4 VACUUM runs for so long that starting it with a cron job
> >> even every 24 hours caused multiple instances of VACUUM to eventually be
> >> running in my case.  So I tried to find a VACUUM schedule that permitted
> >> each individual vacuum to finish before the next one started.  A vacuum
> >> seemed to require 4-5 days with this particular database - or at least
> >> it did for 7.4.  So I had the VACUUM schedule set to run every six days.
> >
> > How large is the database?  I must admit I have never seen a database
> > that took 4 days to vacuum.  This could mean that your database is
> > humongous, or that the vacuum strategy is wrong for some reason.
>
> The database is humongus, and the machine is under intense load.  On the
> instance where this long vacuum occurred, there were several large
> tables - one with 7,000,000 rows, one with 14,000,000, one with
> 140,000,000, and one with 250,000,000.

Don't rule out the possibility that the only way to fix this _might_ be to
throw more hardware at it.  Proper configuration can buy you a lot, but if
your usage is exceeding the available bandwidth of the IO subsystem, the
only way you're going to get better performance is to put in a faster IO
subsystem.

> > You know that you can run vacuum on particular tables, right?  It would
> > be probably a good idea to run vacuum on the most updated tables, and
> > leave alone those that are not or little updated (hopefully the biggest;
> > this would mean that an almost-complete vacuum run would take much less
> > than a whole day).
>
> Yeah, sorry, that doesn't apply here.

Why not?  I see no reason why an appropriate autovaccum schedule would not
apply to your scenario.  I'm not saying it does, only that your response
does not indicate that it doesn't, and thus I'm concerned that you're
writing autovacuum off without proper research.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

В списке pgsql-performance по дате отправления:

Предыдущее
От: Richard Huxton
Дата:
Сообщение: Re: Maintenance question / DB size anomaly...
Следующее
От: Karl Wright
Дата:
Сообщение: Re: Performance query about large tables, lots of concurrent access