Re: Performance query about large tables, lots of concurrent access

From: Kurt Overberg
Subject: Re: Performance query about large tables, lots of concurrent access
Date:
Msg-id: C83222F0-12CC-4224-BBBE-47F9881D39D3@hotdogrecords.com
In reply to: Re: Performance query about large tables, lots of concurrent access (Karl Wright <kwright@metacarta.com>)
List: pgsql-performance
A useful utility that I've found is PgFouine.  It has an option to
analyze VACUUM VERBOSE logs, and it has been instrumental in helping
me figure out what's been going on with my VACUUM that takes 4+
hours, specifically which tables are taking the longest.  I highly
recommend checking it out.  It might also be a good idea, rather
than simply starting a vacuum every 6 days, to set things up so that
a new vacuum starts as soon as the previous one finishes (using a
lock file or something similar that is polled every few minutes or
hours).  That way a vacuum kicks off right when the other one
finishes, with each run hopefully taking a little less time than the
last.
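
Something along these lines is roughly what I mean -- only a sketch,
with the lock file path and database name made up, and it just shells
out to vacuumdb (which ships with PostgreSQL):

#!/usr/bin/env python
"""Poll-style vacuum re-launcher: run it from cron every few minutes.

If no vacuum is currently running (tracked via a lock file), start a
new database-wide VACUUM ANALYZE; otherwise exit immediately.  The
lock file path and database name are placeholders."""
import os
import subprocess
import sys

LOCK_FILE = "/var/tmp/vacuum.lock"   # placeholder path
DBNAME = "mydb"                      # placeholder database name

def vacuum_in_progress():
    """True if the lock file points at a live vacuum process."""
    try:
        with open(LOCK_FILE) as f:
            pid = int(f.read().strip())
    except (IOError, ValueError):
        return False
    try:
        os.kill(pid, 0)          # signal 0 only checks the pid exists
        return True
    except OSError:
        return False             # stale lock: the previous run died

def main():
    if vacuum_in_progress():
        sys.exit(0)              # previous vacuum is still going

    with open(LOCK_FILE, "w") as f:
        f.write(str(os.getpid()))
    try:
        # --analyze updates planner statistics in the same pass
        subprocess.call(["vacuumdb", "--analyze", DBNAME])
    finally:
        os.remove(LOCK_FILE)

if __name__ == "__main__":
    main()

The stale-lock check means a crashed run won't block future ones, so
you can leave it in cron and forget about it.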

Hope this helps...

/kurt


On Jun 19, 2007, at 10:06 AM, Karl Wright wrote:

> Alvaro Herrera wrote:
>> Karl Wright wrote:
>>> Alvaro Herrera wrote:
>>>> Karl Wright wrote:
>>>>
>>>>> This particular run lasted four days before a VACUUM became
>>>>> essential. The symptom that indicates that VACUUM is needed
>>>>> seems to be that the CPU usage of any given postgresql query
>>>>> skyrockets.  Is this essentially correct?
>>>> Are you saying you weren't used to running VACUUM all the time?  If
>>>> so, that's where the problem lies.
>>> PostgreSQL 7.4's VACUUM runs for so long that starting it from a
>>> cron job even every 24 hours eventually left multiple instances of
>>> VACUUM running at once in my case.  So I tried to find a VACUUM
>>> schedule that let each individual vacuum finish before the next
>>> one started.  A vacuum seemed to require 4-5 days with this
>>> particular database - or at least it did on 7.4.  So I had the
>>> VACUUM schedule set to run every six days.
>> How large is the database?  I must admit I have never seen a database
>> that took 4 days to vacuum.  This could mean that your database is
>> humongous, or that the vacuum strategy is wrong for some reason.
>
> The database is humongous, and the machine is under intense load.
> On the instance where this long vacuum occurred, there were several
> large tables - one with 7,000,000 rows, one with 14,000,000, one
> with 140,000,000, and one with 250,000,000.
>
>> You know that you can run vacuum on particular tables, right?  It
>> would probably be a good idea to run vacuum on the most heavily
>> updated tables, and leave alone those that are rarely or never
>> updated (hopefully the biggest ones; this would mean that an
>> almost-complete vacuum run would take much less than a whole day).
>
> Yeah, sorry, that doesn't apply here.
>
>> Or maybe vacuum was stuck waiting on a lock somewhere.
>>> I will be experimenting with 8.1 to see how long it takes to
>>> complete a vacuum under load conditions tonight.
>> You can also turn autovacuum on in 8.1, which might help quite a bit
>> with finding a good vacuum schedule (you would need to tune it a bit,
>> though, of course).
>> In any case, if you are struggling for performance you are strongly
>> advised to upgrade to 8.2.
>
> Ok - that's something I should be able to do once we can go to
> Debian's etch release.  There's a backport of 8.2 available there.
> (The one for sarge is still considered 'experimental').
>
> Karl
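
(Following up on Alvaro's per-table suggestion in the quote above: if
you do know which tables are the hot ones, something like the sketch
below, run from cron, vacuums just those and reports how long each
takes.  The table and database names are made up.)

#!/usr/bin/env python
"""Vacuum only a short list of heavily-updated tables and report how
long each one takes.  Table and database names are placeholders."""
import subprocess
import time

DBNAME = "mydb"                                      # placeholder
HOT_TABLES = ["orders", "order_items", "sessions"]   # placeholder names

for table in HOT_TABLES:
    start = time.time()
    # vacuumdb --table limits the run to one table; --analyze also
    # refreshes the planner statistics for it
    subprocess.call(["vacuumdb", "--analyze", "--table", table, DBNAME])
    print("%s: vacuumed in %.1f seconds" % (table, time.time() - start))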

