Re: REINDEX takes half a day (and still not complete!)

From: Phoenix Kiula
Subject: Re: REINDEX takes half a day (and still not complete!)
Date:
Msg-id: AANLkTinO2zfueMQfL4x4ue8VS6YD=dU3T=qWnp9xGzLf@mail.gmail.com
In response to: Re: REINDEX takes half a day (and still not complete!)  (Scott Marlowe <scott.marlowe@gmail.com>)
List: pgsql-performance
Thanks, Scott.

> What is the output of running vacuum verbose as a superuser (you can
> run it on the postgres database so it returns fast.)


Here's the output for the postgres DB:

    INFO:  free space map contains 110614 pages in 33 relations
    DETAIL:  A total of 110464 page slots are in use (including overhead).
    110464 page slots are required to track all free space.
    Current limits are:  950000 page slots, 1500 relations, using 5665 kB.
    VACUUM


Does running it on the postgres database also show the relevant info
for the other databases?

From the above, the free space map seems fine, right?
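Just to show my working, here is the comparison I am making, as a minimal
sketch (the SHOW commands simply echo the limits already listed in my
settings further down):

    -- Free space map limits (8.2.x)
    SHOW max_fsm_pages;      -- 950000 in my config
    SHOW max_fsm_relations;  -- 1500 in my config

    -- VACUUM VERBOSE reports 110464 page slots needed across 33 relations,
    -- both well under the limits above, so the free space map is not overflowing.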



> Also, if vacuum can't keep up you can increase the vacuum cost limit
> and lower the cost delay.  Anything above 1ms is still quite a wait
> compared to 0, and most systems don't have the timer granularity to go
> that low anyway, so about 5ms is as low as you can usefully go short of
> just setting it to 0.  Also, if you've got a lot of large relations you
> might need to increase the max workers as well.


I'm not sure I understand this.

(1) I should increase "max workers". But I am on version 8.2.9 -- did
this version have "autovacuum_max_workers"? It seems to be a more
recent thing: http://sn.im/27nxe1

(2) The big table in my database (with 125 million rows) has about
5,000 rows that get DELETEd every day, about 100,000 new INSERTs, and
about 12,000 UPDATEs.

(3) What is this about the cost delay? Which values from the vacuum
output should I look at to determine the cost delay -- is there a
specific formula? (My guess at the settings involved is sketched below.)
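
For reference, my reading of the suggestion is that it concerns these
postgresql.conf knobs. The values below are only illustrative guesses on my
part, not recommendations from anyone on this thread:

    # Cost-based vacuum delay -- illustrative values only, not advice from this thread
    vacuum_cost_delay            = 0      # manual VACUUM: no sleep between work chunks
    vacuum_cost_limit            = 2000   # more work per chunk before any sleep
    autovacuum_vacuum_cost_delay = 5      # ms; ~5ms seems to be the practical floor above 0
    autovacuum_vacuum_cost_limit = 2000   # raise if autovacuum cannot keep up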

Thanks!




On Sat, Mar 19, 2011 at 12:58 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Fri, Mar 18, 2011 at 9:07 PM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
>> I have a large table but not as large as the kind of numbers that get
>> discussed on this list. It has 125 million rows.
>>
>> REINDEXing the table takes half a day, and it's still not finished.
>>
>> To write this post I did "SELECT COUNT(*)", and here's the output -- so long!
>>
>>    select count(*) from links;
>>       count
>>    -----------
>>     125418191
>>    (1 row)
>>
>>    Time: 1270405.373 ms
>>
>> That's 1270 seconds!
>>
>> I suppose the vacuum analyze is not doing its job? As you can see
>> from settings below, I have autovacuum set to ON, and there's also a
>> cronjob every 10 hours to do a manual vacuum analyze on this table,
>> which is largest.
>>
>> PG is version 8.2.9.
>>
>> Any thoughts on what I can do to improve performance!?
>>
>> Below are my settings.
>>
>>
>>
>> max_connections              = 300
>> shared_buffers               = 500MB
>> effective_cache_size         = 1GB
>> max_fsm_relations            = 1500
>> max_fsm_pages                = 950000
>>
>> work_mem                     = 100MB
>
> What is the output of running vacuum verbose as a superuser (you can
> run it on the postgres database so it returns fast.)  We're looking
> for the output that looks like this:
>
> INFO:  free space map contains 1930193 pages in 749 relations
> DETAIL:  A total of 1787744 page slots are in use (including overhead).
> 1787744 page slots are required to track all free space.
> Current limits are:  10000000 page slots, 3000 relations, using 58911 kB.
>
> If the space needed exceeds the page slots then you need to crank up
> your free space map.  If the number of relations exceeds the available
> limit then you'll need to crank up max relations.
>
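
In my case the limits already look big enough, but for completeness, my
understanding is that "cranking up" the free space map just means raising
these two lines in postgresql.conf and restarting. The numbers here are
placeholders only, not a recommendation:

    # Free space map (8.2.x); both settings require a server restart
    max_fsm_pages     = 2000000   # must cover the page slots reported by VACUUM VERBOSE
    max_fsm_relations = 3000      # must cover the relations count reported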
