Re: vacuum, performance, and MVCC

Поиск
Список
Период
Сортировка
От Csaba Nagy
Тема Re: vacuum, performance, and MVCC
Дата
Msg-id 1151063635.3309.121.camel@coppola.muc.ecircle.de
обсуждение исходный текст
Ответ на Re: vacuum, performance, and MVCC  ("Mark Woodward" <pgsql@mohawksoft.com>)
Ответы Re: vacuum, performance, and MVCC  ("Mark Woodward" <pgsql@mohawksoft.com>)
Список pgsql-hackers
> Let me ask a question, you have this hundred million row table. OK, how
> much of that table is "read/write?" Would it be posible to divide the
> table into two (or more) tables where one is basically static, only
> infrequent inserts and deletes, and the other is highly updated?

Well, all of it is read write... some of the data might be updated less
frequently, but there's no way I would know which part of the data is
that. Logically is just the same type of data... so unless I find a way
to continuously move back and forth the data between an archive table
and the live table, based on how active the groups are, I can't imagine
any other way of partitioning it. And that would also mean some quite
big load given the pretty high dynamics of the groups.

> The "big" thing in performance is the amount of disk I/O, if you have a
> smaller active table with only a single index, then you may be able to cut
> your disk I/O time really down. The smaller the row size, the more rows
> fit into a block. The fewer blocks the less dissk I/O. The less disk I/O
> the bbetter the performance.

I agree, but it is quite hard to achieve that when the data set is both
big AND the partitioning criteria is highly dynamic. Not to mention that
deleting from that table is also a PITA performance-wise, so I wonder
how well the continuous back and forth between the active and inactive
table would do.

> Also, and anyone listening correct me if I'm wrong, you NEED to vacuum
> frequently because the indexes grow and vacuuming them doesnt remove
> everything, sometimes a REINDEX or a drop/recreate is the only way to get
> performance back. So if you wait too long between vacuums, your indexes
> grow  and spread across more disk blocks than they should and thus use
> more disk I/O to search and/or shared memory to cache.

This is nice in theory, but kills performance. I vacuum the big tables
only overnight, otherwise the server is sluggish.

Cheers,
Csaba.




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: xlog viewer proposal
Следующее
От: Andrew Dunstan
Дата:
Сообщение: Re: [CORE] GPL Source and Copyright Questions