Re: Partition table in 9.0.x?

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема Re: Partition table in 9.0.x?
Дата
Msg-id CAMkU=1yMkZtme3pB=J8=LrzmqtRDMhOU6mMp1KZAX_VTsdKTbA@mail.gmail.com
обсуждение исходный текст
Ответ на Partition table in 9.0.x?  (AJ Weber <aweber@comcast.net>)
Ответы Re: Partition table in 9.0.x?
Список pgsql-performance
On Friday, January 4, 2013, AJ Weber wrote:
Hi all,

I have a table that has about 73mm rows in it and growing.  

How big is the table in MB?  Its indexes?

...
 
The server has 12GB RAM, 4 cores, but is shared with a big webapp running in Tomcat -- and I only have a RAID1 disk to work on.  Woes me...


By a RAID1 disk, do you mean two disks in a RAID1 configuration, or a single RAID1 composed of an unspecified number of disks?

Often spending many thousands of dollars in DBA time can save you from having to buy many hundreds of dollars in hard drives. :)  On the other hand, often you end up having to buy the extra disks anyway afterall.


 
Anyway, this table is going to continue to grow, and it's used frequently (Read and Write).

Are all rows in the table read and written with equal vigor, or are there hot rows and cold rows that can be recognized based on the row's values?
 
 From what I read, this table is a candidate to be partitioned for performance and scalability.  I have tested some scripts to build the "inherits" tables with their constraints and the trigger/function to perform the work.

Am I doing the right thing by partitioning this?

Probably not.  Or at least, you haven't given us the information to know.  Very broadly speaking, well-implemented partitioning makes bulk loading and removal operations take less IO, but makes normal operations take more IO,  or if lucky leaves it unchanged.  There are exceptions, but unless you can identify a very specific reason to think you might have one of those exceptions, then you probably don't.

Do you have a natural partitioning key?  That is, is there a column (or expression) which occurs as a selective component in the where clause of almost all of your most io consuming SQL and DML?  If so, you might benefit from partitioning on it.  (But in that case, you might be able to get most of the benefits of partitioning, without the headaches of it, just by revamping your indexes to include that column/expression as their leading field).

If you don't have a good candidate partitioning key, then partitioning will almost surely make things worse.

 If so, and I can afford some downtime, is dumping the table via pg_dump and then loading it back in the best way to do this?

To do efficient bulk loading into a partitioned table, you need to specifically target each partition, rather than targeting with a trigger.  That pretty much rules out pg_dump, AFAIK, unless you are going to parse the dump file(s) and rewrite them.


Should I run a cluster or vacuum full after all is done?

Probably not.  If a cluster after the partitioning would be beneficial, there would be a pretty good chance you could do a cluster *instead* of the partitioning and get the same benefit.  

If you do some massive deletes from the parent table as part of populating the children, then a vacuum full of the parent could be useful.  But if you dump the parent table, truncate it, and reload it as partitioned tables, then vacuum full would probably not be useful.

Really, you need to identify your most resource-intensive queries before you can make any reasonable decisions.

 

Is there a major benefit if I can upgrade to 9.2.x in some way that I haven't realized?

If you have specific queries that are misoptimized and so are generating more IO than they need to, then upgrading could help.  On the other hand, it could also make things worse, if a currently well optimized query becomes worse. 

But, instrumentation has improved in 9.2 from 9.0, so upgrading would make it easier to figure out just which queries are really bad and have the most opportunity for improvement.  A little well informed optimization might obviate the need for either partitioning or more hard drives.


Finally, if anyone has any comments about my settings listed above that might help improve performance, I thank you in advance.

Your default statistics target seemed low.  Without knowing the nature of your most resource intensive queries or how much memory tomcat is using, it is hard to say more.

Cheers,

Jeff

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Re[4]: [PERFORM] Re[2]: [PERFORM] SMP on a heavy loaded database
Следующее
От: nobody nowhere
Дата:
Сообщение: Re[2]: [PERFORM] Re[2]: [PERFORM] SMP on a heavy loaded database