Обсуждение: Performance with partitions/inheritance and multiple tables
Hi,
We currently have a large table (9 million rows) of which only the last couple of days worth of data is queried on a regular basis.
To improve performance we are thinking of partitioning the table.
One idea is:
Current_data = last days worth
archive_data < today (goes back to 2005)
The idea being proposed at work is:
current_data = today's data
prior years data - be broken down into one table per day
archive_data - data older than a year.
My question is:
a) Does postgres suffer a performance hit say if there are 200 child tables.
b) What about aggregation between dates in the last year. eg total sales for firm a for the last year. It will need to look up n number of tables.
Any ideas, tips, gotchas in implementing partitioning would be welcome. It is a somewhat mission critical (not trading, so not as mission critical) system.
How expensive is maintaining so many partitions both in terms of my writing / maintaining scripts and performance.
Thanks in advance.
Radhika
We currently have a large table (9 million rows) of which only the last couple of days worth of data is queried on a regular basis.
To improve performance we are thinking of partitioning the table.
One idea is:
Current_data = last days worth
archive_data < today (goes back to 2005)
The idea being proposed at work is:
current_data = today's data
prior years data - be broken down into one table per day
archive_data - data older than a year.
My question is:
a) Does postgres suffer a performance hit say if there are 200 child tables.
b) What about aggregation between dates in the last year. eg total sales for firm a for the last year. It will need to look up n number of tables.
Any ideas, tips, gotchas in implementing partitioning would be welcome. It is a somewhat mission critical (not trading, so not as mission critical) system.
How expensive is maintaining so many partitions both in terms of my writing / maintaining scripts and performance.
Thanks in advance.
Radhika
Radhika, If the data is 9 million rows, then I would suggest that you leave it as it is, unless the server configuration and the numberof users firing queries simultaneously is a matter of concern. Try creating indexes on often used fields and use EXPLAIN to speed performance of the queries ... and of course proper configurationof autovacuum. I have seen query results within a few ms. on similar amount of data on a 2GB RHEL RAID 5 system,so it should not have been an issue. HTH, Shrirang Chitnis ------------------------------------------------------------------------------------------------------ From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Radhika S Sent: Thursday, December 24, 2009 8:12 PM To: pgsql-performance@postgresql.org Subject: [PERFORM] Performance with partitions/inheritance and multiple tables Hi, We currently have a large table (9 million rows) of which only the last couple of days worth of data is queried on a regularbasis. To improve performance we are thinking of partitioning the table. One idea is: Current_data = last days worth archive_data < today (goes back to 2005) The idea being proposed at work is: current_data = today's data prior years data - be broken down into one table per day archive_data - data older than a year. My question is: a) Does postgres suffer a performance hit say if there are 200 child tables. b) What about aggregation between dates in the last year. eg total sales for firm a for the last year. It will need to lookup n number of tables. Any ideas, tips, gotchas in implementing partitioning would be welcome. It is a somewhat mission critical (not trading, sonot as mission critical) system. How expensive is maintaining so many partitions both in terms of my writing / maintaining scripts and performance. Thanks in advance. Radhika The information contained in this message, including any attachments, is attorney privileged and/or confidential informationintended only for the use of the individual or entity named as addressee. The review, dissemination, distributionor copying of this communication by or to anyone other than the intended addressee is strictly prohibited. Ifyou have received this communication in error, please immediately notify the sender by replying to the message and destroyall copies of the original message.
The recommended partitioning guideline is if your table exceeds 2G Partitioning benefits: 1. Purging old data very quickly (this is one of the biggest benefits...especially if you have to purge very often...dont even think of using DELETE) 2. Performance for certain types of queries where full table scans benefit from a smaller table size (and hence the smaller partitio will perform better) Disadvantages: You have to maintain scripts to drop/create partitions. Partitions are not first-class objects in Postgres yet (hopefully in a future version) If you are not sure about how large your tables will get...bite the bullet and partition your data. You will be glad you did so. On Thu, Dec 24, 2009 at 6:42 AM, Radhika S <rs88820@gmail.com> wrote: > Hi, > We currently have a large table (9 million rows) of which only the last > couple of days worth of data is queried on a regular basis. > To improve performance we are thinking of partitioning the table. > > One idea is: > Current_data = last days worth > archive_data < today (goes back to 2005) > > The idea being proposed at work is: > current_data = today's data > prior years data - be broken down into one table per day > archive_data - data older than a year. > > My question is: > a) Does postgres suffer a performance hit say if there are 200 child tables. > b) What about aggregation between dates in the last year. eg total sales for > firm a for the last year. It will need to look up n number of tables. > > Any ideas, tips, gotchas in implementing partitioning would be welcome. It > is a somewhat mission critical (not trading, so not as mission critical) > system. > > How expensive is maintaining so many partitions both in terms of my writing > / maintaining scripts and performance. > > Thanks in advance. > Radhika >