Обсуждение: Performance with partitions/inheritance and multiple tables

Поиск
Список
Период
Сортировка

Performance with partitions/inheritance and multiple tables

От
Radhika S
Дата:
Hi,
We currently have a large table (9 million rows) of which only the last couple of days worth of data is queried on a regular basis.
To improve performance we are thinking of partitioning the table.

One idea is:
Current_data = last days worth
archive_data < today (goes back to 2005)

The idea being proposed at work is:
current_data = today's data
prior years data - be broken down into one table per day
archive_data - data older than a year.

My question is:
a) Does postgres suffer a performance hit say if there are 200 child tables.
b) What about aggregation between dates in the last year. eg total sales for firm a  for the last year. It will need to look up n number of tables.

Any ideas, tips, gotchas in implementing partitioning would be welcome. It is a somewhat mission critical (not trading, so not as mission critical) system.

How expensive is maintaining so many partitions both in terms of my writing / maintaining scripts and performance.

Thanks in advance.
Radhika

Re: Performance with partitions/inheritance and multiple tables

От
Shrirang Chitnis
Дата:
Radhika,

If the data is 9 million rows, then I would suggest that you leave it as it is, unless the server configuration and the
numberof users firing queries simultaneously is a matter of concern. 

Try creating indexes on often used fields and use EXPLAIN to speed performance of the queries ... and of course proper
configurationof autovacuum. I have seen query results within a few ms. on similar amount of data on a 2GB RHEL RAID 5
system,so it should not have been an issue. 

HTH,

Shrirang Chitnis
------------------------------------------------------------------------------------------------------

From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Radhika S
Sent: Thursday, December 24, 2009 8:12 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Performance with partitions/inheritance and multiple tables

Hi,
We currently have a large table (9 million rows) of which only the last couple of days worth of data is queried on a
regularbasis. 
To improve performance we are thinking of partitioning the table.

One idea is:
Current_data = last days worth
archive_data < today (goes back to 2005)

The idea being proposed at work is:
current_data = today's data
prior years data - be broken down into one table per day
archive_data - data older than a year.

My question is:
a) Does postgres suffer a performance hit say if there are 200 child tables.
b) What about aggregation between dates in the last year. eg total sales for firm a  for the last year. It will need to
lookup n number of tables. 

Any ideas, tips, gotchas in implementing partitioning would be welcome. It is a somewhat mission critical (not trading,
sonot as mission critical) system. 

How expensive is maintaining so many partitions both in terms of my writing / maintaining scripts and performance.

Thanks in advance.
Radhika

The information contained in this message, including any attachments, is attorney privileged and/or confidential
informationintended only for the use of the individual or entity named as addressee.  The review, dissemination,
distributionor copying of this communication by or to anyone other than the intended addressee is strictly prohibited.
Ifyou have received this communication in error, please immediately notify the sender by replying to the message and
destroyall copies of the original message. 

Re: Performance with partitions/inheritance and multiple tables

От
Anj Adu
Дата:
The recommended partitioning guideline is if your table exceeds 2G

Partitioning benefits:

1. Purging old data very quickly (this is one of the biggest
benefits...especially if you have to purge very often...dont even
think of using DELETE)

2. Performance for certain types of queries where full table scans
benefit from a smaller table size (and hence the smaller partitio will
perform better)

Disadvantages:

You have to maintain scripts to drop/create partitions. Partitions are
not first-class objects in Postgres yet (hopefully in a future
version)

If you are not sure about how large your tables will get...bite the
bullet and partition your data. You will be glad you did so.

On Thu, Dec 24, 2009 at 6:42 AM, Radhika S <rs88820@gmail.com> wrote:
> Hi,
> We currently have a large table (9 million rows) of which only the last
> couple of days worth of data is queried on a regular basis.
> To improve performance we are thinking of partitioning the table.
>
> One idea is:
> Current_data = last days worth
> archive_data < today (goes back to 2005)
>
> The idea being proposed at work is:
> current_data = today's data
> prior years data - be broken down into one table per day
> archive_data - data older than a year.
>
> My question is:
> a) Does postgres suffer a performance hit say if there are 200 child tables.
> b) What about aggregation between dates in the last year. eg total sales for
> firm a  for the last year. It will need to look up n number of tables.
>
> Any ideas, tips, gotchas in implementing partitioning would be welcome. It
> is a somewhat mission critical (not trading, so not as mission critical)
> system.
>
> How expensive is maintaining so many partitions both in terms of my writing
> / maintaining scripts and performance.
>
> Thanks in advance.
> Radhika
>