Re: general question on two-partition table

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: general question on two-partition table
Дата
Msg-id 20090728005137.GQ23840@tamriel.snowman.net
обсуждение исходный текст
Ответ на general question on two-partition table  (Janet Jacobsen <jsjacobsen@lbl.gov>)
Ответы Re: general question on two-partition table
Список pgsql-general
* Janet Jacobsen (jsjacobsen@lbl.gov) wrote:
> If they are going to spend 95% of their time querying the
> records that meet the 'good' criteria, what are the good
> strategies for ensuring good performance for those queries?
> (1) Should I partition the table into two partitions based on
> the value of rbscore?
> (2) Should I create two separate tables?
>
> Are (1) and (2) more or less equivalent in terms of
> performance?

It's not clear to me what you plan here..  How would you handle (2) for
the users?  Would you construct a view across them, or expect them to
query the right table(s)?  Options, as I see them, and in the order of
'best-to-worst' wrt user friendlyness and performance, I believe, are:

1- Partitioning (with CHECK constraints and constraint_exclusion)
2- View across two tables (with appropriate WHERE clauses)
3- Functional index (as suggested by someone else)
4- separate tables (users have to figure out how to use them)
5- single table with everything

My recommendation would be #1, followed by #2.  Be sure to look up how
to do partitioning by using inheiritance in PG, and, if you need to,
look at how to implement a trigger to handle inserts on the parent
table.  Make sure you create your CHECK() constraints correctly, and
that you have constraint_exclusion enabled, and that it *works*.

> I think that partitioning the table is a more flexible option
> (i.e., what if the cutoff value changes, no need to change
> the name of the table being queried, etc.), but would (2)
> give better performance given that 95% of their queries
> are for rbscores greater than a threshold value?

If you have your partitioning set up correctly, I don't believe having
actual separate tables would be that much of a performance gain for
queries.  It would help some with inserts, tho if you know which table
to insert into, you could just insert into that child rather than the
main and using a trigger.

    Enjoy,

        Stephen

Вложения

В списке pgsql-general по дате отправления:

Предыдущее
От: Robert James
Дата:
Сообщение: Clients disconnect but query still runs
Следующее
От: Robert James
Дата:
Сообщение: Re: Relational Algebra and Aggregate Functions