Re: Declarative partitioning grammar

From: Gregory Stark
Subject: Re: Declarative partitioning grammar
Date:
Msg-id: 87zlv7ggil.fsf@oxford.xeocode.com
In reply to: Re: Declarative partitioning grammar  (Markus Schiltknecht <markus@bluegap.ch>)
Responses: Re: Declarative partitioning grammar  (Simon Riggs <simon@2ndquadrant.com>)
List: pgsql-hackers
"Markus Schiltknecht" <markus@bluegap.ch> writes:

> Hi,
>
> Gregory Stark wrote:
>> In a previous life I had a database which had daily partitions. I assure you
>> it was unquestionably the right decision. Each day's data wasn't just
>> distinguished by the timestamp but actually was entirely separate from the
>> previous day's data. Both the archiving strategy and the many reports which
>> were run all depended specifically on the day the data was collected on.
>
> Wouldn't Segment Exclusion (maybe together with a specialized form of
> CLUSTERing) handle that case much better than partitioning? Without the need to
> name all those thousands of partitions and manage them manually.

Firstly, we have a philosophical difference here. I think crossing my fingers
and hoping for a performance boost from some low-level infrastructure that I
can't measure or monitor is not better than managing things so that I know
precisely what's going on and can see it for myself in the plan.

Secondly, no, it wouldn't handle it better. Consider a use case:

select * from data where user_id = ? and data_set = ?

Where I partition by data_set once a day. Now consider a user who has a
dozen entries every day for the past 10 years. That's 40k records or so. In
the segment exclusion case I have to descend a huge btree index which covers
every record for every user for the past 10 years. (Never mind trying to
manage such an index when it comes time to archive data.) I then take the
40k index pointers I pull from it and, for each one, look up the physical
location in some kind of segment endpoint data structure, presumably using a
binary search. Once I've done the binary search for each of the 40k index
pointers I can compare the data_set parameter with the low and high end
points of the partition. They'll all be rejected except the desired
partition, and I'll then have to go look up the records for the dozen or so
index pointers in that partition.

The alternative is a plan which does a normal index lookup on a small,
manageable index for a single partition. That index contains 12 index
pointers which point to precisely the records I'm interested in. I look up
those 12 records and I'm done.
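
For concreteness, here's a rough sketch of that setup, written in
present-day declarative partitioning syntax rather than any grammar proposed
in this thread, with made-up table and column names:

-- One parent table partitioned by day; each day gets its own partition
-- with its own small local index on user_id.
CREATE TABLE data (
    user_id  integer,
    data_set date,
    payload  text
) PARTITION BY RANGE (data_set);

CREATE TABLE data_20080115 PARTITION OF data
    FOR VALUES FROM ('2008-01-15') TO ('2008-01-16');

CREATE INDEX ON data_20080115 (user_id);

-- The planner prunes to the single day's partition, and the lookup on its
-- local index touches only the dozen matching index pointers.
SELECT * FROM data WHERE user_id = 42 AND data_set = '2008-01-15';

With one such partition per day, the index the query actually touches never
covers more than one day's data.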

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

