Re: Partitions implementation with views

Поиск
Список
Период
Сортировка
От Greg Stark
Тема Re: Partitions implementation with views
Дата
Msg-id 878ylwecp8.fsf@stark.dyndns.tv
обсуждение исходный текст
Ответ на Partitions implementation with views  (Jonathan Gardner <jgardner@jonathangardner.net>)
Ответы Re: Partitions implementation with views  (Hannu Krosing <hannu@tm.ee>)
Список pgsql-hackers
Jonathan Gardner <jgardner@jonathangardner.net> writes:

> Usually, it is data that is related to each other somehow. I guess an 
> example would be for an internet host who wants to provide a common 
> shopping cart functionality for all of its customers, but put their 
> specific data on a specific partition for ease of management and query 
> speed. They can't put the data into seperate databases because they also 
> need to examine the complete data set occasionally.

The most commonly cited example is date-based partitions. Separating financial
records by fiscal year for example. So for example the current year may be on
the super-fast raid 0+1 15k rpm SCSI disks, but the previous 7 years may be on
your slow but big raid5 farm of IDE drives. And at year-end you want to create
a new year, drop all the hundreds of gigabytes of data from the 7th year from
the archive quickly, and move the current year to the archive.

> The common response is "Use partial indexes". But I imagine that they want 
> the ability to move partitions onto seperate OS partitions (hence the name, 
> "partition").


Separate OS partitions is a reasonable use of partitioned tables, but the
biggest advantage is being able to drop and load partitions very quickly, and
without impacting performance at all. loading or dropping millions of records
becomes a simple DDL operation instead of a huge DML operation.

...

> Now that view is the partitioned table for all intents and purposes. The 
> partition tables are the partitions themselves.
> 
> Is this what they are looking for, or is it something completely different?

That's the idea, though the purpose of having it as a built-in feature is to
hide all the details you're describing from the user. The user can already do
all the above if they wanted to.

And there's more work to do:

1) The optimizer needs to know about the partitions to be able to check the  query to see if it needs all the
partitionsor only a small subset. Often  partitioned tables are used when most queries only need a single partition
whichdrastically affects the costs for plans.
 

2) You want to be able to manage indexes across all the partitions in a single  operation. Any index on the table where
theleading columns of the index  are the partition key can automatically create a separate index on every  table.
 

3) You want DDL commands to move around the partitions in various ways.  Creating new partitions, moving tables into
thepartitioned table and  moving partitions out into separate tables of their own. splitting a  partition into two
partitions,merging two into one, etc.
 

4) I always managed to avoid them, and they don't seem very useful to me, but  Oracle also supports "global indexes"
whichare indexes that span all the  partitions without having the partition key as the leading columns.
 


-- 
greg



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephan Szabo
Дата:
Сообщение: Re: Max number of rows in a table
Следующее
От: Greg Stark
Дата:
Сообщение: Re: initdb should create a warning message [was Re: