Re: What needs to be done for real Partitioning?
От | Stacy White |
---|---|
Тема | Re: What needs to be done for real Partitioning? |
Дата | |
Msg-id | 002a01c52d87$543d1010$0200a8c0@grownups обсуждение исходный текст |
Ответ на | What needs to be done for real Partitioning? (Josh Berkus <josh@agliodbs.com>) |
Ответы |
Re: What needs to be done for real Partitioning?
|
Список | pgsql-performance |
From: "Tom Lane" <tgl@sss.pgh.pa.us> > Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > > We probably also need multi-table indexes. > As Josh says, that seems antithetical to the main point of partitioning, > which is to be able to rapidly remove (and add) partitions of a table. > If you have to do index cleaning before you can drop a partition, what's > the point of partitioning? Global indexes (as opposed to partition local indexes) are useful in cases where you have a large number of partitions, index columns different than the partition key, and index values that limit the query to just a subset of the partitions. The two domains that I'm most familiar with are warehouse management, and the film industry. In both these cases it's logical to partition on day/week/month, it's frequently important to keep a lot of history, and it's common to have products that only show activity for a few months. In one of our production systems we have 800 partitions (by week, with a lot of history), but a popular product might have only 20 weeks worth of activity. Selecting records for the product requires at least 800 random-access reads if you have local indexes on 'product_no', 780 of which just tell the executor that the partition doesn't include any information on the product. This is definitely a phase II item, but as I said before it's worth considering since good DBAs can do a lot with global indexes. FWIW, we see large benefits from partitioning other than the ability to easily drop data, for example: - We can vacuum only the active portions of a table - Postgres automatically keeps related records clustered together on disk, which makes it more likely that the blocks used by common queries can be found in cache - The query engine uses full table scans on the relevant sections of data, and quickly skips over the irrelevant sections - 'CLUSTER'ing a single partition is likely to be significantly more performant than clustering a large table In fact, we have yet to drop a partition on any of our Oracle or Postgres production systems.
В списке pgsql-performance по дате отправления: