Re: Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables

Поиск
Список
Период
Сортировка
От Mischa Sandberg
Тема Re: Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables
Дата
Msg-id AcL1d.26650$XP3.14397@edtnps84
обсуждение исходный текст
Ответ на Re: Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables  ("Simon Riggs" <simon@2ndquadrant.com>)
Список pgsql-performance
Simon Riggs wrote:
>>Jim C. Nasby
>>On Mon, Sep 13, 2004 at 11:07:35PM +0100, Simon Riggs wrote:
>>
>>>PostgreSQL's functionality is in many ways similar to Oracle
>>Partitioning.
>>
>>>Loading up your data in many similar tables, then creating a view like:
>>>
>>>CREATE VIEW BIGTABLE (idate, col1, col2, col3...) AS
>>>SELECT 200409130800, col1, col2, col3... FROM table200409130800
>>>UNION ALL
>>>SELECT 200409131000, col1, col2, col3... FROM table200409131000
>>>UNION ALL
>>>SELECT 200409131200, col1, col2, col3... FROM table200409131200
>>>...etc...
>>>
>>>will allow the PostgreSQL optimizer to eliminate partitions
>>from the query
>>>when you run queries which include a predicate on the
>>partitioning_col, e.g.
>>
>>>select count(*) from bigtable where idate >= 200409131000
>
> The "partitions" are just tables, so no need for other management tools.
> Oracle treats the partitions as sub-tables, so you need a range of commands
> to add, swap etc the partitions of the main table.

A few years ago I wrote a federated query engine (wrapped as an ODBC
driver) that had to handle thousands of contributors (partitions) to a
pseudotable / VIEWofUNIONs. Joins did require some special handling in
the optimizer, because of the huge number of crossproducts between
different tables. It was definitely worth the effort at the time,
because you need different strategies for: joining a partition to
another partition on the same subserver; joining two large partitions on
different servers; and joining a large partition on one server to a
small one on another.

The differences may not be so great for a solitary server;
but they're still there, because of disparity in subtable sizes. The
simplistic query plans tend to let you down, when you're dealing with
honking warehouses.

I'm guessing that Oracle keeps per-subtable AND cross-all-subtables
statistics, rather than building the latter from scratch in the course
of evaluating the query plan. That's the one limitation I see in
emulating their partitioned tables with Views.

В списке pgsql-performance по дате отправления:

Предыдущее
От: Steve Atkins
Дата:
Сообщение: Re: Data Warehouse Reevaluation - MySQL vs Postgres --
Следующее
От: Christopher Kings-Lynne
Дата:
Сообщение: Re: Data Warehouse Reevaluation - MySQL vs Postgres --