Thread: Parallel query execution for UNION ALL Queries

Parallel query execution for UNION ALL Queries

From
"Benjamin Arai"
Date:
Hi,

If I have a query such as:

SELECT * FROM ((SELECT * FROM A) UNION ALL (SELECT * FROM B)) AS u
WHERE blah = 'food';

Assuming tables A and B both have the same attributes and the data
is not partitioned between the tables in any special way, does PostgreSQL
evaluate WHERE blah = 'food' on both tables simultaneously?  If not, is
there a way to execute the query on both in parallel and then aggregate
the results?

To give some context, I have a very large amount of new data being loaded
each week.  Currently I am partitioning the data into a new table every
month, which is working great from an indexing standpoint.  But I want to
parallelize searches if possible to reduce the performance loss of having
multiple tables.

Benjamin


Re: [PERFORM] Parallel query execution for UNION ALL Queries

From
"Jonah H. Harris"
Date:
On 7/18/07, Benjamin Arai <me@benjaminarai.com> wrote:
> But I want to parallelize searches if possible to reduce
> the performance loss of having multiple tables.

PostgreSQL does not support parallel query.  Parallel query on top of
PostgreSQL is provided by ExtenDB and pgpool-II.
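
A client can approximate the fan-out by hand: run the filter against each
table on its own connection, then concatenate the partial results, which
matches UNION ALL semantics (duplicates are kept). The following is only a
minimal sketch of that idea, not a recommendation from the thread. It uses
SQLite from the Python standard library as a stand-in so that it runs
without a server; the tables a and b, the columns id and blah, and every
function name are hypothetical.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_demo_db(path):
    """Create two hypothetical 'monthly' tables with the same columns."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE a (id INTEGER, blah TEXT)")
    conn.execute("CREATE TABLE b (id INTEGER, blah TEXT)")
    conn.executemany("INSERT INTO a VALUES (?, ?)", [(1, "food"), (2, "drink")])
    conn.executemany("INSERT INTO b VALUES (?, ?)", [(3, "food"), (4, "food")])
    conn.commit()
    conn.close()


def search_table(path, table, value):
    """Run the filter against one table on a connection owned by this worker."""
    conn = sqlite3.connect(path)
    try:
        # The table name comes from a fixed list in the caller, never from
        # user input; the search value is passed as a bound parameter.
        cur = conn.execute(f"SELECT id, blah FROM {table} WHERE blah = ?", (value,))
        return cur.fetchall()
    finally:
        conn.close()


def parallel_union_all(path, tables, value):
    """Query every table concurrently and concatenate the partial results."""
    with ThreadPoolExecutor(max_workers=len(tables)) as pool:
        parts = pool.map(lambda t: search_table(path, t, value), tables)
        # UNION ALL semantics: plain concatenation, duplicates preserved.
        return [row for part in parts for row in part]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        make_demo_db(path)
        return parallel_union_all(path, ["a", "b"], "food")


if __name__ == "__main__":
    print(demo())
```

Against PostgreSQL the same shape would need one driver connection per
worker (for example via psycopg2), since a single connection runs one query
at a time. Whether this is actually faster depends on the storage: if both
tables sit on the same disk, the concurrent scans compete for I/O.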

--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 3rd Floor            | jharris@enterprisedb.com
Iselin, New Jersey 08830            | http://www.enterprisedb.com/

Re: [PERFORM] Parallel query execution for UNION ALL Queries

From
"Scott Marlowe"
Date:
On 7/18/07, Benjamin Arai <me@benjaminarai.com> wrote:
> Hi,
>
> If I have a query such as:
>
> SELECT * FROM ((SELECT * FROM A) UNION ALL (SELECT * FROM B)) AS u
> WHERE blah = 'food';
>
> Assuming tables A and B both have the same attributes and the data
> is not partitioned between the tables in any special way, does PostgreSQL
> evaluate WHERE blah = 'food' on both tables simultaneously?  If not, is
> there a way to execute the query on both in parallel and then aggregate
> the results?
>
> To give some context, I have a very large amount of new data being loaded
> each week.  Currently I am partitioning the data into a new table every
> month, which is working great from an indexing standpoint.  But I want to
> parallelize searches if possible to reduce the performance loss of having
> multiple tables.

Most of the time, the real issue would be the I/O throughput for such
queries, not the CPU capability.

If you have only one disk for your data storage, you're likely to get
WORSE performance if you have two queries running at once, since the
heads would now be going back and forth from one data set to the
other.

EnterpriseDB, a commercially enhanced version of PostgreSQL, can do
query parallelization, but it comes at a cost, and that cost is making
sure you have enough spindles / I/O bandwidth that you won't be
actually slowing your system down.

Re: [PERFORM] Parallel query execution for UNION ALL Queries

From
"Jim C. Nasby"
Date:
On Wed, Jul 18, 2007 at 11:30:48AM -0500, Scott Marlowe wrote:
> EnterpriseDB, a commercially enhanced version of PostgreSQL, can do
> query parallelization, but it comes at a cost, and that cost is making
> sure you have enough spindles / I/O bandwidth that you won't be
> actually slowing your system down.

I think you're thinking of ExtenDB. :)
--
Jim Nasby                                      decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: [PERFORM] Parallel query execution for UNION ALL Queries

From
Dimitri Fontaine
Date:
Hi,

On Wednesday 18 July 2007, Jonah H. Harris wrote:
> On 7/18/07, Benjamin Arai <me@benjaminarai.com> wrote:
> > But I want to parallelize searches if possible to reduce
> > the performance loss of having multiple tables.
>
> PostgreSQL does not support parallel query.  Parallel query on top of
> PostgreSQL is provided by ExtenDB and pgpool-II.

Seems to me that:
 - Greenplum provides a commercial parallel query engine on top of
   PostgreSQL,

 - plproxy could be a solution to the given problem.
   https://developer.skype.com/SkypeGarage/DbProjects/PlProxy

Hope this helps,
--
dim

Re: [PERFORM] Parallel query execution for UNION ALL Queries

From
"Luke Lonergan"
Date:
Dimitri,

> Seems to me that:
>  - Greenplum provides a commercial parallel query engine on top of
>    PostgreSQL,

I certainly think so and so do our customers in production with 100s of
terabytes :-)

>  - plproxy could be a solution to the given problem.
>    https://developer.skype.com/SkypeGarage/DbProjects/PlProxy

This solves real-world problems at Skype, of a different kind than
those Greenplum addresses, and is well worth checking out.

- Luke


Re: Parallel query execution for UNION ALL Queries

From
llonergan@greenplum.com
Date:
On Jul 18, 11:50 am, deci...@decibel.org ("Jim C. Nasby") wrote:
> On Wed, Jul 18, 2007 at 11:30:48AM -0500, Scott Marlowe wrote:
> > EnterpriseDB, a commercially enhanced version of PostgreSQL, can do
> > query parallelization, but it comes at a cost, and that cost is making
> > sure you have enough spindles / I/O bandwidth that you won't be
> > actually slowing your system down.
>
> I think you're thinking of ExtenDB. :)

Well, now they are one and the same: it seems that EnterpriseDB bought
ExtenDB and is calling it GridSQL.

Now that it's a commercial endeavor competing with Greenplum, Netezza,
and Teradata, I'd be very interested in some real-world examples of
ExtenDB/GridSQL.

- Luke