Thread: Are we the first OSS database with parallel query?

Are we the first OSS database with parallel query?

From
Josh Berkus
Date:
I suspect not, but I can't think of another example right now.

--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


Re: Are we the first OSS database with parallel query?

From
James Keener
Date:
What about things like Apache Drill and EventQL?

On Fri, Aug 26, 2016 at 8:41 PM, Josh Berkus <josh@agliodbs.com> wrote:
> I suspect not, but I can't think of another example right now.


--
Sent via pgsql-advocacy mailing list (pgsql-advocacy@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-advocacy

Re: Are we the first OSS database with parallel query?

From
David Fetter
Date:
On Fri, Aug 26, 2016 at 08:41:15PM -0400, Josh Berkus wrote:
> I suspect not, but I can't think of another example right now.

There are a fair number that beat us to that punch (GPDB, HadoopDB,
etc.)

We're the first with a liberal license, though. :)

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Are we the first OSS database with parallel query?

From
Josh Berkus
Date:
On 08/29/2016 11:39 AM, David Fetter wrote:
> On Fri, Aug 26, 2016 at 08:41:15PM -0400, Josh Berkus wrote:
>> I suspect not, but I can't think of another example right now.
>
> There are a fair number that beat us to that punch (GPDB, HadoopDB,
> etc.)

Do they have parallel query on a single node?  I suppose you can have
multiple shards-per-node, but that's still a different feature.

Also, are there other *SQL* implementations?




Re: Are we the first OSS database with parallel query?

From
David Fetter
Date:
On Mon, Aug 29, 2016 at 11:41:48AM -0700, Josh Berkus wrote:
> On 08/29/2016 11:39 AM, David Fetter wrote:
> > On Fri, Aug 26, 2016 at 08:41:15PM -0400, Josh Berkus wrote:
> >> I suspect not, but I can't think of another example right now.
> >
> > There are a fair number that beat us to that punch (GPDB, HadoopDB,
> > etc.)
>
> Do they have parallel query on a single node?  I suppose you can have
> multiple shards-per-node, but that's still a different feature.

I don't know of one offhand, but that distinction seems like a *VERY*
thin slice to be claiming.

> Also, are there other *SQL* implementations?

Yep.  You can run SQL in parallel atop Hadoop.

Also: https://shardquery.com/2014/02/25/shard-query-supports-background-jobs-query-parallelism-and-all-select-syntax/

Best,
David.


Re: Are we the first OSS database with parallel query?

From
Josh Berkus
Date:
On 08/29/2016 11:46 AM, David Fetter wrote:
> On Mon, Aug 29, 2016 at 11:41:48AM -0700, Josh Berkus wrote:
>> On 08/29/2016 11:39 AM, David Fetter wrote:
>>> On Fri, Aug 26, 2016 at 08:41:15PM -0400, Josh Berkus wrote:
>>>> I suspect not, but I can't think of another example right now.
>>>
>>> There are a fair number that beat us to that punch (GPDB, HadoopDB,
>>> etc.)
>>
>> Do they have parallel query on a single node?  I suppose you can have
>> multiple shards-per-node, but that's still a different feature.
>
> I don't know of one offhand, but that distinction seems like a *VERY*
> thin slice to be claiming.

Yeah, it's certainly not terribly marketable. "First non-sharded
parallel query".



Re: Are we the first OSS database with parallel query?

From
David Fetter
Date:
On Mon, Aug 29, 2016 at 12:06:43PM -0700, Josh Berkus wrote:
> On 08/29/2016 11:46 AM, David Fetter wrote:
> > On Mon, Aug 29, 2016 at 11:41:48AM -0700, Josh Berkus wrote:
> >> On 08/29/2016 11:39 AM, David Fetter wrote:
> >>> On Fri, Aug 26, 2016 at 08:41:15PM -0400, Josh Berkus wrote:
> >>>> I suspect not, but I can't think of another example right now.
> >>>
> >>> There are a fair number that beat us to that punch (GPDB, HadoopDB,
> >>> etc.)
> >>
> >> Do they have parallel query on a single node?  I suppose you can have
> >> multiple shards-per-node, but that's still a different feature.
> >
> > I don't know of one offhand, but that distinction seems like a *VERY*
> > thin slice to be claiming.
>
> Yeah, it's certainly not terribly marketable. "First non-sharded
> parallel query".

PostgreSQL: we don't charge you a network hop to use another core ;)

Best,
David.


Re: Are we the first OSS database with parallel query?

From
julyanto SUTANDANG
Date:
I think the term "parallel query" properly refers to a query running in parallel within a single host, not across multiple hosts.
In a multi-host (cluster) context, what you have is of course each host executing its own single query in parallel with the others. So I think it is correct to use the term "parallel query" and to claim to be the first RDBMS to implement it.



Julyanto SUTANDANG

Equnix Business Solutions, PT
(An Open Source and Open Mind Company)
www.equnix.co.id
Pusat Niaga ITC Roxy Mas Blok C2/42.  Jl. KH Hasyim Ashari 125, Jakarta Pusat
T: +6221 22866662 F: +62216315281 M: +628164858028




Re: Are we the first OSS database with parallel query?

From
Bruce Momjian
Date:
On Tue, Aug 30, 2016 at 02:45:22AM +0700, julyanto SUTANDANG wrote:
> I think the term "parallel query" properly refers to a query running in
> parallel within a single host, not across multiple hosts.
> In a multi-host (cluster) context, what you have is of course each host
> executing its own single query in parallel with the others. So I think it is
> correct to use the term "parallel query" and to claim to be the first RDBMS
> to implement it.

I think we added a "parallel CPU query" feature.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +


Re: Are we the first OSS database with parallel query?

From
Tomas Vondra
Date:
On 08/29/2016 08:39 PM, David Fetter wrote:
> On Fri, Aug 26, 2016 at 08:41:15PM -0400, Josh Berkus wrote:
>> I suspect not, but I can't think of another example right now.
>
> There are a fair number that beat us to that punch (GPDB, HadoopDB,
> etc.)
>
> We're the first with a liberal license, though. :)
>

No, we're not. This is a fairly tricky question, because there are
definitely a bunch of databases that are not widely used in production
environments, but are technically open source and implement some sort of
parallel query functionality.

For example there's MonetDB (which uses what is basically a Mozilla
license), which has supported parallel queries since ~2012 or so.

There's also C-Store (Vertica is a commercial fork), and H-Store (VoltDB
is a commercial fork) - AFAIK both support query parallelism. Although
they are experimental / research projects (and most companies use the
commercial forks in production), they use a BSD license, so technically
they are open source. Also, Stonebraker cooperated on both of those
projects, so neglecting them would be particularly annoying.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Are we the first OSS database with parallel query?

From
"Joshua D. Drake"
Date:
On 08/29/2016 01:08 PM, Tomas Vondra wrote:
> On 08/29/2016 08:39 PM, David Fetter wrote:
>> On Fri, Aug 26, 2016 at 08:41:15PM -0400, Josh Berkus wrote:
>>> I suspect not, but I can't think of another example right now.
>>
>> There are a fair number that beat us to that punch (GPDB, HadoopDB,
>> etc.)
>>
>> We're the first with a liberal license, though. :)

First relational open source....?

JD



--
Command Prompt, Inc.                  http://the.postgres.company/
                         +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.


Re: Are we the first OSS database with parallel query?

From
Chris Travers
Date:


On Mon, Aug 29, 2016 at 9:45 PM, julyanto SUTANDANG <julyanto@equnix.co.id> wrote:
> I think the term "parallel query" properly refers to a query running in parallel within a single host, not across multiple hosts.
> In a multi-host (cluster) context, what you have is of course each host executing its own single query in parallel with the others. So I think it is correct to use the term "parallel query" and to claim to be the first RDBMS to implement it.

Right.  With HadoopDB what you basically have is a strange form of SQL* run as a map reduce job.

* Hadoop assumes schema on read instead of schema on write, which means if your data doesn't match your expectations, you may get garbage back or may get nulls back.  This is actually a feature in big data because usually you are looking for heuristic analysis of data based on statistical guesswork rather than provably correct answers.  But I would *not* consider them a competitor.
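
To make the footnote concrete, here is a minimal Python sketch of the difference between schema on read and schema on write. It is only an illustration under assumptions: the sample records and the function names `read_with_schema` and `insert_with_schema` are hypothetical, and it does not use or model any actual Hadoop or HadoopDB API.

```python
# Hypothetical raw records, stored as plain text with no schema enforced.
RAW_LINES = ["1,alice,30", "2,bob,unknown", "3,carol"]


def read_with_schema(line):
    """Schema on read: interpret the line at query time.

    Anything that does not fit the expected (id, name, age) shape
    silently becomes None instead of raising an error.
    """
    parts = line.split(",")

    def field(index, cast):
        try:
            return cast(parts[index])
        except (IndexError, ValueError):
            return None

    return {"id": field(0, int), "name": field(1, str), "age": field(2, int)}


def insert_with_schema(table, line):
    """Schema on write: validate before storing and reject bad rows."""
    parts = line.split(",")
    if len(parts) != 3:
        raise ValueError(f"expected 3 columns, got {len(parts)}")
    # int() raises ValueError if a column is not a valid integer.
    table.append({"id": int(parts[0]), "name": parts[1], "age": int(parts[2])})


# Schema on read: every line yields a row, some with missing values.
rows_on_read = [read_with_schema(line) for line in RAW_LINES]

# Schema on write: bad lines never make it into the table.
table = []
rejected = []
for line in RAW_LINES:
    try:
        insert_with_schema(table, line)
    except ValueError:
        rejected.append(line)
```

In this sketch the read-time path returns three rows, two of them with a null age, while the write-time path stores one row and rejects the other two up front.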

--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor lock-in.