Обсуждение: dblink() from GridSQL

Поиск
Список
Период
Сортировка

dblink() from GridSQL

От
Sam Nelson
Дата:
Hi List,

We have a customer who is trying to migrate a few PostgresPlus instances to GridSQL clusters.  They have a process that pulls data from another server using dblink every night, and we're trying to replicate that on the GridSQL instance, but grid is being a bit of a pain.

Grid doesn't seem to allow functions in from statements, and, of course, it spits out errors about functions returning records being called in the wrong context if we just try "select dblink(foo, bar);" (we had to try it).

Has anyone else run into this specific issue?  Is there a known workaround?  Any ideas on what else we should try?
--- 
===========================
Samuel Nelson
Consistent State
www.consistentstate.com
303-955-0509
===========================

Re: dblink() from GridSQL

От
Merlin Moncure
Дата:
On Thu, May 5, 2011 at 12:13 PM, Sam Nelson <samn@consistentstate.com> wrote:
> Hi List,
> We have a customer who is trying to migrate a few PostgresPlus instances to
> GridSQL clusters.  They have a process that pulls data from another server
> using dblink every night, and we're trying to replicate that on the GridSQL
> instance, but grid is being a bit of a pain.
> Grid doesn't seem to allow functions in from statements, and, of course, it
> spits out errors about functions returning records being called in the wrong
> context if we just try "select dblink(foo, bar);" (we had to try it).
> Has anyone else run into this specific issue?  Is there a known workaround?
>  Any ideas on what else we should try?

have you considered wrapping the output of the dblink query in a view?

merlin

Re: dblink() from GridSQL

От
Scott Mead
Дата:

On Thu, May 5, 2011 at 1:40 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Thu, May 5, 2011 at 12:13 PM, Sam Nelson <samn@consistentstate.com> wrote:
> Hi List,
> We have a customer who is trying to migrate a few PostgresPlus instances to
> GridSQL clusters.  They have a process that pulls data from another server
> using dblink every night, and we're trying to replicate that on the GridSQL
> instance, but grid is being a bit of a pain.
> Grid doesn't seem to allow functions in from statements, and, of course, it
> spits out errors about functions returning records being called in the wrong
> context if we just try "select dblink(foo, bar);" (we had to try it).
> Has anyone else run into this specific issue?

GridSQL itself doesn't support functions. 
 
 Is there a known workaround?
>  Any ideas on what else we should try?


You'd have to present the data to be partitioned to the gsql controller for partitioning to happen properly, or use the high-speed import that it comes with.  Could you dump the data to an intermediary csv and then push it at the import utility?

--Scott
 
have you considered wrapping the output of the dblink query in a view?

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: dblink() from GridSQL

От
Sam Nelson
Дата:
Grid passes functions off to underlying databases.  Unfortunately, it doesn't do so when the functions are in the from clause.  If it did, that would work.

But I digress.  We're attempting to try either the csv import (which would require a new script, but no biggie) or a data pull on the underlying database to a table that only exists on one node.
--- 
===========================
Samuel Nelson
Consistent State
www.consistentstate.com
303-955-0509
===========================



On Thu, May 5, 2011 at 12:36 PM, Scott Mead <scottm@openscg.com> wrote:

On Thu, May 5, 2011 at 1:40 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Thu, May 5, 2011 at 12:13 PM, Sam Nelson <samn@consistentstate.com> wrote:
> Hi List,
> We have a customer who is trying to migrate a few PostgresPlus instances to
> GridSQL clusters.  They have a process that pulls data from another server
> using dblink every night, and we're trying to replicate that on the GridSQL
> instance, but grid is being a bit of a pain.
> Grid doesn't seem to allow functions in from statements, and, of course, it
> spits out errors about functions returning records being called in the wrong
> context if we just try "select dblink(foo, bar);" (we had to try it).
> Has anyone else run into this specific issue?

GridSQL itself doesn't support functions. 
 
 Is there a known workaround?
>  Any ideas on what else we should try?


You'd have to present the data to be partitioned to the gsql controller for partitioning to happen properly, or use the high-speed import that it comes with.  Could you dump the data to an intermediary csv and then push it at the import utility?

--Scott
 
have you considered wrapping the output of the dblink query in a view?

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: dblink() from GridSQL

От
Mason S
Дата:

> > Hi List,
> > We have a customer who is trying to migrate a few PostgresPlus instances
> to
> > GridSQL clusters.  They have a process that pulls data from another
> server
> > using dblink every night, and we're trying to replicate that on the
> GridSQL
> > instance, but grid is being a bit of a pain.
> > Grid doesn't seem to allow functions in from statements, and, of course,
> it
> > spits out errors about functions returning records being called in the
> wrong
> > context if we just try "select dblink(foo, bar);" (we had to try it).
> > Has anyone else run into this specific issue?
>

GridSQL itself doesn't support functions.

GridSQL supports most PostgreSQL built-in functions. In terms of other user-defined functions, one can create them directly on the data nodes, then teach the coordinator about them by putting the function signature in the GridSQL config file, and restarting the Coordinator.  (This will not solve the functions in FROM clause issue of course.)

I suspect there will be issues with dblink, however.

I would not recommend using dblink for this anyway. Are there only inserts going on here, or updates and deletes, too? If only adding data, I would just dump out data periodically and then load via COPY to the GridSQL coordinator, as Scott mentions.  It will run much, much faster.

If you really need dblink, and you are clear about the distribution/partitioning, you could run this directly from the underlying nodes, but you can really mess things up if you pull in data to the wrong node. I would avoid this.

Another workaround would be to create a dummy database in PostgreSQL, access the source data via dblink, then insert it into GridSQL.


Mason

 

>  Is there a known workaround?
> >  Any ideas on what else we should try?
>
>
You'd have to present the data to be partitioned to the gsql controller for
partitioning to happen properly, or use the high-speed import that it comes
with.  Could you dump the data to an intermediary csv and then push it at
the import utility?

 

--Scott


> have you considered wrapping the output of the dblink query in a view?
>
> merlin
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>