Обсуждение: postgres external table

Поиск
Список
Период
Сортировка

postgres external table

От
Amy Smith
Дата:
all
is there a external table create method ( similar to oracle external table ) ? where to find the information ?
thanks
Amy

Re: postgres external table

От
Craig Ringer
Дата:
Amy Smith wrote:
> all
> is there a external table create method ( similar to oracle external
> table ) ? where to find the information ?

For those non-Oracle users among us, what's an external table? What are
you trying to achieve?

Random guess: you might be looking for tablespaces.

--
Craig Ringer

Re: postgres external table

От
David Fetter
Дата:
On Sun, Jan 17, 2010 at 07:27:34PM -0800, Amy Smith wrote:
> all
> is there a external table create method ( similar to oracle external table )
> ? where to find the information ?

There is a project on pgfoundry which has had some activity lately
that's similar.  You might also try DBI-Link.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: postgres external table

От
Greg Smith
Дата:
Craig Ringer wrote:
> For those non-Oracle users among us, what's an external table?

External tables let you map a text file directly to a table without
explicitly loading it.  In PostgreSQL, if you have data in a CSV file,
usually you'd import it with COPY before you'd use it.  If external
tables were available, you'd just say there's an external table as a CSV
file and you could start running queries against it.

So the quick answer is no, there is no built-in external table support
in PostgreSQL; normally people load the data using COPY instead.
There's a work in progress on this topic at
http://pgfoundry.org/projects/pgexternaltable/ , but it's extremely
rough at this point and I'm not sure if it's even moving in the right
direction--the main project would do something like this via SQL/MED,
and I don't think that's how the prototype is being built at all.

The only PostgreSQL-based product I'm aware of that has working external
table support already is Greenplum DB.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com


Re: postgres external table

От
Tom Lane
Дата:
Greg Smith <greg@2ndquadrant.com> writes:
> Craig Ringer wrote:
>> For those non-Oracle users among us, what's an external table?

> External tables let you map a text file directly to a table without
> explicitly loading it.  In PostgreSQL, if you have data in a CSV file,
> usually you'd import it with COPY before you'd use it.  If external
> tables were available, you'd just say there's an external table as a CSV
> file and you could start running queries against it.

I'm finding it hard to visualize a use-case for that.  We must postulate
that the table is so big that you don't want to import it, and yet you
don't feel a need to have any index on it.  Which among other things
implies that every query will seqscan the whole table.  Where's the
savings?

            regards, tom lane

Re: postgres external table

От
Sam Mason
Дата:
On Mon, Jan 18, 2010 at 09:57:02AM -0500, Tom Lane wrote:
> Greg Smith <greg@2ndquadrant.com> writes:
> > Craig Ringer wrote:
> >> For those non-Oracle users among us, what's an external table?
>
> > External tables let you map a text file directly to a table without
> > explicitly loading it.  In PostgreSQL, if you have data in a CSV file,
> > usually you'd import it with COPY before you'd use it.  If external
> > tables were available, you'd just say there's an external table as a CSV
> > file and you could start running queries against it.
>
> I'm finding it hard to visualize a use-case for that.  We must postulate
> that the table is so big that you don't want to import it, and yet you
> don't feel a need to have any index on it.  Which among other things
> implies that every query will seqscan the whole table.  Where's the
> savings?

I've mostly wanted something like it when I've been dealing with
externally maintained data.  The best idea I've seen so far has just
been a function similar to:

  copy_csv_from(filename text) returns setof text[]

to be used as:

  SELECT d[0]::Int AS id, d[1] AS name
  FROM copy_csv_from('/tmp/usernames.csv') d;

This could be wrapped in a VIEW giving what I'd expect to be similar
semantics to an "external table", however I've never used one so I could
be missing something.

It's possible to write this function at the moment, it's somewhat
suboptimal as the csv file is completely imported before anything else
happens so is only good for small files.

--
  Sam  http://samason.me.uk/

Re: postgres external table

От
Greg Stark
Дата:
On Mon, Jan 18, 2010 at 2:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm finding it hard to visualize a use-case for that.  We must postulate
> that the table is so big that you don't want to import it, and yet you
> don't feel a need to have any index on it.  Which among other things
> implies that every query will seqscan the whole table.  Where's the
> savings?

I think it's usually more "my data is updated by other tools and it
would be hard/impossible/annoying to insert another step into the
pipeline to copy it to yet another place". The main benefit is that
you can access the authoritative data directly without having to copy
it and have some sort of process in place to do that regularly.

Text files are kind of useless but they're a baseline bit of
functionality on top of which to add more sophisticated external forms
such as data available over at some url or over some kind of rpc -- to
which various conditions could be pushed using external indexes -- or
ultimately in another database to which whole joins can be pushed.


--
greg

Re: postgres external table

От
Scott Marlowe
Дата:
On Mon, Jan 18, 2010 at 7:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Smith <greg@2ndquadrant.com> writes:
>> Craig Ringer wrote:
>>> For those non-Oracle users among us, what's an external table?
>
>> External tables let you map a text file directly to a table without
>> explicitly loading it.  In PostgreSQL, if you have data in a CSV file,
>> usually you'd import it with COPY before you'd use it.  If external
>> tables were available, you'd just say there's an external table as a CSV
>> file and you could start running queries against it.
>
> I'm finding it hard to visualize a use-case for that.  We must postulate
> that the table is so big that you don't want to import it, and yet you
> don't feel a need to have any index on it.  Which among other things
> implies that every query will seqscan the whole table.  Where's the
> savings?

I've used it mostly for importing in the past.  Saves the step of
loading a large file into a table with no constraints as a middle
step.

Re: postgres external table

От
Greg Smith
Дата:
Tom Lane wrote:
> I'm finding it hard to visualize a use-case for that.  We must postulate
> that the table is so big that you don't want to import it, and yet you
> don't feel a need to have any index on it.  Which among other things
> implies that every query will seqscan the whole table.  Where's the
> savings?
>

I've mainly seen it used for data loading where there's some sort of
transformation going on, typically to cleanup junk fields that would
fail a constraint or derive new columns.  If you have external tables,
there's no need to load the data into a temporary table if all you're
going to do is modify a few things and then write the result to
somewhere else.  Most of these use cases process the whole file anyway,
so having to do a whole scan isn't an issue.  I used to run an app that
imported gigabytes a day of text files dumped from another server that
used a weird date format I had to process via pl/pgsql function.  Having
to pass them through COPY and then INSERT processed versions to
somewhere else was really a drag, given that there was no use for the
intermediate data.

It also can be handy for bootstrapping apps that are converting stuff
out of a legacy system too.  Just make the mainframe/whatever dump a new
text file periodically into where the external table looks for its data,
and you skip having to schedule reloads when the content changes.  Can
make your life easier while running the two systems in parallel initially.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com


Re: postgres external table

От
Vincenzo Romano
Дата:
2010/1/18 Tom Lane <tgl@sss.pgh.pa.us>:
> Greg Smith <greg@2ndquadrant.com> writes:
>> Craig Ringer wrote:
>>> For those non-Oracle users among us, what's an external table?
>
>> External tables let you map a text file directly to a table without
>> explicitly loading it.  In PostgreSQL, if you have data in a CSV file,
>> usually you'd import it with COPY before you'd use it.  If external
>> tables were available, you'd just say there's an external table as a CSV
>> file and you could start running queries against it.
>
> I'm finding it hard to visualize a use-case for that.  We must postulate
> that the table is so big that you don't want to import it, and yet you
> don't feel a need to have any index on it.  Which among other things
> implies that every query will seqscan the whole table.  Where's the
> savings?
>
>                        regards, tom lane
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Another case, Tom, could be when the file is updated from a non-DB
application and you need to synchronize
the data with other DB applications ...

--
Vincenzo Romano
NotOrAnd Information Technologies
NON QVIETIS MARIBVS NAVTA PERITVS

Re: postgres external table

От
Craig Ringer
Дата:
On 19/01/2010 1:13 AM, Vincenzo Romano wrote:

> Another case, Tom, could be when the file is updated from a non-DB
> application and you need to synchronize
> the data with other DB applications ...

How can that work without a transactional file system, though? If the
external process writes to the file while you're half-way through
reading it, what's the database to do? In general, how do external
tables cope with the fact that they're on non-transactional storage?

--
Craig Ringer

Re: postgres external table

От
Jayadevan M
Дата:
Having 'external tables' lets us avoid the step of loading data from a file into the table. We do not have to check whether a load job has run successfully, whether the data in the table really corresponds to the data in the file etc. It also lets us decide how many rejects can be allowed and so forth.
http://www.adp-gmbh.ch/ora/misc/ext_table.html
Other than that, I have not found any advantage.
Jayadevan




From:        Tom Lane <tgl@sss.pgh.pa.us>
To:        Greg Smith <greg@2ndquadrant.com>
Cc:        Craig Ringer <craig@postnewspapers.com.au>, Amy Smith <vah123@gmail.com>, pgsql-general@postgresql.org
Date:        01/18/2010 08:25 PM
Subject:        Re: [GENERAL] postgres external table
Sent by:        pgsql-general-owner@postgresql.org




Greg Smith <greg@2ndquadrant.com> writes:
> Craig Ringer wrote:
>> For those non-Oracle users among us, what's an external table?

> External tables let you map a text file directly to a table without
> explicitly loading it.  In PostgreSQL, if you have data in a CSV file,
> usually you'd import it with COPY before you'd use it.  If external
> tables were available, you'd just say there's an external table as a CSV
> file and you could start running queries against it.

I'm finding it hard to visualize a use-case for that.  We must postulate
that the table is so big that you don't want to import it, and yet you
don't feel a need to have any index on it.  Which among other things
implies that every query will seqscan the whole table.  Where's the
savings?

                                                  regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general






DISCLAIMER:


"The information in this e-mail and any attachment is intended only for the person to whom it is addressed and may contain confidential and/or privileged material. If you have received this e-mail in error, kindly contact the sender and destroy all copies of the original communication. IBS makes no warranty, express or implied, nor guarantees the accuracy, adequacy or completeness of the information contained in this email or any attachment and is not liable for any errors, defects, omissions, viruses or for resultant loss or damage, if any, direct or indirect."





Re: postgres external table

От
Greg Stark
Дата:
On Tue, Jan 19, 2010 at 4:41 AM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
> How can that work without a transactional file system, though? If the
> external process writes to the file while you're half-way through reading
> it, what's the database to do? In general, how do external tables cope with
> the fact that they're on non-transactional storage?

Well if you use mv to replace the old file with the new one then it
should be safe. Unless your query involves opening the table multiple
times or your transactions are more complex than a single query...


--
greg

Re: postgres external table

От
Jayadevan M
Дата:
Yes. We get quite a few files as 'feeds' from external systems. Once the files are in our network, we know that no changes will happen to those files. We access them using Oracle external tables and process them (the data, after some processing, end up in other real tables). If external tables were not there, we would have had to schedule some job to load these files.
Jayadevan




From:        Greg Stark <gsstark@mit.edu>
To:        Craig Ringer <craig@postnewspapers.com.au>
Cc:        Vincenzo Romano <vincenzo.romano@notorand.it>, Tom Lane <tgl@sss.pgh.pa.us>, Greg Smith <greg@2ndquadrant.com>, Amy Smith <vah123@gmail.com>, pgsql-general@postgresql.org
Date:        01/19/2010 04:37 PM
Subject:        Re: [GENERAL] postgres external table
Sent by:        pgsql-general-owner@postgresql.org




On Tue, Jan 19, 2010 at 4:41 AM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
> How can that work without a transactional file system, though? If the
> external process writes to the file while you're half-way through reading
> it, what's the database to do? In general, how do external tables cope with
> the fact that they're on non-transactional storage?

Well if you use mv to replace the old file with the new one then it
should be safe. Unless your query involves opening the table multiple
times or your transactions are more complex than a single query...


--
greg

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general






DISCLAIMER:


"The information in this e-mail and any attachment is intended only for the person to whom it is addressed and may contain confidential and/or privileged material. If you have received this e-mail in error, kindly contact the sender and destroy all copies of the original communication. IBS makes no warranty, express or implied, nor guarantees the accuracy, adequacy or completeness of the information contained in this email or any attachment and is not liable for any errors, defects, omissions, viruses or for resultant loss or damage, if any, direct or indirect."





Re: postgres external table

От
Alban Hertroys
Дата:
On 19 Jan 2010, at 12:16, Jayadevan M wrote:

> Yes. We get quite a few files as 'feeds' from external systems. Once the files are in our network, we know that no
changeswill happen to those files. We access them using Oracle external tables and process them (the data, after some
processing,end up in other real tables). If external tables were not there, we would have had to schedule some job to
loadthese files.  

I'm surprised you even need to download them to your network. I suspect it would be quite possible to implement
'externaltables' so that they could be read from eg. an RSS feed. That would be a pretty slick feature. 

There's a gazillion number of data formats that people would want support for though; CSV, probably excel, RSS feeds,
webservices, etc. 

Most of that should probably be implemented by means of stored procedures (using the unsafe languages - pl/pythonu,
pl/perlu,etc) returning TABLE (...), which means this is in fact already possible I think? It's just that nobody's
(publicly)thought of doing this so far. 

Alban Hertroys

--
Screwing up is the best way to attach something to the ceiling.


!DSPAM:737,4b559f7d10601504612258!



Re: postgres external table

От
Jayadevan M
Дата:
Hi,
We use Oracle's external tables to process the files (so no need to look for the other alternatives you mentioned). We do not have access to the data providers' network to access the files. That is why we process them after we receive the files.
If there is a database implementation of 'external tables' in the way you mentioned (CSV, probably excel, RSS feeds, web services, etc) at db level, that would be great.  Most ETL tools already do accept feeds of the type you mentioned and load those into tables.
Jayadevan




From:        Alban Hertroys <dalroi@solfertje.student.utwente.nl>
To:        Jayadevan M <Jayadevan.Maymala@ibsplc.com>
Cc:        pgsql-general@postgresql.org
Date:        01/19/2010 05:31 PM
Subject:        Re: [GENERAL] postgres external table




On 19 Jan 2010, at 12:16, Jayadevan M wrote:

> Yes. We get quite a few files as 'feeds' from external systems. Once the files are in our network, we know that no changes will happen to those files. We access them using Oracle external tables and process them (the data, after some processing, end up in other real tables). If external tables were not there, we would have had to schedule some job to load these files.

I'm surprised you even need to download them to your network. I suspect it would be quite possible to implement 'external tables' so that they could be read from eg. an RSS feed. That would be a pretty slick feature.

There's a gazillion number of data formats that people would want support for though; CSV, probably excel, RSS feeds, web services, etc.

Most of that should probably be implemented by means of stored procedures (using the unsafe languages - pl/pythonu, pl/perlu, etc) returning TABLE (...), which means this is in fact already possible I think? It's just that nobody's (publicly) thought of doing this so far.

Alban Hertroys

--
Screwing up is the best way to attach something to the ceiling.


!DSPAM:1019,4b559f7b10604920110887!








DISCLAIMER:


"The information in this e-mail and any attachment is intended only for the person to whom it is addressed and may contain confidential and/or privileged material. If you have received this e-mail in error, kindly contact the sender and destroy all copies of the original communication. IBS makes no warranty, express or implied, nor guarantees the accuracy, adequacy or completeness of the information contained in this email or any attachment and is not liable for any errors, defects, omissions, viruses or for resultant loss or damage, if any, direct or indirect."





Re: postgres external table

От
Greg Smith
Дата:
Craig Ringer wrote:
> On 19/01/2010 1:13 AM, Vincenzo Romano wrote:
>
>> Another case, Tom, could be when the file is updated from a non-DB
>> application and you need to synchronize
>> the data with other DB applications ...
>
> How can that work without a transactional file system, though? If the
> external process writes to the file while you're half-way through
> reading it, what's the database to do? In general, how do external
> tables cope with the fact that they're on non-transactional storage?

With Oracle's implementation, you version the input files (timestamp or
something) and then do ALTER TABLE to change the location the external
table points to.  That will block waiting for exclusive access before it
fires, then you get a clean switch to the new location.  There is no
smartness here to cope with weird behavior built-in here--I expect it
will just crash the query.  One thing you always have to be careful
about when using these is that an external table might return a weird
query error under odd circumstances such as you describe, which you
might not normally expect from a simple SELECT.


--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com