external data sources (e.g. a second RDBMS)

Поиск
Список
Период
Сортировка
От Christof Petig
Тема external data sources (e.g. a second RDBMS)
Дата
Msg-id 3882844E.B6777D3C@wtal.de
обсуждение исходный текст
Список pgsql-hackers
Dear hackers,

I have to port a lot of programs to PostgreSQL and don't want to wait
for completing them all before I put the first into production.

So after some thought I came to an enthrilling vision

If PostgreSQL had an interface for transparently accessing external
data, I could leave the needed tables on the old RDBMS and write a
wrapper which would provide access to the data on the old RDBMS. Both
old and new (pgsql) programs would see the same data. And I could start
using postgres' fine tools (e.g. psql) _now_ for all of my work. 

Putting these short sighted things aside it would provide an opportunity
for
- distributed databases
- load balancing via putting a table on multiple servers, creating a
unioning-view
- you could use SQL to access other data sources (e.g. LDAP)
- one step further to world domination (no matter where your data
resides, you could     use postgres for your queries)
- accessing multiple databases in _one_ SQL statement (e.g. join)

So I ask for your opinion on this strange idea. 

I suggest marking a table as external uses an interface which should
provide the following methods
- query the structure of the table (\d table-name)
- sequentially scan the table (returning selected attributes of each
tuple) [with some conditions]
- update/delete either by cursor or by where-condition
- query for statistics (see below)

Since this functionality would require modifications all over the place
in postgres I would like to start discussion about it. I might overlook
something but the thought of having such a thing around opens up a lot
of opportunities.

Perhaps an ODBC wrapper could be the correct point to start (after
implementing and testing the [additional] virtual table access layer)
   Christof

PS: I'd never dare to depend on this functionality, some kind of
mirroring program might cover the problem for me as well, but it looked
so cool.
PPS: Of course I would start to investigate it further _after_ TOAST is
finished.

------------------- example ------------------------------------

[a is an external table, b an internal table]

select tuple_a1,tuple_b1 from a,b where a.tuple_a2=b.tuple_b2 and
tuple_a3=42

would cause-> sequential scan (tuple_a1,tuple_a2) on a where tuple_a3=42-> for each entry index scan on b for
a.tuple_a2=b.tuple_b2->report result 
 
or-> sequential scan (tuple_b2, tuple_b1) on b-> index_scan (tuple_a1) by (tuple_a3=42,tuple_a2=b.tuple_b2)-> report
result

clearly we need to collect statistics on external tables as well




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] flex
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] pg_dump not in very good shape