Thread: Copying large tables with DBLink


Copying large tables with DBLink

From: "Chris Hoover"
Date:
Has anyone had problems with memory exhaustion and dblink?  We were
trying to use dblink to convert our databases to our new layout, and had
our test server lock up several times when trying to copy a table that
was significantly larger than our memory and swap.

Basically, what we were doing was an insert into <table> select * from
dblink('dbname=olddb', 'select * from large_table') as t_large_table(table
column listing);

Does anyone know of a way around this?  The problem we were trying to
solve is that, because of new columns that get added and populated
during the conversion, our databases are doubling in size.  This
requires lengthy VACUUM FULLs to reclaim the space and makes for a very
long conversion time.

We are very concerned that this system lockup could happen on our
production boxes, since we have several databases that are very large
and probably contain tables larger than the memory on those systems.

Anyway, any ideas on how to get around this, or on how we might speed
it up and use less space, would be appreciated.

Chris

PG 7.3.4


Re: Copying large tables with DBLink

From: Joe Conway
Date:
Chris Hoover wrote:
> Has anyone had problems with memory exhaustion and dblink?  We were
> trying to use dblink to convert our databases to our new layout, and had
> our test server lock up several times when trying to copy a table that
> was significantly larger than our memory and swap.
> Basically, what we were doing was an insert into <table> select * from
> dblink('dbname=olddb', 'select * from large_table') as t_large_table(table
> column listing);
>
> Does anyone know of a way around this?


dblink just uses libpq, and libpq reads the entire result into memory.
There is no direct way around that that I'm aware of. You could,
however, use a cursor, and fetch/manipulate rows in more reasonably
sized groups.
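
For instance, something along these lines might work (untested sketch;
the connection string, cursor name, batch size, and column list are
placeholders you would adjust):

    select dblink_connect('dbname=olddb');
    select dblink_open('cur_large', 'select * from large_table');

    -- repeat this insert until it adds zero rows
    insert into new_table
        select * from dblink_fetch('cur_large', 10000)
            as t_large_table(col1 integer, col2 text /* ... full column list ... */);

    select dblink_close('cur_large');
    select dblink_disconnect();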

HTH,

Joe

Re: Copying large tables with DBLink

From: Tom Lane
Date:
"Chris Hoover" <revoohc@sermonaudio.com> writes:
> Has anyone had problems with memory exhaustion and dblink?  We were
> trying to use dblink to convert our databases to our new layout, and had
> our test server lock up several times when trying to copy a table that
> was significantly larger than our memory and swap.

You're not going to be able to do that with dblink, nor any other
set-returning function, because the current implementation of SRFs
always materializes the entire function result in temporary memory/swap.

Consider something like
    pg_dump -t srctab srcdb | psql destdb
instead.

            regards, tom lane

Re: Copying large tables with DBLink

From: Michael Fuhr
Date:
On Thu, Mar 24, 2005 at 01:59:44PM -0500, Chris Hoover wrote:
>
> Has anyone had problems with memory exhaustion and dblink?  We were
> trying to use dblink to convert our databases to our new layout, and had
> our test server lock up several times when trying to copy a table that
> was significantly larger than our memory and swap.

Hmmm...doesn't dblink use libpq, and doesn't libpq fetch the entire
result set before doing anything with it?  If so, then that could
explain the memory exhaustion.

> Basically, what we were doing was an insert into <table> select * from
> dblink('dbname=olddb', 'select * from large_table') as t_large_table(table
> column listing);
>
> Does anyone know of a way around this?

How about using pg_dump to dump the original table and restore it
into the new table?  If you just want the table's contents without
the table definition then you could use the -a (--data-only) option.
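
For example (untested; substitute your own database and table names):

    pg_dump -a -t large_table olddb | psql newdb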

Another possibility would be to write a function that uses a cursor:
dblink_open() and a loop that calls dblink_fetch() until you reach
the end of the result set.  I think that wouldn't have a memory
exhaustion problem (but test it to be sure).
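
Roughly like this, perhaps (untested sketch; the body is a single-quoted
string since that's what 7.3-era plpgsql requires, and the table names,
column list, and batch size are placeholders):

    create or replace function copy_large_table() returns integer as '
    declare
        batch   integer := 10000;
        copied  integer := 0;
        n       integer;
    begin
        perform dblink_connect(''dbname=olddb'');
        perform dblink_open(''cur_large'', ''select * from large_table'');
        loop
            -- pull one batch through the cursor and insert it locally
            insert into new_table
                select * from dblink_fetch(''cur_large'', batch)
                    as t(col1 integer, col2 text);  -- full column list here
            get diagnostics n = row_count;
            exit when n = 0;
            copied := copied + n;
        end loop;
        perform dblink_close(''cur_large'');
        perform dblink_disconnect();
        return copied;
    end;
    ' language plpgsql;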

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/