Re: [GSOC] questions about idea "rewrite pg_dump as library"

From Hannu Krosing
Subject Re: [GSOC] questions about idea "rewrite pg_dump as library"
Date
Msg-id 5167F905.1020108@2ndQuadrant.com
In response to Re: [GSOC] questions about idea "rewrite pg_dump as library"  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [GSOC] questions about idea "rewrite pg_dump as library"  (Joel Jacobson <joel@trustly.com>)
List pgsql-hackers
On 04/11/2013 12:17 AM, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Hannu Krosing wrote:
>>> Natural solution to this seems to move most of pg_dump functionality
>>> into backend as functions, so we have pg_dump_xxx() for everything
>>> we want to dump plus a topological sort function for getting the
>>> objects in right order.
>> This idea doesn't work because of back-patch considerations (i.e. we
>> would not be able to create the functions in back branches, and so this
>> new style of pg_dump would only work with future server versions).  So
>> pg_dump itself would have to retain capability to dump stuff from old
>> servers.  This seems unlikely to fly at all, because we'd be then
>> effectively maintaining pg_dump in two places, both backend and the
>> pg_dump source code.
> There are other issues too, in particular that most of the backend's
> code tends to work on SnapshotNow time whereas pg_dump would really
> prefer it was all done according to the transaction snapshot.
I was just thinking of moving the queries that pg_dump currently
uses into UDFs, which do _not_ use the catalog cache but use
the same SQL to query the catalogs as pg_dump currently does,
under whatever snapshot mode is currently set.

pg_dump will still need to keep the same queries for older
versions of PostgreSQL, but for new versions pg_dump can become
catalog-agnostic.

And I think that we can retire pg_dump support for older
PostgreSQL versions the same way we drop support for
older versions of PostgreSQL itself.

Hannu

> We have
> got bugs of that ilk already in pg_dump, but we shouldn't introduce a
> bunch more.  Doing this right would therefore mean that we'd have to
> write a lot of duplicative code in the backend, ie, it's not clear that
> we gain any synergy by pushing the functionality over.  It might
> simplify cross-backend-version issues (at least for backend versions
> released after we'd rewritten all that code) but otherwise I'm afraid
> it'd just be pushing the problems somewhere else.
>
> In any case, "push it to the backend" offers no detectable help with the
> core design issue here, which is figuring out what functionality needs
> to be exposed with what API.
main things I see would be
 * get_list_of_objects(object_type, pattern or namelist)
 * get_sql_def_for_object(object_type, object_name)
 * sort_by_dependency(list of [obj_type, obj_name])
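
As a rough illustration of how these three entry points might fit together, here is a sketch in Python over a toy in-memory catalog. The catalog contents, the DEFS and DEPS tables, the dump() helper and the exact signatures are all assumptions made for illustration; none of this is an existing PostgreSQL API, and cycle handling is left out on purpose.

```python
from fnmatch import fnmatch
from graphlib import TopologicalSorter

# Toy stand-in for the system catalogs (hypothetical data).
# (object_type, object_name) -> SQL definition
DEFS = {
    ("table", "customers"): "CREATE TABLE customers (id int PRIMARY KEY);",
    ("table", "orders"): "CREATE TABLE orders (id int, customer_id int REFERENCES customers);",
    ("view", "order_summary"): "CREATE VIEW order_summary AS SELECT customer_id FROM orders;",
}
# object -> set of objects it depends on
DEPS = {
    ("table", "orders"): {("table", "customers")},
    ("view", "order_summary"): {("table", "orders")},
}


def get_list_of_objects(object_type, pattern="*"):
    """Names of objects of one type that match a shell-style pattern."""
    return sorted(
        name for (otype, name) in DEFS
        if otype == object_type and fnmatch(name, pattern)
    )


def get_sql_def_for_object(object_type, object_name):
    """SQL text that recreates a single object."""
    return DEFS[(object_type, object_name)]


def sort_by_dependency(objects):
    """Order objects so that each one follows everything it depends on.

    Dependencies on objects outside the requested list are ignored.
    Raises graphlib.CycleError on circular dependencies.
    """
    wanted = set(objects)
    graph = {obj: DEPS.get(obj, set()) & wanted for obj in sorted(wanted)}
    return list(TopologicalSorter(graph).static_order())


def dump(object_types):
    """Hypothetical client-side driver built only from the three calls above."""
    objs = [(t, n) for t in object_types for n in get_list_of_objects(t)]
    return [get_sql_def_for_object(t, n) for (t, n) in sort_by_dependency(objs)]
```

With this shape the client never looks at catalog tables directly; it only asks for names, definitions and an ordering.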
 

From this you could easily construct most uses, especially if
sort_by_dependency(list of [obj_type, obj_name])
were smart enough to break circular dependencies, for example by
turning two tables with mutual FKs into table definitions without
FKs plus separate constraints.
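
One possible way to break such cycles is sketched below in Python. It runs an ordinary topological sort over foreign-key references and, whenever no table is free of unresolved references, picks one table and splits its outstanding FKs out as separate constraints to be added after all tables exist. The input shape (a dict from table name to referenced tables) and the choice of which table to split are assumptions for illustration; the result is valid but not guaranteed to defer the minimal number of FKs.

```python
def sort_by_dependency(fk_deps):
    """Order tables for creation, deferring FKs where needed to break cycles.

    fk_deps maps a table name to the set of tables it references via FKs.
    Returns (ordered_tables, deferred_fks), where deferred_fks is a list of
    (table, referenced_table) pairs to be created as separate constraints.
    """
    # Copy the graph. Self-references are dropped because a table can
    # declare an FK to itself inline.
    pending = {t: set(refs) - {t} for t, refs in fk_deps.items()}
    # Make sure referenced tables without their own entry are present too.
    for refs in list(pending.values()):
        for r in refs:
            pending.setdefault(r, set())

    ordered, deferred = [], []
    while pending:
        ready = sorted(t for t, refs in pending.items() if not refs)
        if not ready:
            # Every remaining table is in a cycle or depends on one.
            # Pick one deterministically and defer its unresolved FKs.
            victim = min(pending)
            deferred.extend((victim, r) for r in sorted(pending[victim]))
            pending[victim] = set()
            ready = [victim]
        for t in ready:
            ordered.append(t)
            del pending[t]
        # Tables just emitted no longer block anyone.
        for refs in pending.values():
            refs.difference_update(ready)
    return ordered, deferred


# Tables a and b reference each other; c references a.
tables, late_fks = sort_by_dependency({"a": {"b"}, "b": {"a"}, "c": {"a"}})
# tables   -> ['a', 'b', 'c']
# late_fks -> [('a', 'b')]
```

A dump would then emit the table definitions in the returned order, without the deferred FKs, followed by one ALTER TABLE per deferred pair.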

Or we could always emit constraints separately, so that
the ones depending on non-exported objects would be easy
to leave out.
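
A minimal sketch of that alternative, again in Python with made-up names (ForeignKey, plan_dump): every FK is treated as its own dump item, so tables need no ordering among themselves, and a constraint is kept only when both of its endpoints are in the exported set.

```python
from typing import NamedTuple


class ForeignKey(NamedTuple):
    name: str        # constraint name
    table: str       # table that owns the constraint
    references: str  # table the constraint points at


def plan_dump(exported_tables, foreign_keys):
    """Return (steps, skipped) for a dump with FKs always emitted separately.

    steps is a list of ("table", name) items followed by
    ("constraint", name) items; skipped holds the FKs of exported tables
    that point at tables outside the exported set.
    """
    exported = set(exported_tables)
    # With FKs split out, tables can be created in any order.
    steps = [("table", t) for t in sorted(exported)]
    kept, skipped = [], []
    for fk in foreign_keys:
        if fk.table not in exported:
            continue  # owning table is not dumped at all
        if fk.references in exported:
            kept.append(fk)
        else:
            skipped.append(fk)
    steps += [("constraint", fk.name) for fk in kept]
    return steps, skipped
```

The trade-off is that no cycle detection is needed for tables at all, at the cost of a dump that always carries its FKs as separate statements.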

Maybe the dependency API analysis itself is something
worth a GSoC effort?

Hannu
>
>             regards, tom lane



