Re: QSoC proposal: Rewrite pg_dump and pg_restore

From Craig Ringer
Subject Re: QSoC proposal: Rewrite pg_dump and pg_restore
Date
Msg-id 532BA367.3060604@2ndquadrant.com
In response to Re: QSoC proposal: Rewrite pg_dump and pg_restore  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: QSoC proposal: Rewrite pg_dump and pg_restore  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 03/21/2014 09:28 AM, Robert Haas wrote:
> On Tue, Mar 18, 2014 at 8:41 PM, Alexandr <askellio@gmail.com> wrote:
>> Rewrite (add) pg_dump and pg_restore utilities as libraries (.so, .dll &
>> .dylib)
> 
> This strikes me as (1) pretty vague and (2) probably too hard for a
> summer project.
> 
> I mean, getting the existing binaries to build libraries that you can
> call with some trivial interface that mimics the existing command-line
> functionality of pg_dump might be doable, but that's not all that
> interesting.  What people are really going to want is a library with a
> sophisticated API that lets you do interesting things
> programmatically.  But that's going to be hard.  AFAIK, nobody's even
> tried to figure out what that API should look like.  Even if we had
> that worked out, a non-trivial task, the pg_dump source code is a
> mess, so refactoring it to provide such an API is likely to be a job
> and a half.

... and still wouldn't solve one of the most frequently requested things
for pg_dump / pg_restore, which is the ability to use them *server-side*
over a regular PostgreSQL connection. It'd be useful progress toward
that, though.

Right now, we can't even get the PostgreSQL server to emit DDL for a
table, let alone do anything more sophisticated.
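
To illustrate the gap: a client that wants a table definition today has to rebuild it from catalog metadata itself. Below is a minimal Python sketch of that client-side reconstruction, assuming column rows shaped roughly like a subset of information_schema.columns. The names `Column`, `quote_ident` and `render_create_table` are hypothetical, and the sketch ignores constraints, indexes, type modifiers, ownership and everything else a real dump has to handle.

```python
from typing import Iterable, NamedTuple, Optional


class Column(NamedTuple):
    # Assumed shape: a small subset of information_schema.columns.
    name: str
    data_type: str
    is_nullable: bool
    default: Optional[str]


def quote_ident(ident: str) -> str:
    # Always double-quote the identifier, doubling any embedded quotes.
    return '"' + ident.replace('"', '""') + '"'


def render_create_table(schema: str, table: str,
                        columns: Iterable[Column]) -> str:
    # Build one line per column, then join them into a CREATE TABLE statement.
    lines = []
    for col in columns:
        parts = [quote_ident(col.name), col.data_type]
        if col.default is not None:
            parts.append("DEFAULT " + col.default)
        if not col.is_nullable:
            parts.append("NOT NULL")
        lines.append("    " + " ".join(parts))
    body = ",\n".join(lines)
    return (f"CREATE TABLE {quote_ident(schema)}.{quote_ident(table)} "
            f"(\n{body}\n);")
```

Even this toy version has to know about quoting rules and column attributes, which is the kind of catalog knowledge that currently lives only inside pg_dump.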

Here's how I think it needs to look:

- Design a useful API for pg_dump and pg_restore that is practical to
  use for pg_dump and pg_restore's current tasks (fast database
  dump/restore) and also useful for extracting specific objects from
  the database. When designing, consider that we'll want to expose this
  API, or functions that use it, over SQL later.
 

- Create a new "libpqdump" library.

- Implement the designed API in the new library, moving and adjusting
  code from pg_dump / pg_restore where possible, writing new code where
  not.
 

- Refactor (closer to rewrite) pg_dump and pg_restore to use libpqdump,
  removing as much knowledge of the system catalogs etc. as possible
  from them.
 

- Make sure the result still performs OK

THEN, once that's settled in:

- Modify libpqdump to support compilation as a backend extension, with
  use of the SPI for queries and use of syscaches or direct scans where
  possible.
 

- Write a "pg_dump" extension that uses libpqdump in SPI mode to expose
  its API over SQL, or at least uses it to provide SQL functions to
  describe database objects. So you can dump a DB, or a subset of it,
  over SQL.
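
One way to read the two-phase plan above: if the library only reaches the database through a narrow query interface, the same object-description code could run over a client connection first and over SPI later. The following is a minimal Python sketch of that separation, not a proposed design. All names (`QueryBackend`, `FakeBackend`, `describe_table_columns`) are hypothetical, the real library would be C, the `%s` placeholder style is borrowed from psycopg2 as an assumption, and the in-memory backend merely stands in for either libpq or SPI.

```python
from typing import List, Protocol, Sequence, Tuple

Row = Tuple[object, ...]


class QueryBackend(Protocol):
    # The only thing the description code needs from its environment.
    def query(self, sql: str, params: Sequence[object]) -> List[Row]: ...


class FakeBackend:
    """In-memory stand-in so the sketch runs without a server."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.seen: List[Tuple[str, Tuple[object, ...]]] = []

    def query(self, sql: str, params: Sequence[object]) -> List[Row]:
        # Record the call and return the canned rows.
        self.seen.append((sql, tuple(params)))
        return list(self.rows)


COLUMNS_SQL = (
    "SELECT column_name, data_type FROM information_schema.columns "
    "WHERE table_schema = %s AND table_name = %s "
    "ORDER BY ordinal_position"
)


def describe_table_columns(backend: QueryBackend, schema: str,
                           table: str) -> List[Tuple[str, str]]:
    # Backend-agnostic: works with anything that implements query().
    rows = backend.query(COLUMNS_SQL, [schema, table])
    return [(str(name), str(data_type)) for name, data_type in rows]
```

A client-side backend would wrap a regular connection, while a server-side one would wrap SPI; `describe_table_columns` would not need to change between the two.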
 

After all, a "libpqdump" won't do much good for the large proportion of
PostgreSQL users who use Java/JDBC, who can't use a native library
(without hideous hacks with JNI). For the very large group who use libpq
via language-specific client interfaces like the Pg gem for Ruby,
psycopg2 for Python, DBD::Pg for Perl, etc., it'll require a lot of work
to wrap the API and maintain it. Whereas a server-side SQL-callable
interface would be useful and immediately usable for all of them.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


