Re: WIP patch for parallel pg_dump

From: Joachim Wieland
Subject: Re: WIP patch for parallel pg_dump
Date:
Msg-id: AANLkTinVTb7JcsHg37nOT15a+28DBK6gY0NEeOoE5XJy@mail.gmail.com
In response to: Re: WIP patch for parallel pg_dump  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: WIP patch for parallel pg_dump  (Koichi Suzuki <koichi.szk@gmail.com>)
List: pgsql-hackers
On Sun, Dec 5, 2010 at 9:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Dec 5, 2010 at 9:04 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Why not just say give me the snapshot currently held by process nnnn?
>>
>> And please, not temp files if possible.
>
> As far as I'm aware, the full snapshot doesn't normally exist in
> shared memory, hence the need for publication of some sort.  We could
> dedicate a shared memory region for publication but then you have to
> decide how many slots to allocate, and any number you pick will be too
> many for some people and not enough for others, not to mention that
> shared memory is a fairly precious resource.

So here is a patch that I have been playing with for a while; I wrote
it some time ago, and thanks go to Koichi Suzuki for his helpful
comments. I have not published it earlier because I haven't worked on
it recently, and from the discussion that I brought up in March I got
the feeling that people are fine with having a first version of
parallel dump without synchronized snapshots.

I am not really sure that what the patch does is sufficient, nor that
it does it in the right way, but I hope it can serve as a basis for
collecting ideas (and doubts).

My idea is pretty much the same as Robert's about publishing snapshots
and subscribing to them; the patch even uses these words.

Basically the idea is that a transaction in isolation level
serializable can publish a snapshot and as long as this transaction is
alive, its snapshot can be adopted by other transactions. Requiring
the publishing transaction to be serializable guarantees that the copy
of the snapshot in shared memory is always current. When the
transaction ends, the copy of the snapshot is also invalidated and
cannot be adopted anymore. So instead of doing explicit checks, the
patch aims at always having a reference transaction around that
guarantees validity of the snapshot information in shared memory.
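To make the intended flow concrete, a session-level sketch might look
roughly like this (the function names pg_publish_snapshot() and
pg_adopt_snapshot() are illustrative placeholders, not necessarily the
patch's actual interface):

```sql
-- Session 1 (the reference transaction): must remain open for its
-- published snapshot to stay valid in shared memory.
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT pg_publish_snapshot();    -- hypothetical: copy snapshot to shared memory

-- Session 2: adopts the published snapshot before running any query,
-- and from then on sees exactly the same data as session 1.
BEGIN;
SELECT pg_adopt_snapshot(42);    -- hypothetical: 42 is the published slot id
-- ... run dump queries against the shared snapshot ...
COMMIT;

-- When session 1's transaction ends, the shared-memory copy is
-- invalidated and can no longer be adopted.
COMMIT;
```

The point of requiring serializable isolation in session 1 is that its
snapshot never changes for the life of the transaction, so the shared
copy needs no separate validity checks.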

The patch currently creates a new area in shared memory to store
snapshot information but we can certainly discuss this... I had a GUC
in mind that can control the number of available "slots", similar to
max_prepared_transactions. Snapshot information can become quite
large, especially with a high number of max_connections.
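As a configuration sketch, the GUC could look analogous to
max_prepared_transactions (the name max_published_snapshots here is
hypothetical, not part of the patch):

```
# postgresql.conf (hypothetical GUC controlling shared-memory slots)
max_published_snapshots = 8   # number of snapshots that can be published
                              # at once; each slot must be sized to hold
                              # snapshot data that grows with max_connections
```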

Known limitations: the patch completely lacks awareness of prepared
transactions, and it doesn't check that both backends belong to the
same user.


Joachim

Attachments
