Re: B-tree parent pointer and checkpoints

From Heikki Linnakangas
Subject Re: B-tree parent pointer and checkpoints
Date
Msg-id 4CD7FDBA.1020506@enterprisedb.com
In response to Re: B-tree parent pointer and checkpoints  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: B-tree parent pointer and checkpoints  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On 02.11.2010 16:40, Heikki Linnakangas wrote:
> On 02.11.2010 16:30, Tom Lane wrote:
>> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:
>>> I think we can fix this by requiring that any multi-WAL-record actions
>>> that are in-progress when a checkpoint starts (at the REDO-pointer) must
>>> finish before the checkpoint record is written.
>>
>> What happens if someone wants to start a new split while the checkpoint
>> is hanging fire?
>
> You mean after CreateCheckPoint has determined the redo pointer, but
> before it has written the checkpoint record? The new split can go ahead,
> and the checkpoint doesn't need to care about it. Recovery will start at
> the redo pointer, so it will see the split record, and will know to
> finish the incomplete split if necessary.
>
> The logic is the same as with inCommit. Checkpoint will fetch the list
> of in-progress splits some time after determining the redo-pointer. It
> will then wait until all of those splits have finished. Any new splits
> that begin after fetching the list don't affect the checkpoint.
>
> inCommit can't be used as is, because it's tied to the Xid, but
> something similar should work.

Here's a first draft of this, using the inCommit flag as is. It works,
but suffers from starvation if you have a lot of concurrent
multi-WAL-record actions. I tested that by running INSERTs from five
concurrent sessions into a table with a tsvector field that has a GiST
index on it, and saw checkpoints regularly busy-waiting for over a minute.
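
To illustrate where the starvation comes from, here is a toy model in
Python. It is not taken from the attached patch; the Backend class and the
helper names are hypothetical, and a single boolean per backend stands in
for the inCommit flag. The checkpoint remembers which backends had the flag
set and polls them, but a flag alone cannot tell an old action from a new
one:

```python
# Hypothetical model, not PostgreSQL code: one boolean flag per backend.
class Backend:
    def __init__(self):
        self.in_action = False  # stands in for the inCommit flag

    def begin_action(self):
        self.in_action = True

    def end_action(self):
        self.in_action = False


def snapshot_busy(backends):
    """Checkpoint side: remember which backends had the flag set."""
    return [b for b in backends if b.in_action]


def still_blocked(snapshot):
    """Poll the remembered backends; only the flag is visible."""
    return any(b.in_action for b in snapshot)


backend = Backend()
backend.begin_action()               # action A is running at the REDO pointer
waiting_on = snapshot_busy([backend])

backend.end_action()                 # action A finishes ...
backend.begin_action()               # ... and action B starts before the next poll

blocked = still_blocked(waiting_on)  # still True, although A is long done
```

If a backend starts actions back to back, every poll is likely to find the
flag set, so the checkpoint keeps waiting for an action that has already
finished.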

To avoid that, we need something a little bit more complicated than a
boolean flag. I'm thinking of adding a counter beside the inCommit flag
that's incremented every time a new multi-WAL-record action begins, so
that the checkpoint process can distinguish between a new action that
was started after deciding the REDO pointer and an old one that's still
running.
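
A rough sketch of the counter idea, again as a toy Python model rather than
anything from the patch. The names (Backend, action_count, snapshot_busy,
still_blocked) are made up for illustration, and the model ignores the
locking and memory-ordering questions a real shared-memory implementation
would have to answer:

```python
# Hypothetical model, not PostgreSQL code: a flag plus a per-backend counter.
class Backend:
    def __init__(self):
        self.in_action = False
        self.action_count = 0   # incremented each time a new action begins

    def begin_action(self):
        self.action_count += 1
        self.in_action = True

    def end_action(self):
        self.in_action = False


def snapshot_busy(backends):
    """Checkpoint side: record (backend, counter) for each busy backend."""
    return [(b, b.action_count) for b in backends if b.in_action]


def still_blocked(snapshot):
    """Wait only for actions that were already running at snapshot time."""
    return any(b.in_action and b.action_count == seen for b, seen in snapshot)


backend = Backend()
backend.begin_action()                      # action A runs at the REDO pointer
waiting_on = snapshot_busy([backend])
blocked_during_a = still_blocked(waiting_on)    # True: A is still running

backend.end_action()                        # action A finishes
backend.begin_action()                      # action B starts right away
blocked_during_b = still_blocked(waiting_on)    # False: counter has moved on
```

With the counter, a backend that immediately starts another action no longer
holds up the checkpoint, because the stored counter value identifies the old
action.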

(inCommit is a misnomer now, of course. Will need to find a better name.)

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Attachments
