Re: Bug #613: Sequence values fall back to previously checkpointed

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Bug #613: Sequence values fall back to previously checkpointed
Дата
Msg-id 8690.1016033495@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Bug #613: Sequence values fall back to previously checkpointed  (Bruce Momjian <pgman@candle.pha.pa.us>)
Ответы Re: Bug #613: Sequence values fall back to previously checkpointed  (Ben Grimm <bgrimm@zaeon.com>)
Список pgsql-bugs
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I can confirm repeatable case!

Ah-hah, now I understand the problem.  I think sequences are suffering
from premature optimization.

The crux of the matter: sequences try to avoid generating a WAL record
for every single nextval().  The idea is to generate a WAL record
every SEQ_LOG_VALS (32) nextval operations.  If you crash and replay
the WAL, then when you see the WAL record you don't know exactly how
many nextval operations got done, so for safety you set the sequence
value to where-it-was plus 32.  This may leave a gap in the sequence
number assignment, but it's no worse than aborting a few transactions.

The problem in the scenario Bruce exhibits is that the CHECKPOINT
forces out both the latest sequence WAL record and the current state
of the sequence relation itself.  The subsequent nextval()'s advance
the sequence relation in-memory but generate no disk writes and no
WAL records.  On restart, you lose: the sequence relation is back
to where it was checkpointed, and the latest WAL record for the
sequence is before the checkpoint *so it won't get rescanned*.
Thus, the sequence doesn't get pushed forward like it's supposed to.

AFAICS the only way that we could make the one-WAL-record-every-32-
nextvals idea really work would be if CHECKPOINT could nullify the
logged-in-advance state of each sequence (so that the first nextval
after a checkpoint would always generate a fresh WAL record, but
subsequent ones wouldn't have to).  But I don't see any practical
way for CHECKPOINT to do that, especially not for sequences whose
disk block isn't even in memory at the instant of the CHECKPOINT.

Accordingly, I'm thinking that we must remove the SEQ_LOG_VALS
functionality and force one-WAL-record-per-nextval operation.

Vadim, do you see another way?  This was a cool idea and I hate
to throw it away...

            regards, tom lane

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Laurent FAILLIE
Дата:
Сообщение: Re: Bug #608: cache lookup failed
Следующее
От: Stephan Szabo
Дата:
Сообщение: Re: referential constraint bug