SIGQUIT and lost sequence WAL records

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема SIGQUIT and lost sequence WAL records
Дата
Msg-id CAB7nPqQ6+0LNkitj507eDBMh-OBdm_1_X-LmLOQfsjiWXe9SvA@mail.gmail.com
обсуждение исходный текст
Ответы Re: SIGQUIT and lost sequence WAL records
Список pgsql-bugs
Hi all,

Lately a couple of colleagues have pinged me regarding the fact that a
server signaled with SIGQUIT, or "pg_ctl stop -m immediate", may
ignore sequence WAL records. See for example the script attached
seqtest.py (simplified test case provided from my colleague Nikhil
Deshpande), that will fail while for example the following command is
run in parallel:
for i in {1..1000}; do pg_ctl restart -m immediate -w; sleep 1; done
The failure happening being for example:
AssertionError: DB returned reused sequence value prev=1118 curr=958
Here it means that the last return value of nextval() was 958, even if
the highest value ever fetched was 1118. Note that this test script
uses psycopg2, but that it does *not* commit transactions on purpose.
Also note that those WAL records do not disappear all the time.
Sometimes they are here and the sequence keeps increasing monotically
after server is restarted.

Trying to find out if there were records ignored I added some logs in
the server thanks to the patch attached. In the case of this last
failure there have been a couple of records after the last one found
at recovery:
WARNING:  Logged new sequence value at 0/1505570 for nextval = 957
WARNING:  Logged new sequence value at 0/15066D8 for nextval = 990
WARNING:  Logged new sequence value at 0/1507828 for nextval = 1023
WARNING:  Logged new sequence value at 0/1508990 for nextval = 1056
WARNING:  Logged new sequence value at 0/1509AE0 for nextval = 1089
WARNING:  Logged new sequence value at 0/150AC48 for nextval = 1122
And if I looked at the WAL records those were not of course not
present. That's not really surprising as recovery didn't use them to
restore the sequence up to where it should have.

func.sgml mentions that a nextval() operation is *never* rolled back
even if a transaction is aborted, though in this case this is exactly
what is happening, even if when the server is stopped the existing
transactions running are forcibly aborted.

I am lacking of fuel to look more at that today, but at this point I
wanted to get some input to see if I am missing something obvious. So,
code bug or documentation bug?
--
Michael

Вложения

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Teodor Sigaev
Дата:
Сообщение: Re: BUG #13440: unaccent does not remove all diacritics
Следующее
От: Tom Lane
Дата:
Сообщение: Re: SIGQUIT and lost sequence WAL records