Re: [BUG] Archive recovery failure on 9.3+.

From	Tomonari Katsumata
Subject	Re: [BUG] Archive recovery failure on 9.3+.
Date	9 January 2014, 18:13:21
Msg-id	CAC55fYf+=zf+xpgJKvFSCH9YaxJpTuaQLCOpAB9cqti-zx3zCg@mail.gmail.com
In reply to	[BUG] Archive recovery failure on 9.3+. (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List	pgsql-hackers

Hi,

Is anybody reading this thread?

This problem still seems to be present on REL9_3_STABLE.

Many users are likely to hit this problem, so we should
resolve it in the next release.

I think his patch is a reasonable fix for this problem.

Please take another look at it.

regards,
--------------------------

Tomonari Katsumata

2013/12/12 Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>

Hello, we happened to see a server crash during archive recovery
under certain conditions.

After the TLI has been incremented, it can happen that the WAL
file for the older timeline has been archived, but the file with
the same segment id for the newer timeline has not. Archive
recovery then fails with a PANIC error like the following:

| PANIC: record with zero length at 0/1820D40
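
To make "same segment id, different timeline" concrete, here is a minimal sketch of how a WAL segment file name is built from a timeline ID and a segment number. It assumes the default 16 MB segment size and, purely for illustration, that the older and newer timelines are 1 and 2; the thread does not state the actual TLIs. The helper names (`lsn_to_segno`, `wal_file_name`, `same_segment_other_timeline`) are hypothetical and are not PostgreSQL functions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: default 16 MB WAL segments. */
#define WAL_SEG_SIZE    UINT64_C(0x1000000)
/* Segments per 4 GB "log id" (the middle field of the file name). */
#define SEGS_PER_LOGID  (UINT64_C(0x100000000) / WAL_SEG_SIZE)
#define WAL_FNAME_LEN   24

/* Segment number containing an LSN written as "hi/lo", e.g. 0/1820D40. */
static uint64_t
lsn_to_segno(uint32_t hi, uint32_t lo)
{
    uint64_t lsn = ((uint64_t) hi << 32) | lo;

    return lsn / WAL_SEG_SIZE;
}

/* Build the 24-hex-digit name: timeline, log id, segment within the log. */
static void
wal_file_name(char *buf, size_t buflen, uint32_t tli, uint64_t segno)
{
    assert(buflen > WAL_FNAME_LEN);
    snprintf(buf, buflen, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
             tli,
             (uint32_t) (segno / SEGS_PER_LOGID),
             (uint32_t) (segno % SEGS_PER_LOGID));
}

/* True when two names cover the same segment on different timelines. */
static int
same_segment_other_timeline(const char *a, const char *b)
{
    return strcmp(a + 8, b + 8) == 0 && strncmp(a, b, 8) != 0;
}
```

Under these assumptions, the LSN 0/1820D40 from the PANIC message falls in segment 1, so the two competing files would be 000000010000000000000001 (older timeline) and 000000020000000000000001 (newer timeline): identical except for the leading timeline field.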

A script that reproduces the issue is attached. The issue occurs on
9.4dev and 9.3.2, but not on 9.2.6 or 9.1.11. The latter versions
search pg_xlog for a TLI before trying the archive for older TLIs.

This occurs while fetching the checkpoint redo record during
archive recovery.

> if (checkPoint.redo < RecPtr)
> {
>     /* back up to find the record */
>     record = ReadRecord(xlogreader, checkPoint.redo, PANIC, false);

The cause is that the segment file for the older timeline in the
archive directory is preferred over the one for the newer timeline
in pg_xlog.

Looking into pg_xlog before trying the older TLIs in the archive,
as 9.2 and earlier do, fixes this issue. The attached patch is one
possible solution for 9.4dev.
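
The difference between the two lookup orders can be sketched with a toy model. This is not the attached patch and not PostgreSQL code: `WalDir`, `dir_has`, `find_archive_first`, and `find_per_timeline` are hypothetical names, and each location is modelled as a plain list of file names. The candidate names are passed newest timeline first, all for the same segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a WAL location: just a list of file names. */
typedef struct
{
    const char *const *files;
    size_t      nfiles;
} WalDir;

static bool
dir_has(const WalDir *dir, const char *fname)
{
    assert(dir != NULL && fname != NULL);
    for (size_t i = 0; i < dir->nfiles; i++)
        if (strcmp(dir->files[i], fname) == 0)
            return true;
    return false;
}

/*
 * Order described for 9.3+: try the archive for every timeline first,
 * and only then look in pg_xlog.
 */
static const char *
find_archive_first(const WalDir *archive, const WalDir *pg_xlog,
                   const char *const *names, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (dir_has(archive, names[i]))
            return names[i];
    for (size_t i = 0; i < n; i++)
        if (dir_has(pg_xlog, names[i]))
            return names[i];
    return NULL;
}

/*
 * Order described for 9.2 and earlier: for each timeline, check both
 * the archive and pg_xlog before falling back to an older timeline.
 */
static const char *
find_per_timeline(const WalDir *archive, const WalDir *pg_xlog,
                  const char *const *names, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (dir_has(archive, names[i]) || dir_has(pg_xlog, names[i]))
            return names[i];
    return NULL;
}
```

With only the older timeline's segment in the archive and the newer timeline's segment in pg_xlog, `find_archive_first` returns the older-timeline file, while `find_per_timeline` returns the newer-timeline file, which is the behavior the report asks to restore.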

The attached files are:

- recvtest.sh: The reproduction script. Steps 1 and 2 set up the
condition and step 3 triggers the issue.

- archrecvfix_20131212.patch: The patch that fixes the issue. With
it, archive recovery reads pg_xlog before trying older TLIs in the
archive, as 9.1 and earlier do.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
