Re: False "pg_serial": apparent wraparound” in logs

From: Imseih (AWS), Sami
Subject: Re: False "pg_serial": apparent wraparound” in logs
Msg-id: 5991E478-2DE4-4CE0-98D6-62F090F3505B@amazon.com
In response to: False "pg_serial": apparent wraparound” in logs  ("Imseih (AWS), Sami" <simseih@amazon.com>)
Responses: Re: False "pg_serial": apparent wraparound” in logs
List: pgsql-hackers

Hi,

 

I dug into this a bit. What appears to be happening is that the page
containing the latest cutoff xid can be falsely reported as being in the
future of the last page number, because the latest page number of the
Serial SLRU is only set when a page is initialized [1].

 

So under the right conditions, such as in the repro, the serializable
XID has moved past the last page number. At the next checkpoint, which
calls CheckPointPredicate, it therefore appears that the SLRU has
wrapped around.

 

It seems that what may be needed here is to advance latest_page_number
in SerialSetActiveSerXmin, but only if we are using the SLRU. See below:

 

 

diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 1af41213b4..6946ed21b4 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -992,6 +992,9 @@ SerialSetActiveSerXmin(TransactionId xid)
 
        serialControl->tailXid = xid;
 
+       if (serialControl->headPage > 0)
+               SerialSlruCtl->shared->latest_page_number = SerialPage(xid);
+
        LWLockRelease(SerialSLRULock);
 }
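
To make the failure mode and the effect of the change above easier to follow, here is a small toy model in C. It is not PostgreSQL code. Every name in it (toy_page_of, toy_page_precedes, toy_truncate_refused, toy_replay_scenario, TOY_XIDS_PER_PAGE) is hypothetical, and the page size is an arbitrary assumption. The sketch assumes only that truncation is refused whenever the recorded latest page logically precedes the cutoff page under a circular comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value: how many XIDs map to one page in this toy model. */
#define TOY_XIDS_PER_PAGE 1024u

/* Toy analogue of mapping an XID to the page that holds its entry. */
uint32_t toy_page_of(uint32_t xid)
{
    return xid / TOY_XIDS_PER_PAGE;
}

/* Circular comparison: true if page a is logically before page b. */
bool toy_page_precedes(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/*
 * Assumed shape of the truncation guard: if the recorded latest page
 * appears to be before the cutoff page, the cutoff looks like it is
 * "in the future", which is reported as apparent wraparound.
 */
bool toy_truncate_refused(uint32_t latest_page, uint32_t cutoff_page)
{
    return toy_page_precedes(latest_page, cutoff_page);
}

/* Replays the scenario described in the message inside the toy model. */
void toy_replay_scenario(void)
{
    uint32_t tail_xid = 5000;                 /* oldest active serializable XID */
    uint32_t cutoff = toy_page_of(tail_xid);  /* page 4 in this model */
    uint32_t stale_latest = 0;                /* last updated when a page was initialized */

    /* Stale latest page: the cutoff looks like the future, truncation is refused. */
    assert(toy_truncate_refused(stale_latest, cutoff));

    /* Latest page advanced together with the tail XID: truncation may proceed. */
    assert(!toy_truncate_refused(toy_page_of(tail_xid), cutoff));
}
```

Calling toy_replay_scenario() passes both assertions in this model: the stale page number trips the guard, and advancing it alongside the tail XID does not.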

 

[1] https://github.com/postgres/postgres/blob/master/src/backend/access/transam/slru.c#L306

 

Regards,

 

Sami

 

From: "Imseih (AWS), Sami" <simseih@amazon.com>
Date: Tuesday, August 22, 2023 at 7:56 PM
To: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Subject: False "pg_serial": apparent wraparound” in logs

 

Hi,

 

I recently encountered a situation in the field in which the message
“could not truncate directory "pg_serial": apparent wraparound”
was logged even though there was no danger of wraparound. This
was on a brand new cluster, and it took only a few minutes for
the message to appear in the logs.

 

Reading up on the history of this error message, it appears that
work was done to improve SLRU truncation and the associated wraparound
log messages [1]. The attached repro on master shows that this message
can still be logged incorrectly.

 

The repro runs updates with 90 threads in serializable mode and kicks

off a “long running” select on the same table in serializable mode.

 

As soon as the long running select commits, the next checkpoint fails

to truncate the SLRU and logs the error message.

 

Besides the confusing log message, there may also be a risk of
pg_serial becoming unnecessarily bloated and depleting disk space.

 

Is this a bug?

 

[1] https://www.postgresql.org/message-id/flat/20190202083822.GC32531%40gust.leadboat.com

 

Regards,

 

Sami Imseih

Amazon Web Services (AWS)

 

 

 

 

 
