Re: pgsql: Improve runtime and output of tests for replication slots checkp
From | Alexander Korotkov |
---|---|
Subject | Re: pgsql: Improve runtime and output of tests for replication slots checkp |
Date | |
Msg-id | CAPpHfdurV-j_e0pb=UFENAy3tyzxfF+yHveNDNQk2gM82WBU5A@mail.gmail.com |
In response to | pgsql: Improve runtime and output of tests for replication slots checkp (Alexander Korotkov <akorotkov@postgresql.org>) |
Responses | Re: pgsql: Improve runtime and output of tests for replication slots checkp |
List | pgsql-committers |
On Sat, Jun 21, 2025 at 2:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > And I see the following variable values.
>
> > (lldb) p/x targetPagePtr
> > (XLogRecPtr) 0x0000000029004000
> > (lldb) p/x RecPtr
> > (XLogRecPtr) 0x0000000029002138
>
> > I hardly understand how is this possible given it was compiled with "-O0".
> > I'm planning to continue investigating this tomorrow.
>
> Yeah, I see
>
> (lldb) p/x targetPagePtr
> (XLogRecPtr) 0x0000000029004000
> (lldb) p/x RecPtr
> (XLogRecPtr) 0x0000000029002138
> (lldb) p/x RecPtr - (RecPtr % 8192)
> (XLogRecPtr) 0x0000000029002000
>
> We're here:
>
>     /* Calculate pointer to beginning of next page */
>     targetPagePtr += XLOG_BLCKSZ;
>
>     /* Wait for the next page to become available */
>     readOff = ReadPageInternal(state, targetPagePtr,
>                                Min(total_len - gotlen + SizeOfXLogShortPHD,
>                                    XLOG_BLCKSZ));
>
> so that's where the increment of targetPagePtr came from.
> But "Wait for the next page to become available" seems awfully
> trusting that there will be another page. Should this be
> using the no-wait code path?

Thank you for the help. It seems to me that the problem is deeper. The code seems to be only trying to read to the end of the given WAL record, but can't reach it. According to the values I've seen in XLogCtl, RedoRecPtr seems to point somewhere inside that record's body. I'm not confident that I understand what's going on or how to fix it. I've tried two things.

1) slot_tests_wait_for_checkpoint.patch
Make the tests wait for checkpoint completion (as I think they were originally intended to). However, the problem still persists.

2) revert_slot_last_saved_restart_lsn.patch
Revert ca307d5cec90 and make the new tests reserve WAL using the wal_keep_size GUC. The problem still persists.

It seems to be a problem independent of my attempts to fix the retention of WAL files with the slot's restart_lsn. The new tests just exposed an existing bug.
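For reference, the lldb values quoted above are consistent with the page arithmetic in the quoted snippet: rounding RecPtr down to its page boundary and adding one block gives exactly the observed targetPagePtr. The following is only a minimal sketch of that arithmetic. The typedef and the 8192-byte block size are stand-ins assumed from the debugger session, and the helper names are hypothetical, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for illustration only: XLogRecPtr is modeled as a 64-bit
 * integer, and the block size is assumed to be 8192 as in the lldb session. */
typedef uint64_t XLogRecPtr;
#define XLOG_BLCKSZ 8192

/* Hypothetical helper: start of the WAL page that contains ptr. */
static inline XLogRecPtr
page_start(XLogRecPtr ptr)
{
    return ptr - (ptr % XLOG_BLCKSZ);
}

/* Hypothetical helper: start of the page after the one containing ptr,
 * i.e. the effect of one "targetPagePtr += XLOG_BLCKSZ" step. */
static inline XLogRecPtr
next_page_start(XLogRecPtr ptr)
{
    return page_start(ptr) + XLOG_BLCKSZ;
}

/* Re-checks the values printed by lldb in the quoted message. */
static inline void
check_session_values(void)
{
    XLogRecPtr RecPtr = UINT64_C(0x0000000029002138);

    assert(page_start(RecPtr) == UINT64_C(0x0000000029002000));
    assert(next_page_start(RecPtr) == UINT64_C(0x0000000029004000));
}
```

So the observed targetPagePtr is what a single advance past the record's first page would produce. This says nothing about whether that next page actually exists in the WAL.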
------
Regards,
Alexander Korotkov
Supabase