Re: POC: Cleaning up orphaned files using undo logs

Поиск
Список
Период
Сортировка
От Kuntal Ghosh
Тема Re: POC: Cleaning up orphaned files using undo logs
Дата
Msg-id CAGz5QC+wp836uMbyEMk51PFhpMHTnpe3aK+fPzRQfhPK5=FXig@mail.gmail.com
обсуждение исходный текст
Ответ на Re: POC: Cleaning up orphaned files using undo logs  (Thomas Munro <thomas.munro@gmail.com>)
Ответы Re: POC: Cleaning up orphaned files using undo logs  (Thomas Munro <thomas.munro@gmail.com>)
Список pgsql-hackers
Hello Thomas,

While testing zheap over undo apis, we've found the following
issues/scenarios that might need some fixes/discussions:

1. In UndoLogAllocateInRecovery, when we find the current log number
from the list of registered blocks, we don't check whether the
block->in_use flag is true or not. In XLogResetInsertion, we just
reset in_use flag without reseting the blocks[]->rnode information.
So, if we don't check the in_use flag, it's possible that we'll
consult some block information from the previous WAL record. IMHO,
just adding an in_use check in UndoLogAllocateInRecovery will solve
the problem.

2. A transaction, inserts one undo record and generated a WAL record
for the same, say at WAL location 0/2000A000. Next, the undo record
gets discarded and WAL is generated to update the meta.discard pointer
at location 0/2000B000  At the same time, an ongoing checkpoint with
checkpoint.redo at 0/20000000 flushes the latest meta.discard pointer.
Now, the system crashes.
Now, the recovery starts from the location 0/20000000. When the
recovery of 0/2000A000 happens, it sees the undo record that it's
about to insert, is already discarded as per meta.discard (flushed by
checkpoint). In this case, should we just skip inserting the undo
record?

3. Currently, we create a backup image of the unlogged part of the
undo log's metadata only when some backend allocates some space from
the undo log (in UndoLogAllocate). This helps us restore the unlogged
meta part after a checkpoint.
When we perform an undo action, we also update the undo action
progress and emit an WAL record. The same operation can performed by
the undo worker which doesn't allocate any space from the undo log.
So, if an undo worker emits an WAL record to update undo action
progress after a checkpoint, it'll not be able to WAL log the backup
image of the meta unlogged part. IMHO, this breaks the recovery logic
of unlogged part of undo meta.

Thoughts?

On Mon, Sep 2, 2019 at 9:47 AM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Fri, Aug 30, 2019 at 8:27 PM Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:
> > I'm getting the following assert failure while performing the recovery
> > with the same.
> > "TRAP: FailedAssertion("slot->meta.status == UNDO_LOG_STATUS_FULL",
> > File: "undolog.c", Line: 997)"
> >
> > I found that we don't emit an WAL record when we update the
> > slot->meta.status as UNDO_LOG_STATUS_FULL. If we don't that, after
> > crash recovery, some new transaction may use that undo log which is
> > wrong, IMHO. Am I missing something?
>
> Thanks, right, that status logging is wrong, will fix in next version.
>
> --
> Thomas Munro
> https://enterprisedb.com



-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: Primary keepalive message not appearing in Logical Streaming Replication
Следующее
От: Paul Draper
Дата:
Сообщение: Re: Implementing Incremental View Maintenance