Re: POC: Cleaning up orphaned files using undo logs

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: POC: Cleaning up orphaned files using undo logs
Дата
Msg-id CAA4eK1L_7TcMUx4SY144J_kEwzn+6-5BR_6t4-jUKeM+b++PLQ@mail.gmail.com
обсуждение исходный текст
Ответ на POC: Cleaning up orphaned files using undo logs  (Thomas Munro <thomas.munro@enterprisedb.com>)
Ответы Re: POC: Cleaning up orphaned files using undo logs  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
Thomas told me offlist that this email of mine didn't hit
pgsql-hackers, so trying it again by resending.

On Mon, Apr 29, 2019 at 3:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Apr 19, 2019 at 3:46 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Mar 12, 2019 at 6:51 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > >
> >
> > Currently, undo branch[1] contain an older version of the (undo
> > interface + some fixup).  Now, I have merged the latest changes from
> > the zheap branch[2] to the undo branch[1]
> > which can be applied on top of the undo storage commit[3].  For
> > merging those changes, I need to add some changes to the undo log
> > storage patch as well for handling the multi log transaction.  So I
> > have attached two patches, 1) improvement to undo log storage 2)
> > complete undo interface patch which include 0006+0007 from undo
> > branch[1] + new changes on the zheap branch.
> >
> > [1] https://github.com/EnterpriseDB/zheap/tree/undo
> > [2] https://github.com/EnterpriseDB/zheap
> > [3] https://github.com/EnterpriseDB/zheap/tree/undo
> > (b397d96176879ed5b09cf7322b8d6f2edd8043a5)
> >
> > []
>
> Dilip has posted the patch for "undo record interface", next in series
> is a patch that handles transaction rollbacks (the machinery to
> perform undo actions) and background workers to manage undo.
>
> Transaction Rollbacks
> ----------------------------------
> We always perform rollback actions after cleaning up the current
> (sub)transaction.  This will ensure that we perform the actions
> immediately after an error (and release the locks) rather than when
> the user issues Rollback command at some later point of time.  We are
> releasing the locks after the undo actions are applied.  The reason to
> delay lock release is that if we release locks before applying undo
> actions, then the parallel session can acquire the lock before us
> which can lead to deadlock.
>
> We promote the error to FATAL error if it occurred while applying undo
> for a subtransaction.  The reason we can't proceed without applying
> subtransaction's undo is that the modifications made in that case must
> not be visible even if the main transaction commits.  Normally, the
> backends that receive the request to perform Rollback (To Savepoint)
> applies the undo actions, but there are cases where it is preferable
> to push the requests to background workers.  The main reasons to push
> the requests to background workers are (a) The request for a very
> large rollback, this will allow us to return control to users quickly.
> There is a guc rollback_overflow_size which indicates that rollbacks
> greater than the configured size are performed lazily by background
> workers. (b) While applying the undo actions, if there is an error, we
> push such a request to background workers.
>
> Undo Requests and Undo workers
> --------------------------------------------------
> To improve the efficiency of the rollbacks, we create three queues and
> a hash table for the rollback requests.  A Xid based priority queue
> which will allow us to process the requests of older transactions and
> help us to move oldesdXidHavingUndo (this is a xid-horizon below which
> all the transactions are visible) forward.  A size-based queue which
> will help us to perform the rollbacks of larger aborts in a timely
> fashion so that we don't get stuck while processing them during
> discard of the logs.  An error queue to hold the requests for
> transactions that failed to apply its undo.  The rollback hash table
> is used to avoid duplicate undo requests by backends and discard
> worker.
>
> Undo launcher is responsible for launching the workers iff there is
> some work available in one of the work queues and there are more
> workers available.  The worker is launched to handle requests for a
> particular database.  Each undo worker then start reading from one of
> the queues the requests for that particular database.  A worker would
> peek into each queue for the requests from a particular database if it
> needs to switch a database in less than undo_worker_quantum ms (10s as
> default) after starting. Also, if there is no work, it lingers for
> UNDO_WORKER_LINGER_MS (10s as default).  This avoids restarting the
> workers too frequently.
>
> The discard worker is responsible for discarding the undo log of
> transactions that are committed and all-visible or are rolled-back.
> It also registers the request for aborted transactions in the work
> queues.  It iterates through all the active logs one-by-one and tries
> to discard the transactions that are old enough to matter.
>
> The details of how all of this works are described in
> src/backend/access/undo/README.UndoProcessing.  The main idea to keep
> a readme is to allow reviewers to understand this patch, later we can
> decide parts of it to move to comments in code and others to main
> README of undo.
>
> Question:  Currently, TwoPhaseFileHeader stores just TransactionId, so
> for the systems (like zheap) that support FullTransactionId, the
> two-phase transaction will be tricky to support as we need
> FullTransactionId during rollbacks.  Is it a good idea to store
> FullTransactionId in TwoPhaseFileHeader?
>
> Credits:
> --------------
> Designed by: Andres Freund, Amit Kapila, Robert Haas, and Thomas Munro
> Author: Amit Kapila, Dilip Kumar, Kuntal Ghosh, and Thomas Munro
>
> This patch is based on the latest Dilip's patch for undo record
> interface.  The branch can be accessed at
> https://github.com/EnterpriseDB/zheap/tree/undoprocessing
>
> Inputs on design/code are welcome.
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Unhappy about API changes in the no-fsm-for-small-rels patch
Следующее
От: Tom Lane
Дата:
Сообщение: Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6