Re: [HACKERS] logical decoding of two-phase transactions

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: [HACKERS] logical decoding of two-phase transactions
Дата
Msg-id 20180726200241.aje4dv4jsv25v4k2@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: [HACKERS] logical decoding of two-phase transactions  (Nikhil Sontakke <nikhils@2ndquadrant.com>)
Ответы Re: [HACKERS] logical decoding of two-phase transactions  (Nikhil Sontakke <nikhils@2ndquadrant.com>)
Список pgsql-hackers
On 2018-07-26 20:24:00 +0530, Nikhil Sontakke wrote:
> Hi,
> 
> >> I think we even can just do something like a global
> >> TransactionId check_if_transaction_is_alive = InvalidTransactionId;
> >> and just set it up during decoding. And then just check it whenever it's
> >> not set tot InvalidTransactionId.
> >>
> >>
> >
> > Ok. I will work on something along these lines and re-submit the set of patches.

> PFA, latest patchset, which completely removes the earlier
> LogicalLock/LogicalUnLock implementation using groupDecode stuff and
> uses the newly suggested approach of checking the currently decoded
> XID for abort in systable_* API functions. Much simpler to code and
> easier to test as well.

So, leaving the fact that it might not actually be correct aside ;), you
seem to be ok with the approach?


> Out of the patchset, the specific patch which focuses on the above
> systable_* API based XID checking implementation is part of
> 0003-Gracefully-handle-concurrent-aborts-of-uncommitted-t.patch. So,
> it might help to take a look at this patch first for any additional
> feedback on this approach.

K.


> There's an additional test case in
> 0005-Additional-test-case-to-demonstrate-decoding-rollbac.patch which
> uses a sleep in the "change" plugin API to allow a concurrent rollback
> on the 2PC being currently decoded. Andres generally doesn't like this
> approach :-), but there are no timing/interlocking issues now, and the
> sleep just helps us do a concurrent rollback, so it might be ok now,
> all things considered. Anyways, it's an additional patch for now.

Yea, I still don't think it's ok. The tests won't be reliable.  There's
ways to make this reliable, e.g. by forcing a lock to be acquired that's
externally held or such. Might even be doable just with a weird custom
datatype.


> From 75edeb440794fff7de48082dafdecb065940bee5 Mon Sep 17 00:00:00 2001
> From: Nikhil Sontakke <nikhils@2ndQuadrant.com>
> Date: Thu, 26 Jul 2018 18:45:26 +0530
> Subject: [PATCH 3/5] Gracefully handle concurrent aborts of uncommitted
>  transactions that are being decoded alongside.
> 
> When a transaction aborts, it's changes are considered unnecessary for
> other transactions. That means the changes may be either cleaned up by
> vacuum or removed from HOT chains (thus made inaccessible through
> indexes), and there may be other such consequences.
> 
> When decoding committed transactions this is not an issue, and we
> never decode transactions that abort before the decoding starts.
> 
> But for in-progress transactions - for example when decoding prepared
> transactions on PREPARE (and not COMMIT PREPARED as before), this
> may cause failures when the output plugin consults catalogs (both
> system and user-defined).
> 
> We handle such failures by returning ERRCODE_TRANSACTION_ROLLBACK
> sqlerrcode from system table scan APIs to the backend decoding a
> specific uncommitted transaction. The decoding logic on the receipt
> of such an sqlerrcode aborts the ongoing decoding and returns
> gracefully.
> ---
>  src/backend/access/index/genam.c                | 31 +++++++++++++++++++++++++
>  src/backend/replication/logical/reorderbuffer.c | 30 ++++++++++++++++++++----
>  src/backend/utils/time/snapmgr.c                | 25 ++++++++++++++++++--
>  src/include/utils/snapmgr.h                     |  4 +++-
>  4 files changed, 82 insertions(+), 8 deletions(-)
> 
> diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
> index 9d08775687..67c5810bf7 100644
> --- a/src/backend/access/index/genam.c
> +++ b/src/backend/access/index/genam.c
> @@ -423,6 +423,16 @@ systable_getnext(SysScanDesc sysscan)
>      else
>          htup = heap_getnext(sysscan->scan, ForwardScanDirection);
>  
> +    /*
> +     * If CheckXidAlive is valid, then we check if it aborted. If it did, we
> +     * error out
> +     */
> +    if (TransactionIdIsValid(CheckXidAlive) &&
> +            TransactionIdDidAbort(CheckXidAlive))
> +            ereport(ERROR,
> +                (errcode(ERRCODE_TRANSACTION_ROLLBACK),
> +                 errmsg("transaction aborted during system catalog scan")));
> +
>      return htup;
>  }

Don't we have to check TransactionIdIsInProgress() first? C.f. header
comments in tqual.c.  Note this is also not guaranteed to be correct
after a crash (where no clog entry will exist for an aborted xact), but
we probably shouldn't get here in that case - but better be safe.

I suspect it'd be better reformulated as
  TransactionIdIsValid(CheckXidAlive) &&
  !TransactionIdIsInProgress(CheckXidAlive) &&
  !TransactionIdDidCommit(CheckXidAlive)

What do you think?


I think it'd also be good to add assertions to codepaths not going
through systable_* asserting that
!TransactionIdIsValid(CheckXidAlive). Alternatively we could add an
if (unlikely(TransactionIdIsValid(CheckXidAlive)) && ...)
branch to those too.



> From 80fc576bda483798919653991bef6dc198625d90 Mon Sep 17 00:00:00 2001
> From: Nikhil Sontakke <nikhils@2ndQuadrant.com>
> Date: Wed, 13 Jun 2018 16:31:15 +0530
> Subject: [PATCH 4/5] Teach test_decoding plugin to work with 2PC
> 
> Includes a new option "enable_twophase". Depending on this options
> value, PREPARE TRANSACTION will either be decoded or treated as
> a single phase commit later.

FWIW, I don't think I'm ok with doing this on a per-plugin-option basis.
I think this is something that should be known to the outside of the
plugin. More similar to how binary / non-binary support works.  Should
also be able to inquire the output plugin whether it's supported (cf
previous similarity).


> From 682b0de2827d1f55c4e471c3129eb687ae0825a5 Mon Sep 17 00:00:00 2001
> From: Nikhil Sontakke <nikhils@2ndQuadrant.com>
> Date: Wed, 13 Jun 2018 16:32:16 +0530
> Subject: [PATCH 5/5] Additional test case to demonstrate decoding/rollback
>  interlocking
> 
> Introduce a decode-delay parameter in the test_decoding plugin. Based
> on the value provided in the plugin, sleep for those many seconds while
> inside the "decode change" plugin call. A concurrent rollback is fired
> off which aborts that transaction in the meanwhile. A subsequent
> systable access will error out causing the logical decoding to abort.

Yea, I'm *definitely* still not on board with this. This'll just lead to
a fragile or extremely slow test.

Greetings,

Andres Freund


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: Locking B-tree leafs immediately in exclusive mode
Следующее
От: Dave Cramer
Дата:
Сообщение: why doesn't pg_create_logical_replication_slot throw an error if theencoder doesn't exist