Re: [HACKERS] logical decoding of two-phase transactions

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: [HACKERS] logical decoding of two-phase transactions
Дата
Msg-id CAD21AoAWq1Si+fb3XAd7+==-ERGva861tX8+4LO9fP3wHEBCFg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] logical decoding of two-phase transactions  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: [HACKERS] logical decoding of two-phase transactions  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Mon, Dec 14, 2020 at 6:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Dec 8, 2020 at 2:01 PM Ajin Cherian <itsajin@gmail.com> wrote:
> >
> > On Tue, Dec 1, 2020 at 6:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > To skip it, we need to send GID in begin message and then on
> > > subscriber-side, check if the prepared xact already exists, if so then
> > > set a flag. The flag needs to be set in begin/start_stream and reset
> > > in stop_stream/commit/abort. Using the flag, we can skip the entire
> > > contents of the prepared xact. In ReorderFuffer-side also, we need to
> > > get and set GID in txn even when we skip it because we need to send
> > > the same at commit time. In this solution, we won't be able to send it
> > > during normal start_stream because by that time we won't know GID and
> > > I think that won't be required. Note that this is only required when
> > > we skipped sending prepare, otherwise, we just need to send
> > > Commit-Prepared at commit time.
> >
> > I have implemented these changes and tested the fix using the test
> > setup I had shared above and it seems to be working fine.
> > I have also tested restarts that simulate duplicate prepares being
> > sent by the publisher and verified that it is handled correctly by the
> > subscriber.
> >
>
> This implementation has a flaw in that it has used commit_lsn for the
> prepare when we are sending prepare just before commit prepared. We
> can't send the commit LSN with prepare because if the subscriber
> crashes after prepare then it would update
> replorigin_session_origin_lsn with that commit_lsn. Now, after the
> restart, because we will use that LSN to start decoding, the Commit
> Prepared will get skipped. To fix this, we need to remember the
> prepare LSN and other information even when we skip prepare and then
> use it while sending the prepare during commit prepared.
>
> Now, after fixing this, I discovered another issue which is that we
> allow adding a new snapshot to prepared transactions via
> SnapBuildDistributeNewCatalogSnapshot. We can only allow it to get
> added to in-progress transactions. If you comment out the changes
> added in SnapBuildDistributeNewCatalogSnapshot then you will notice
> one test failure which indicates this problem. This problem was not
> evident before the bug-fix in the previous paragraph because you were
> using commit-lsn even for the prepare and newly added snapshot change
> appears to be before the end_lsn.
>
> Some other assorted changes in various patches:
> v31-0001-Extend-the-output-plugin-API-to-allow-decoding-o
> 1. I have changed the filter_prepare API to match the signature with
> FilterByOrigin. I don't see the need for ReorderBufferTxn or xid in
> the API.
> 2. I have expanded the documentation of 'Begin Prepare Callback' to
> explain how a user can use it to detect already prepared transactions
> and in which scenarios that can happen.
> 3. Added a few comments in the code atop begin_prepare_cb_wrapper to
> explain why we are adding this new API.
> 4. Move the check whether the filter_prepare callback is defined from
> filter_prepare_cb_wrapper to caller. This is similar to how
> FilterByOrigin works.
> 5. Fixed various whitespace and cosmetic issues.
> 6. Update commit message to include two of the newly added APIs
>
> v31-0002-Allow-decoding-at-prepare-time-in-ReorderBuffer
> 1. Changed the variable names and comments in DecodeXactOp.
> 2. A new API for FilterPrepare similar to FilterByOrigin and use that
> instead of ReorderBufferPrepareNeedSkip.
> 3. In DecodeCommit, we need to update the reorderbuffer about the
> surviving subtransactions for both ReorderBufferFinishPrepared and
> ReorderBufferCommit because now both can process the transaction.
> 4. Because, now we need to remember the prepare info even when we skip
> it, I have simplified ReorderBufferPrepare API by removing the extra
> parameters as that information will be now available via
> ReorderBufferTxn.
> 5. Updated comments at various places.
>
> v31-0006-Support-2PC-txn-pgoutput
> 1. Added Asserts in streaming APIs on the subscriber-side to ensure
> that we should never reach there for the already prepared transaction
> case. We never need to stream the changes when we are skipping prepare
> either because the snapshot was not consistent by that time or we have
> already sent those changes before restart. Added the same Assert in
> Begin and Commit routines because while skipping prepared txn, we must
> not receive the changes of any other xact.
> 2.
> + /*
> + * Flags are determined from the state of the transaction. We know we
> + * always get PREPARE first and then [COMMIT|ROLLBACK] PREPARED, so if
> + * it's already marked as committed then it has to be COMMIT PREPARED (and
> + * likewise for abort / ROLLBACK PREPARED).
> + */
> + if (rbtxn_commit_prepared(txn))
> + flags = LOGICALREP_IS_COMMIT_PREPARED;
> + else if (rbtxn_rollback_prepared(txn))
> + flags = LOGICALREP_IS_ROLLBACK_PREPARED;
> + else
> + flags = LOGICALREP_IS_PREPARE;
>
> I don't like clubbing three different operations under one message
> LOGICAL_REP_MSG_PREPARE. It looks awkward to use new flags
> RBTXN_COMMIT_PREPARED and RBTXN_ROLLBACK_PREPARED in ReordeBuffer so
> that we can recognize these operations in corresponding callbacks. I
> think setting any flag in ReorderBuffer should not dictate the
> behavior in callbacks. Then also there are few things that are not
> common to those APIs like the patch has an Assert to say that the txn
> is marked with prepare flag for all three operations which I think is
> not true for Rollback Prepared after the restart. We don't ensure to
> set the Prepare flag if the Rollback Prepare happens after the
> restart. Then, we have to introduce separate flags to distinguish
> prepare/commit prepared/rollback prepared to distinguish multiple
> operations sent as protocol messages. Also, all these operations are
> mutually exclusive so it will be better to send separate messages for
> each of these and I have changed it accordingly in the attached patch.
>
> 3. The patch has a duplicate code to send replication origins. I have
> moved the common code to a separate function.
>
> v31-0009-Support-2PC-txn-Subscription-option
> 1.
> --- a/src/include/catalog/catversion.h
> +++ b/src/include/catalog/catversion.h
> @@ -53,6 +53,6 @@
>   */
>
>  /* yyyymmddN */
> -#define CATALOG_VERSION_NO 202011251
> +#define CATALOG_VERSION_NO 202011271
>
> No need to change catversion as this gets changed frequently and that
> leads to conflict in the patch. We can change it either in the final
> version or normally committers take care of this. If you want to
> remember it, maybe adding a line for it in the commit message should
> be okay. For now, I have removed this from the patch.
>

Thank you for updating the patch. I have two questions:

-----
@@ -239,6 +239,19 @@ CREATE SUBSCRIPTION <replaceable
class="parameter">subscription_name</replaceabl
          </para>
         </listitem>
        </varlistentry>
+       <varlistentry>
+        <term><literal>two_phase</literal> (<type>boolean</type>)</term>
+        <listitem>
+         <para>
+          Specifies whether two-phase commit is enabled for this subscription.
+          The default is <literal>false</literal>.
+          When two-phase commit is enabled then the decoded
transactions are sent
+          to the subscriber on the PREPARE TRANSACTION. When
two-phase commit is not
+          enabled then PREPARE TRANSACTION and COMMIT/ROLLBACK PREPARED are not
+          decoded on the publisher.
+         </para>
+        </listitem>
+       </varlistentry>

The user will need to specify the 'two_phase’ option on CREATE
SUBSCRIPTION. It would mean the user will need to control what data is
streamed both on publication side for INSERT/UPDATE/DELETE/TRUNCATE
and on subscriber side for PREPARE. Looking at the implementation of
the ’two_phase’ option of CREATE SUBSCRIPTION, it ultimately just
passes the ‘two_phase' option to the publisher. Why don’t we set it on
the publisher side? Also, I guess we can improve the description of
’two_phase’ option of CREATE SUBSCRIPTION in the doc by adding the
fact that when this option is not enabled the transaction prepared on
the publisher is decoded as a normal transaction:

------
+   if (LookupGXact(begin_data.gid))
+   {
+       /*
+        * If this gid has already been prepared then we dont want to apply
+        * this txn again. This can happen after restart where upstream can
+        * send the prepared transaction again. See
+        * ReorderBufferFinishPrepared. Don't update remote_final_lsn.
+        */
+       skip_prepared_txn = true;
+       return;
+   }

When PREPARE arrives at the subscriber node but there is the prepared
transaction with the same transaction identifier, the apply worker
skips the whole transaction. So if the users prepared a transaction
with the same identifier on the subscriber, the prepared transaction
that came from the publisher would be ignored without any messages. On
the other hand, if applying other operations such as HEAP_INSERT
conflicts (such as when violating the unique constraint) the apply
worker raises an ERROR and stops logical replication until the
conflict is resolved. IIUC since we can know that the prepared
transaction came from the same publisher again by checking origin_lsn
in TwoPhaseFileHeader I guess we can skip the PREPARE message only
when the existing prepared transaction has the same LSN and the same
identifier. To be exact, it’s still possible that the subscriber gets
two PREPARE messages having the same LSN and name from two different
publishers but it’s unlikely happen in practice.

Regards,

--
Masahiko Sawada
EnterpriseDB:  https://www.enterprisedb.com/



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module
Следующее
От: Fujii Masao
Дата:
Сообщение: Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module