Re: logical replication empty transactions
От | Peter Smith |
---|---|
Тема | Re: logical replication empty transactions |
Дата | |
Msg-id | CAHut+PvVC8=qhxuhk3AWsK6Oo_hgLfvBiPQXJw=04acEcKFVFg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: logical replication empty transactions (Ajin Cherian <itsajin@gmail.com>) |
Список | pgsql-hackers |
On Fri, Mar 4, 2022 at 12:41 PM Ajin Cherian <itsajin@gmail.com> wrote: > > I have split the patch into two. I have kept the logic of skipping > streaming changes in the second patch. > I will work on the second patch once we can figure out a solution for > the COMMIT PREPARED after restart problem. > Please see below my review comments for the first patch only (v23-0001) ====== 1. Patch failed to apply cleanly - whitespace warnings. git apply ../patches_misc/v23-0001-Skip-empty-transactions-for-logical-replication.patch ../patches_misc/v23-0001-Skip-empty-transactions-for-logical-replication.patch:68: trailing whitespace. * change in a transaction is processed. This makes it possible warning: 1 line adds whitespace errors. ~~~ 2. src/backend/replication/pgoutput/pgoutput.c - typedef struct PGOutputTxnData +/* + * Maintain a per-transaction level variable to track whether the + * transaction has sent BEGIN. BEGIN is only sent when the first + * change in a transaction is processed. This makes it possible + * to skip transactions that are empty. + */ +typedef struct PGOutputTxnData I felt that this comment is describing details all about its bool member but I think maybe it should be describing something also about the structure itself (because this is the structure comment). E.g. it should say about it only being allocated by the pgoutput_begin_txn() and it is accessible via txn->output_plugin_private. Maybe also say this has subtle implications if this is NULL then it means the tx can't be 2PC etc... ~~~ 3. src/backend/replication/pgoutput/pgoutput.c - pgoutput_send_begin +/* + * Send BEGIN. + * + * This is where the BEGIN is actually sent. This is called while processing + * the first change of the transaction. + */ +static void +pgoutput_send_begin(LogicalDecodingContext *ctx, ReorderBufferTXN *txn) IMO there is no need to repeat "This is where the BEGIN is actually sent.", because "Send BEGIN." already said the same thing :-) ~~~ 4. src/backend/replication/pgoutput/pgoutput.c - pgoutput_commit_txn + /* + * If a BEGIN message was not yet sent, then it means there were no relevant + * changes encountered, so we can skip the COMMIT message too. + */ + sent_begin_txn = txndata->sent_begin_txn; + txn->output_plugin_private = NULL; + OutputPluginUpdateProgress(ctx, !sent_begin_txn); + + pfree(txndata); Not quite sure why this pfree is positioned where it is (after that function call). I felt this should be a couple of lines up so txndata is freed as soon as you had no more use for it (i.e. after you copied the bool from it) ~~~ 5. src/backend/replication/pgoutput/pgoutput.c - maybe_send_schema @@ -594,6 +658,13 @@ maybe_send_schema(LogicalDecodingContext *ctx, if (schema_sent) return; + /* set up txndata */ + txndata = toptxn->output_plugin_private; The comment does quite feel right. Nothing is "setting up" anything. Really, all this does is assign a reference to the tx private data. Probably better with no comment at all? ~~~ 6. src/backend/replication/pgoutput/pgoutput.c - maybe_send_schema I observed that every call to the maybe_send_schema function also has adjacent code that already/always is checking to call pgoutput_send_begin_tx function. So then I am wondering is the added logic to the maybe_send_schema even needed at all? It looks a bit redundant. Thoughts? ~~~ 7. src/backend/replication/pgoutput/pgoutput.c - pgoutput_change @@ -1141,6 +1212,7 @@ pgoutput_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn, Relation relation, ReorderBufferChange *change) { PGOutputData *data = (PGOutputData *) ctx->output_plugin_private; + PGOutputTxnData *txndata = (PGOutputTxnData *) txn->output_plugin_private; MemoryContext old; Maybe if is worth deferring this assignment until after the row-filter check. Otherwise, you are maybe doing it for nothing and IIRC this is hot code so the less you do here the better. OTOH a single assignment probably amounts to almost nothing. ~~~ 8. src/backend/replication/pgoutput/pgoutput.c - pgoutput_change @@ -1354,6 +1438,7 @@ pgoutput_truncate(LogicalDecodingContext *ctx, ReorderBufferTXN *txn, int nrelations, Relation relations[], ReorderBufferChange *change) { PGOutputData *data = (PGOutputData *) ctx->output_plugin_private; + PGOutputTxnData *txndata; MemoryContext old; This variable declaration should be done later in the block where it is assigned. ~~~ 9. src/backend/replication/pgoutput/pgoutput.c - suggestion I notice there is quite a few places in the patch that look like: + txndata = (PGOutputTxnData *) txn->output_plugin_private; + + /* Send BEGIN if we haven't yet */ + if (txndata && !txndata->sent_begin_txn) + pgoutput_send_begin(ctx, txn); + It might be worth considering encapsulating all those in a helper function like: pgoutput_maybe_send_begin(ctx, txn) It would certainly be a lot tidier. ~~~ 10. src/backend/replication/syncrep.c - SyncRepEnabled @@ -539,6 +538,15 @@ SyncRepReleaseWaiters(void) } /* + * Check if synchronous replication is enabled + */ +bool +SyncRepEnabled(void) Missing period for that function comment. ------ Kind Regards, Peter Smith. Fujitsu Australia
В списке pgsql-hackers по дате отправления:
Следующее
От: Kyotaro HoriguchiДата:
Сообщение: Re: pg_tablespace_location() failure with allow_in_place_tablespaces