RE: Perform streaming logical transactions by background workers and parallel apply
От | wangw.fnst@fujitsu.com |
---|---|
Тема | RE: Perform streaming logical transactions by background workers and parallel apply |
Дата | |
Msg-id | OS3PR01MB6275FBD9359F8ED0EDE7E5459EDE9@OS3PR01MB6275.jpnprd01.prod.outlook.com обсуждение исходный текст |
Ответ на | Re: Perform streaming logical transactions by background workers and parallel apply (Amit Kapila <amit.kapila16@gmail.com>) |
Список | pgsql-hackers |
On Mon, May 30, 2022 7:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > Few comments/suggestions for 0001 and 0003 > ===================================== > 0001 > -------- Thanks for your comments. > 1. > + else > + snprintf(bgw.bgw_name, BGW_MAXLEN, > + "logical replication apply worker for subscription %u", subid); > > Can we slightly change the message to: "logical replication background > apply worker for subscription %u"? Improve the message as suggested. > 2. Can we think of separating the new logic for applying the xact by > bgworker into a new file like applybgwroker or applyparallel? We have > previously done the same in the case of vacuum (see vacuumparallel.c). Improve the patch as suggested. I separated the new logic related to apply background worker to new file src/backend/replication/logical/applybgwroker.c. > 3. > + /* > + * XXX The publisher side doesn't always send relation update messages > + * after the streaming transaction, so update the relation in main > + * apply worker here. > + */ > + if (action == LOGICAL_REP_MSG_RELATION) > + { > + LogicalRepRelation *rel = logicalrep_read_rel(s); > + logicalrep_relmap_update(rel); > + } > > I think the publisher side won't send the relation update message > after streaming transaction only if it has already been sent for a > non-streaming transaction in which case we don't need to update the > local cache here. This is as per my understanding of > maybe_send_schema(), do let me know if I am missing something? If my > understanding is correct then we don't need this change. I think we need this change because the publisher will invoke function cleanup_rel_sync_cache when committing a streaming transaction, then it will set "schema_sent" to true for related entry. Later, publisher may not send this schema in function maybe_send_schema because we already sent this schema (schema_sent = true). If we do not have this change, It would cause an error in the following case: Suppose that after walsender worker starts, first we commit a streaming transaction. Walsender sends relation update message, and only apply background worker can update relation map cache by this message. After this, if we commit a non-streamed transaction that contains same replicated table, walsender will not send relation update message, so main apply worker would not get relation update message. I think we need this change to update relation map cache not only in apply background worker but also in apply main worker. In addition, we should also handle the LOGICAL_REP_MSG_TYPE message just like LOGICAL_REP_MSG_RELATION. So improve the code you mentioned. BTW, I simplify the function handle_streamed_transaction(). > 4. > + * For the main apply worker, if in streaming mode (receiving a block of > + * streamed transaction), we send the data to the apply background worker. > * > - * If in streaming mode (receiving a block of streamed transaction), we > - * simply redirect it to a file for the proper toplevel transaction. > > This comment is slightly confusing. Can we change it to something > like: "In streaming case (receiving a block of streamed transaction), > for SUBSTREAM_ON mode, we simply redirect it to a file for the proper > toplevel transaction, and for SUBSTREAM_APPLY mode, we send the > changes to background apply worker."? Improve the comments as suggested. > 5. > +apply_handle_stream_abort(StringInfo s) > { > ... > ... > + /* > + * If the two XIDs are the same, it's in fact abort of toplevel xact, > + * so just free the subxactlist. > + */ > + if (subxid == xid) > + { > + set_apply_error_context_xact(subxid, InvalidXLogRecPtr); > > - fd = BufFileOpenFileSet(MyLogicalRepWorker->stream_fileset, path, > O_RDONLY, > - false); > + AbortCurrentTransaction(); > > - buffer = palloc(BLCKSZ); > + EndTransactionBlock(false); > + CommitTransactionCommand(); > + > + in_remote_transaction = false; > ... > ... > } > > Here, can we update the replication origin as we are doing in > apply_handle_rollback_prepared? Currently, we don't do it because we > are just cleaning up temporary files for which we don't even have a > transaction. Also, we don't have the required infrastructure to > advance origins for aborts as we have for abort prepared. See commits > [1eb6d6527a][8a812e5106]. If we think it is a good idea then I think > we need to send abort_lsn and abort_time from the publisher and we > need to be careful to make it work with lower subscriber versions that > don't have the facility to process these additional values. I think it is a good idea. I will consider this and add this part in next version. > 0003 > -------- > 6. > + /* > + * If any unique index exist, check that they are same as remoterel. > + */ > + if (!rel->sameunique) > + ereport(ERROR, > + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > + errmsg("cannot replicate relation with different unique index"), > + errhint("Please change the streaming option to 'on' instead of 'apply'."))); > > I think we can do better here. Instead of simply erroring out and > asking the user to change streaming mode, we can remember this in the > system catalog probably in pg_subscription, and then on restart, we > can change the streaming mode to 'on', perform the transaction, and > again change the streaming mode to apply. I am not sure whether we > want to do it in the first version or not, so if you agree with this, > developing it as a separate patch would be a good idea. > > Also, please update comments here as to why we don't handle such cases. Yes, I think it is a good idea. I will develop it as a separate patch later. And I added the comments atop function apply_bgworker_relation_check as below: ``` * Although we maintains the commit order by allowing only one process to * commit at a time, our access order to the relation has changed. * This could cause unexpected problems if the unique column on the replicated * table is inconsistent with the publisher-side or contains non-immutable * functions when applying transactions in the apply background worker. ``` I also made some other changes. The new patches and the modification details were attached in [1]. [1] - https://www.postgresql.org/message-id/OS3PR01MB62758A881FF3240171B7B21B9EDE9%40OS3PR01MB6275.jpnprd01.prod.outlook.com Regards, Wang wei
В списке pgsql-hackers по дате отправления:
Предыдущее
От: "wangw.fnst@fujitsu.com"Дата:
Сообщение: RE: Perform streaming logical transactions by background workers and parallel apply
Следующее
От: "wangw.fnst@fujitsu.com"Дата:
Сообщение: RE: Perform streaming logical transactions by background workers and parallel apply