Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop
От | Peter Smith |
---|---|
Тема | Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop |
Дата | |
Msg-id | CAHut+Ps2W7KP46zFQ9ybEckJoRKpT6zh6aPj31L2RV9qBcoLRg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop (Amit Kapila <amit.kapila16@gmail.com>) |
Ответы |
Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop
(Amit Kapila <amit.kapila16@gmail.com>)
|
Список | pgsql-bugs |
On Mon, Nov 23, 2020 at 8:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > Peter, can you also please once test the attached and see if this > fixes the problem for you as well? I have reviewed the v3 patch code, and repeated the tablesync testing (for a stream-enabled SUBSCRIPTION) using the debugging technique previously described [1]. https://www.postgresql.org/message-id/CAHut%2BPt4PyKQCwqzQ%3DEFF%3DbpKKJD7XKt_S23F6L20ayQNxg77A%40mail.gmail.com Now the test was successful. The stream start/stop/commit/abort are all being handled by the tablesync worker without errors. === TESTING I performed the tablesync stream test by multiple different ways of inserting the en masse data: 1. Cause streaming (using a PREPARE TRANSACTION) ------------------------------------------------ psql -d test_pub -c "BEGIN;INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5000) s(i);PREPARE TRANSACTION 'test_prepared_tab';" psql -d test_pub -c "COMMIT PREPARED 'test_prepared_tab';" tablesync does 'S' (stream start) tablesync does 'R' tablesync does 'I' (insert) x n tablesync does 'E' (stream end) tablesync does 'S' tablesync does 'I' x n tablesync does 'E' tablesync does { 'S', 'I' x n, 'E' } repeats 13 more times tablesync does 'c' (stream commit) - In handle_stream_commit the tablesync worker processes the spool file, calling apply_dispatch for all the messages - 'R' - { 'I' + should_apply_changes_for_rel = true } x 5000 - then calls process_syncing_tables_for_sync, which sets the state to SUBREL_STATE_SYNCDONE - then tablesync worker process exits Now the "apply" worker catches up. - it runs all the same messages again but does nothing because should_apply_changes_for_rel = false 2. Cause streaming (not using a PREPARE TRANSACTION) ---------------------------------------------------- psql -d test_pub -c "INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5000) s(i);" gives very similar results to above 3. Causing a stream abort to happen ----------------------------------- psql -d test_pub -c "BEGIN; INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5000) s(i); INSERT INTO test_tab VALUES(4999, 'already exists'); END;" tablesync does 'S' (stream start) tablesync does 'E' (stream end) tablesync does 'A' (stream abort) tablesync does 'B' tablesync does 'C' - calls process_syncing_tables - tablesync worker process exits apply worker continues... === REVIEW COMMENTS Despite the tests working OK I thought there should be a couple small changes made for this patch, as follows. (1) process_sync_tables should be last IMO the process_syncing_tables should be always the *last* thing any apply handler does, because it seems a bad idea for the worker to change state (e.g. to say it is SUBREL_STATE_SYNCDONE) if in fact there is still a little bit more processing remaining. This is why I made the same modification in the other WIP patch v26-0006 [2] [2] https://www.postgresql.org/message-id/CAHut%2BPuQcLbjrBXM54%2By%2B%3DYsmEDFd3CCtp31K-_jS4Xka2vwbQ%40mail.gmail.com So, I think the process_syncing_tables() call should be put *after* the stream_cleanup_files() in function apply_handle_stream_commit(). But to do this means the process_syncing_tables has to be removed from your new apply_handle_commit_internal function. (2) misleading comment /* * Start a transaction on stream start, this transaction will be committed * on the stream stop. We need the transaction for handling the buffile, * used for serializing the streaming data and subxact info. */ That above comment (in the apply_handle_stream_start function) may now be inaccurate/misleading for the case of tablesync worker, because I think for tablesync you are explicitly avoiding the tx commit within the apply_handle_stream_stop(). --- Kind Regards, Peter Smith. Fujitsu Australia
В списке pgsql-bugs по дате отправления:
Предыдущее
От: Manoj KumarДата:
Сообщение: Re: BUG #16739: Temporary files not deleting from data folder on disk
Следующее
От: David TuroňДата:
Сообщение: Re: BUG #16743: psql doesn't show whole expression in stored column