Обсуждение: pgsql: Skip empty transaction stream in test_decoding.
Skip empty transaction stream in test_decoding. We were decoding empty transactions via streaming APIs added in commit 45fdc9738b even when the user used the option 'skip-empty-xacts'. The APIs makes no effort to skip empty xacts under the assumption that we will never try to stream such transactions. However, that is not true because we can pick to stream a transaction that has change messages for REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT and we don't send such messages to downstream rather they are just to update the internal state. So, we need to skip such xacts when plugin uses the option 'skip-empty-xacts'. Diagnosed-By: Amit Kapila Author: Dilip Kumar Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CAA4eK1+OqgFNZkf7=ETe_y5ntjgDk3T0wcdkd4Sot_u1hySGfw@mail.gmail.com Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/0ba5181c00eb0216bdfd9afbf3d680fee67d34b3 Modified Files -------------- contrib/test_decoding/Makefile | 2 +- .../test_decoding/expected/concurrent_stream.out | 19 ++++++++ contrib/test_decoding/expected/stream.out | 5 +- contrib/test_decoding/specs/concurrent_stream.spec | 37 +++++++++++++++ contrib/test_decoding/test_decoding.c | 55 +++++++++++++++------- 5 files changed, 95 insertions(+), 23 deletions(-)
Hi Amit, On Fri, Sep 11, 2020 at 04:43:59AM +0000, Amit Kapila wrote: > Skip empty transaction stream in test_decoding. > > We were decoding empty transactions via streaming APIs added in commit > 45fdc9738b even when the user used the option 'skip-empty-xacts'. The APIs > makes no effort to skip empty xacts under the assumption that we will > never try to stream such transactions. However, that is not true because > we can pick to stream a transaction that has change messages for > REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT and we don't send such messages to > downstream rather they are just to update the internal state. So, we need > to skip such xacts when plugin uses the option 'skip-empty-xacts'. Anole has reported the following failure in a test introduced by this commit: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2020-11-08%2017%3A49%3A13 Here is the diff: --- /home/pgbfarm/buildroot/HEAD/pgsql.build/contrib/test_decoding/expected/concurrent_stream.out 2020-09-11 04:59:46 -0600 +++ /home/pgbfarm/buildroot/HEAD/pgsql.build/contrib/test_decoding/output_iso/results/concurrent_stream.out 2020-11-08 12:31:10 -0700 @@ -13,7 +13,6 @@ opening a streamed block for transaction streaming change for transaction closing a streamed block for transaction -committing streamed transaction ?column? This smells like a race condition. -- Michael
Вложения
On Mon, Nov 9, 2020 at 7:11 AM Michael Paquier <michael@paquier.xyz> wrote: > > Hi Amit, > > On Fri, Sep 11, 2020 at 04:43:59AM +0000, Amit Kapila wrote: > > Skip empty transaction stream in test_decoding. > > > > We were decoding empty transactions via streaming APIs added in commit > > 45fdc9738b even when the user used the option 'skip-empty-xacts'. The APIs > > makes no effort to skip empty xacts under the assumption that we will > > never try to stream such transactions. However, that is not true because > > we can pick to stream a transaction that has change messages for > > REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT and we don't send such messages to > > downstream rather they are just to update the internal state. So, we need > > to skip such xacts when plugin uses the option 'skip-empty-xacts'. > > Anole has reported the following failure in a test introduced by this > commit: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2020-11-08%2017%3A49%3A13 > > Here is the diff: > --- > /home/pgbfarm/buildroot/HEAD/pgsql.build/contrib/test_decoding/expected/concurrent_stream.out > 2020-09-11 04:59:46 -0600 > +++ > /home/pgbfarm/buildroot/HEAD/pgsql.build/contrib/test_decoding/output_iso/results/concurrent_stream.out > 2020-11-08 12:31:10 -0700 > @@ -13,7 +13,6 @@ > opening a streamed block for transaction > streaming change for transaction > closing a streamed block for transaction > -committing streamed transaction > ?column? > > This smells like a race condition. > It is possible. I see that autovacuum has triggered (as per below logs) during this test which might have some impact on the output but I am not sure at this stage. I'll look into this. 2020-11-08 12:31:06.952 MST [7268:24] isolation/concurrent_stream DETAIL: parameters: $1 = '7270' 2020-11-08 12:31:06.930 MST [7270:7] isolation/concurrent_stream LOG: statement: COMMIT; 2020-11-08 12:31:07.109 MST [7276:2] LOG: automatic analyze of table "isolation_regression.pg_catalog.pg_attribute" system usage: CPU: user: 0.05 s, system: 0.01 s, elapsed: 0.47 s 2020-11-08 12:31:07.865 MST [7276:3] LOG: automatic vacuum of table "isolation_regression.pg_catalog.pg_class": index scans: 1 pages: 0 removed, 13 remain, 0 skipped due to pins, 0 skipped frozen tuples: 68 removed, 393 remain, 1 are dead but not yet removable, oldest xmin: 815 buffer usage: 89 hits, 2 misses, 10 dirtied avg read rate: 0.037 MB/s, avg write rate: 0.184 MB/s system usage: CPU: user: 0.00 s, system: 0.01 s, elapsed: 0.42 s WAL usage: 47 records, 9 full page images, 71981 bytes 2020-11-08 12:31:07.903 MST [7270:8] isolation/concurrent_stream LOG: statement: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL,NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1'); 2020-11-08 12:31:07.923 MST [7268:25] isolation/concurrent_stream LOG: execute isolationtester_waiting: SELECT pg_catalog.pg_isolation_test_session_is_blocked($1, '{7269,7270}') 2020-11-08 12:31:07.923 MST [7268:26] isolation/concurrent_stream DETAIL: parameters: $1 = '7270' -- With Regards, Amit Kapila.
On Mon, Nov 9, 2020 at 8:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Nov 9, 2020 at 7:11 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > /home/pgbfarm/buildroot/HEAD/pgsql.build/contrib/test_decoding/output_iso/results/concurrent_stream.out > > 2020-11-08 12:31:10 -0700 > > @@ -13,7 +13,6 @@ > > opening a streamed block for transaction > > streaming change for transaction > > closing a streamed block for transaction > > -committing streamed transaction > > ?column? > > > > This smells like a race condition. > > > > It is possible. I see that autovacuum has triggered (as per below > logs) during this test which might have some impact on the output but > I am not sure at this stage. I'll look into this. > I am able to reproduce this issue locally. This is an issue in test_decoding. The problem is that in a streaming transaction just before the last stream (which has no changes but a commit message) another concurrent empty transaction occurred (which in this case is caused by autovacuum). Now, in test_dcoding plugin, we track whether there is any change occured at the LogicalDecodingContext level (xact_wrote_changes) due to which the empty transaction in above context can toggle the flag and commit of streaming transaction will get confused. The first idea that occured to me to fix this issue is to keep track of whether we have decoded any changes at transaction level (ReorderBufferTxn) instead of keeping at LogicalDecodingContext level as streaming transactions can be interleaved with other transactions which was not the case previously. I'll think more on this and start a thread on pgsql-hackers. -- With Regards, Amit Kapila.
On Mon, Nov 9, 2020 at 10:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Nov 9, 2020 at 8:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Nov 9, 2020 at 7:11 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > > > /home/pgbfarm/buildroot/HEAD/pgsql.build/contrib/test_decoding/output_iso/results/concurrent_stream.out > > > 2020-11-08 12:31:10 -0700 > > > @@ -13,7 +13,6 @@ > > > opening a streamed block for transaction > > > streaming change for transaction > > > closing a streamed block for transaction > > > -committing streamed transaction > > > ?column? > > > > > > This smells like a race condition. > > > > > > > It is possible. I see that autovacuum has triggered (as per below > > logs) during this test which might have some impact on the output but > > I am not sure at this stage. I'll look into this. > > > > I am able to reproduce this issue locally. This is an issue in > test_decoding. The problem is that in a streaming transaction just > before the last stream (which has no changes but a commit message) > another concurrent empty transaction occurred (which in this case is > caused by autovacuum). Now, in test_dcoding plugin, we track whether > there is any change occured at the LogicalDecodingContext level > (xact_wrote_changes) due to which the empty transaction in above > context can toggle the flag and commit of streaming transaction will > get confused. The first idea that occured to me to fix this issue is > to keep track of whether we have decoded any changes at transaction > level (ReorderBufferTxn) instead of keeping at LogicalDecodingContext > level as streaming transactions can be interleaved with other > transactions which was not the case previously. I'll think more on > this and start a thread on pgsql-hackers. The analysis seems correct to me, basically with streaming the actual streaming messages and the commit can be interleaved so ideally, we should maintain xact_wrote_changes, within the transaction's context i.e. ReorderBufferTxn. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com