Re: Minimal logical decoding on standbys
From | Amit Khandekar
---|---
Subject | Re: Minimal logical decoding on standbys
Date |
Msg-id | CAJ3gD9f+aFrq_fOuUqdtjapHZKP8V8feXZVE2u4PiEp=32BkEw@mail.gmail.com
In reply to | Re: Minimal logical decoding on standbys (Andres Freund <andres@anarazel.de>)
Responses | Re: Minimal logical decoding on standbys (Amit Khandekar <amitdkhan.pg@gmail.com>)
List | pgsql-hackers
On Sat, 6 Apr 2019 at 04:45, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> Thanks for the new version of the patch. Btw, could you add Craig as a
> co-author in the commit message of the next version of the patch? Don't
> want to forget him.

I had put his name in the earlier patch, but now I have made it easier
to spot.

> On 2019-04-05 17:08:39 +0530, Amit Khandekar wrote:
> > Regarding the test result failures, I could see that when we drop a
> > logical replication slot on the standby server, the catalog_xmin of
> > the physical replication slot becomes NULL, whereas the test expects
> > it to be equal to xmin; and that's the reason a couple of test
> > scenarios are failing:
> >
> > ok 33 - slot on standby dropped manually
> > Waiting for replication conn replica's replay_lsn to pass '0/31273E0' on master
> > done
> > not ok 34 - physical catalog_xmin still non-null
> > not ok 35 - xmin and catalog_xmin equal after slot drop
> > #   Failed test 'xmin and catalog_xmin equal after slot drop'
> > #   at t/016_logical_decoding_on_replica.pl line 272.
> > #          got:
> > #     expected: 2584
> >
> > I am not sure what is expected. What actually happens is: the
> > physical slot's catalog_xmin remains NULL initially, but becomes
> > non-NULL after the logical replication slot is created on the standby.
>
> That seems like the correct behaviour to me - why would we still have a
> catalog xmin if there's no logical slot?

Yeah. The earlier implementation may have behaved differently, which
may be why the catalog_xmin didn't become NULL there; I am not sure.
Anyway, I have changed this check. Details in the following sections.

> > diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
> > index 006446b..5785d2f 100644
> > --- a/src/backend/replication/slot.c
> > +++ b/src/backend/replication/slot.c
> > @@ -1064,6 +1064,85 @@ ReplicationSlotReserveWal(void)
> >      }
> >  }
> >
> > +void
> > +ResolveRecoveryConflictWithSlots(Oid dboid, TransactionId xid)
> > +{
> > +    int         i;
> > +    bool        found_conflict = false;
> > +
> > +    if (max_replication_slots <= 0)
> > +        return;
> > +
> > +restart:
> > +    if (found_conflict)
> > +    {
> > +        CHECK_FOR_INTERRUPTS();
> > +        /*
> > +         * Wait awhile for them to die so that we avoid flooding an
> > +         * unresponsive backend when system is heavily loaded.
> > +         */
> > +        pg_usleep(100000);
> > +        found_conflict = false;
> > +    }
> > +
> > +    LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
> > +    for (i = 0; i < max_replication_slots; i++)
> > +    {
> > +        ReplicationSlot *s;
> > +        NameData    slotname;
> > +        TransactionId slot_xmin;
> > +        TransactionId slot_catalog_xmin;
> > +
> > +        s = &ReplicationSlotCtl->replication_slots[i];
> > +
> > +        /* cannot change while ReplicationSlotCtlLock is held */
> > +        if (!s->in_use)
> > +            continue;
> > +
> > +        /* not our database, skip */
> > +        if (s->data.database != InvalidOid && s->data.database != dboid)
> > +            continue;
> > +
> > +        SpinLockAcquire(&s->mutex);
> > +        slotname = s->data.name;
> > +        slot_xmin = s->data.xmin;
> > +        slot_catalog_xmin = s->data.catalog_xmin;
> > +        SpinLockRelease(&s->mutex);
> > +
> > +        if (TransactionIdIsValid(slot_xmin) && TransactionIdPrecedesOrEquals(slot_xmin, xid))
> > +        {
> > +            found_conflict = true;
> > +
> > +            ereport(WARNING,
> > +                    (errmsg("slot %s w/ xmin %u conflicts with removed xid %u",
> > +                            NameStr(slotname), slot_xmin, xid)));
> > +        }
> > +
> > +        if (TransactionIdIsValid(slot_catalog_xmin) && TransactionIdPrecedesOrEquals(slot_catalog_xmin, xid))
> > +        {
> > +            found_conflict = true;
> > +
> > +            ereport(WARNING,
> > +                    (errmsg("slot %s w/ catalog xmin %u conflicts with removed xid %u",
> > +                            NameStr(slotname), slot_catalog_xmin, xid)));
> > +        }
> > +
> > +        if (found_conflict)
> > +        {
> > +            elog(WARNING, "Dropping conflicting slot %s", s->data.name.data);
> > +            LWLockRelease(ReplicationSlotControlLock);    /* avoid deadlock */
> > +            ReplicationSlotDropPtr(s);
> > +
> > +            /* We released the lock above; so re-scan the slots. */
> > +            goto restart;
> > +        }
> > +    }
>
> I think this should be refactored so that the two found_conflict cases
> set a 'reason' variable (perhaps an enum?) to the particular reason, and
> then only one warning should be emitted. I also think that LOG might be
> more appropriate than WARNING - as confusing as that is, LOG is more
> severe than WARNING (see docs about log_min_messages).

What I have in mind is:

ereport(LOG,
        (errcode(ERRCODE_INTERNAL_ERROR),
         errmsg("Dropping conflicting slot %s", s->data.name.data),
         errdetail("%s, removed xid %d.", conflict_str, xid)));

where conflict_str is a dynamically generated string containing
something like: "slot xmin: 1234, slot catalog_xmin: 5678"

So for the user, the errdetail will look like:
"slot xmin: 1234, catalog_xmin: 5678, removed xid: 9012"

I think the user can figure out whether it was xmin or catalog_xmin or
both that conflicted with the removed xid. If we don't do it this way,
we may not be able to show in a single message that both xmin and
catalog_xmin are conflicting at the same time. Does this message look
good to you, or did you have something quite different in mind?
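For concreteness, here is a minimal sketch of how conflict_str could be
built so that both conflicting horizons end up in one errdetail. This is
just an illustration, not the patch's actual code: the helper name
report_conflicting_slot() is made up, and it assumes the xmin values
were already copied out under the slot's mutex:

/*
 * Illustrative sketch only (hypothetical helper, not from the patch):
 * build a single LOG message describing every horizon of this slot that
 * conflicts with the removed xid, so xmin and catalog_xmin conflicts
 * are reported together.
 */
static void
report_conflicting_slot(ReplicationSlot *s, TransactionId xid,
                        TransactionId slot_xmin,
                        TransactionId slot_catalog_xmin)
{
    StringInfoData conflict_str;

    initStringInfo(&conflict_str);

    if (TransactionIdIsValid(slot_xmin) &&
        TransactionIdPrecedesOrEquals(slot_xmin, xid))
        appendStringInfo(&conflict_str, "slot xmin: %u", slot_xmin);

    if (TransactionIdIsValid(slot_catalog_xmin) &&
        TransactionIdPrecedesOrEquals(slot_catalog_xmin, xid))
        appendStringInfo(&conflict_str, "%sslot catalog_xmin: %u",
                         conflict_str.len > 0 ? ", " : "",
                         slot_catalog_xmin);

    ereport(LOG,
            (errcode(ERRCODE_INTERNAL_ERROR),
             errmsg("Dropping conflicting slot %s", NameStr(s->data.name)),
             errdetail("%s, removed xid %u.", conflict_str.data, xid)));

    pfree(conflict_str.data);
}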
> > @@ -0,0 +1,386 @@
> > +# Demonstrate that logical decoding can follow timeline switches.
> > +#
> > +# Test logical decoding on a standby.
> > +#
> > +use strict;
> > +use warnings;
> > +use 5.8.0;
> > +
> > +use PostgresNode;
> > +use TestLib;
> > +use Test::More tests => 55;
> > +use RecursiveCopy;
> > +use File::Copy;
> > +
> > +my ($stdin, $stdout, $stderr, $ret, $handle, $return);
> > +my $backup_name;
> > +
> > +# Initialize master node
> > +my $node_master = get_new_node('master');
> > +$node_master->init(allows_streaming => 1, has_archiving => 1);
> > +$node_master->append_conf('postgresql.conf', q{
> > +wal_level = 'logical'
> > +max_replication_slots = 4
> > +max_wal_senders = 4
> > +log_min_messages = 'debug2'
> > +log_error_verbosity = verbose
> > +# send status rapidly so we promptly advance xmin on master
> > +wal_receiver_status_interval = 1
> > +# very promptly terminate conflicting backends
> > +max_standby_streaming_delay = '2s'
> > +});
> > +$node_master->dump_info;
> > +$node_master->start;
> > +
> > +$node_master->psql('postgres', q[CREATE DATABASE testdb]);
> > +
> > +$node_master->safe_psql('testdb', q[SELECT * FROM pg_create_physical_replication_slot('decoding_standby');]);
> > +$backup_name = 'b1';
> > +my $backup_dir = $node_master->backup_dir . "/" . $backup_name;
> > +TestLib::system_or_bail('pg_basebackup', '-D', $backup_dir, '-d', $node_master->connstr('testdb'), '--slot=decoding_standby');
> > +
> > +sub print_phys_xmin
> > +{
> > +    my $slot = $node_master->slot('decoding_standby');
> > +    return ($slot->{'xmin'}, $slot->{'catalog_xmin'});
> > +}
> > +
> > +my ($xmin, $catalog_xmin) = print_phys_xmin();
> > +# After slot creation, xmins must be null
> > +is($xmin, '', "xmin null");
> > +is($catalog_xmin, '', "catalog_xmin null");
> > +
> > +my $node_replica = get_new_node('replica');
> > +$node_replica->init_from_backup(
> > +    $node_master, $backup_name,
> > +    has_streaming => 1,
> > +    has_restoring => 1);
> > +$node_replica->append_conf('postgresql.conf',
> > +    q[primary_slot_name = 'decoding_standby']);
> > +
> > +$node_replica->start;
> > +$node_master->wait_for_catchup($node_replica, 'replay', $node_master->lsn('flush'));
> > +
> > +# with hot_standby_feedback off, xmin and catalog_xmin must still be null
> > +($xmin, $catalog_xmin) = print_phys_xmin();
> > +is($xmin, '', "xmin null after replica join");
> > +is($catalog_xmin, '', "catalog_xmin null after replica join");
> > +
> > +$node_replica->append_conf('postgresql.conf', q[
> > +hot_standby_feedback = on
> > +]);
> > +$node_replica->restart;
> > +sleep(2); # ensure walreceiver feedback sent
>
> Can we make this more robust? E.g. by waiting till pg_stat_replication
> shows the change on the primary? Because I can guarantee that this'll
> fail on slow buildfarm machines (say the valgrind animals).
>
> > +$node_master->wait_for_catchup($node_replica, 'replay', $node_master->lsn('flush'));
> > +sleep(2); # ensure walreceiver feedback sent
>
> Similar.

Ok. I have put a copy of the get_slot_xmins() function from
t/001_stream_rep.pl into 016_logical_decoding_on_replica.pl, renamed it
to wait_for_phys_mins(), and used it to wait for the
hot_standby_feedback change to propagate to the master. This function
waits for the physical slot's xmin and catalog_xmin to reach the right
values, depending on whether there is a logical slot on the standby and
whether hot_standby_feedback is enabled there. I was not sure how
pg_stat_replication could be used to detect that the
hot_standby_feedback change had reached the master, so I did it this
way, which I think pretty much does what we want.
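To illustrate the approach, here is a rough sketch of such a polling
helper, modeled on get_slot_xmins() from t/001_stream_rep.pl. The exact
body and the check expression below are my illustration, not necessarily
what the patch does:

# Poll until the physical slot's xmin/catalog_xmin satisfy the given SQL
# boolean expression, instead of sleeping for a fixed time.
sub wait_for_phys_mins
{
    my ($node, $slotname, $check_expr) = @_;
    $node->poll_query_until(
        'postgres', qq[
        SELECT $check_expr
        FROM pg_catalog.pg_replication_slots
        WHERE slot_name = '$slotname';])
      or die "timed out waiting for slot xmins to reach expected values";
}

# For example, after enabling hot_standby_feedback on the standby, wait
# until feedback has set xmin while catalog_xmin is still null:
wait_for_phys_mins($node_master, 'decoding_standby',
    "xmin IS NOT NULL AND catalog_xmin IS NULL");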
The attached v4 patch contains only the testcase change, plus some minor
cleanup in the test file.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company