Re: Minimal logical decoding on standbys

From Andres Freund
Subject Re: Minimal logical decoding on standbys
Date
Msg-id 20190402160414.h4wwhetcpggmy3tv@alap3.anarazel.de
In reply to Re: Minimal logical decoding on standbys  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses Re: Minimal logical decoding on standbys  (Amit Khandekar <amitdkhan.pg@gmail.com>)
List pgsql-hackers
Hi,

On 2019-04-02 15:26:52 +0530, Amit Khandekar wrote:
> On Thu, 14 Mar 2019 at 15:00, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> > I managed to get a recovery conflict by:
> > 1. Setting hot_standby_feedback to off
> > 2. Creating a logical replication slot on standby
> > 3. Creating a table on master, and inserting some data.
> > 4. Running: VACUUM FULL;
> >
> > This gives WARNING messages in the standby log file.
> > 2019-03-14 14:57:56.833 IST [40076] WARNING:  slot decoding_standby w/
> > catalog xmin 474 conflicts with removed xid 477
> > 2019-03-14 14:57:56.833 IST [40076] CONTEXT:  WAL redo at 0/3069E98
> > for Heap2/CLEAN: remxid 477
> >
> > But I did not add such a testcase into the test file, because with the
> > current patch, it does not do anything with the slot; it just keeps on
> > emitting WARNING in the log file; so we can't test this scenario as of
> > now using the tap test.
> 
> I am going ahead with drop-the-slot way of handling the recovery
> conflict. I am trying out using ReplicationSlotDropPtr() to drop the
> slot. It seems the required locks are already in place inside the for
> loop of ResolveRecoveryConflictWithSlots(), so we can directly call
> ReplicationSlotDropPtr() when the slot xmin conflict is found.

Cool.
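
The conflict reported in the WARNING above (catalog xmin 474 against removed xid 477) can be modelled with a small sketch. This is not the patch's code: it assumes a slot conflicts when its catalog xmin is at or before the removed xid, and it mirrors the usual wraparound-aware 32-bit xid comparison. The slot names other than decoding_standby and the helper names are hypothetical.

```python
# Simplified model of the slot-vs-removed-xid conflict check.
# Assumption: a slot conflicts when its catalog xmin is at or before the
# removed xid; the patch under discussion may differ in detail.

FIRST_NORMAL_XID = 3   # xids below this are reserved in PostgreSQL
INVALID_XID = 0        # "no catalog xmin set"


def xid_precedes(a: int, b: int) -> bool:
    """Wraparound-aware 'a is older than b' for 32-bit transaction ids."""
    if a < FIRST_NORMAL_XID or b < FIRST_NORMAL_XID:
        return a < b
    # Same result as (int32)(a - b) < 0 in C.
    return ((a - b) & 0xFFFFFFFF) >= 0x80000000


def xid_precedes_or_equals(a: int, b: int) -> bool:
    return a == b or xid_precedes(a, b)


def conflicting_slots(slots: dict, removed_xid: int) -> list:
    """Return the names of slots whose catalog xmin conflicts with removed_xid.

    slots maps slot name -> catalog xmin (0 means not set).
    """
    return [
        name
        for name, catalog_xmin in slots.items()
        if catalog_xmin != INVALID_XID
        and xid_precedes_or_equals(catalog_xmin, removed_xid)
    ]


# Values taken from the log excerpt; the extra slots are made up.
slots = {"decoding_standby": 474, "other_slot": 480, "physical_slot": 0}
print(conflicting_slots(slots, 477))  # ['decoding_standby']
```

In the drop-the-slot approach, each name returned here corresponds to a slot that replay would drop before continuing.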


> As explained above, the only way I could reproduce the conflict is by
> turning hot_standby_feedback off on slave, creating and inserting into
> a table on master and then running VACUUM FULL. But after doing this,
> I am not able to verify whether the slot is dropped, because on slave,
> any simple psql command thereafter waits on a lock acquired on sys
> catcache, e.g. pg_authid. Working on it.

I think that indicates a bug somewhere. If replay progressed, it should
have killed the slot, and continued replaying past the VACUUM
FULL. Those symptoms suggest replay is stuck somewhere. I suggest a)
compiling with WAL_DEBUG enabled, and turning on wal_debug=1, b) looking
at a backtrace of the startup process.

Greetings,

Andres Freund


