Re: Minimal logical decoding on standbys

From Amit Khandekar
Subject Re: Minimal logical decoding on standbys
Date
Msg-id CAJ3gD9fj2CZOmGzZBtF8vcz0uLwaFZ9WXEyRPkmjjd0goojMxQ@mail.gmail.com
In response to Re: Minimal logical decoding on standbys  (Andres Freund <andres@anarazel.de>)
Responses Re: Minimal logical decoding on standbys  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wed, 10 Apr 2019 at 21:39, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2019-04-10 12:11:21 +0530, tushar wrote:
> >
> > On 03/13/2019 08:40 PM, tushar wrote:
> > > Hi ,
> > >
> > > I am getting a server crash on standby while executing
> > > pg_logical_slot_get_changes function   , please refer this scenario
> > >
> > > Master cluster( ./initdb -D master)
> > > set wal_level='hot_standby' in master/postgresql.conf file
> > > start the server , connect to  psql terminal and create a physical
> > > replication slot ( SELECT * from
> > > pg_create_physical_replication_slot('p1');)
> > >
> > > perform pg_basebackup using --slot 'p1'  (./pg_basebackup -D slave/ -R
> > > --slot p1 -v))
> > > set wal_level='logical' , hot_standby_feedback=on,
> > > primary_slot_name='p1' in slave/postgresql.conf file
> > > start the server , connect to psql terminal and create a logical
> > > replication slot (  SELECT * from
> > > pg_create_logical_replication_slot('t','test_decoding');)
> > >
> > > run pgbench ( ./pgbench -i -s 10 postgres) on master and select
> > > pg_logical_slot_get_changes on Slave database
> > >
> > > postgres=# select * from pg_logical_slot_get_changes('t',null,null);
> > > 2019-03-13 20:34:50.274 IST [26817] LOG:  starting logical decoding for
> > > slot "t"
> > > 2019-03-13 20:34:50.274 IST [26817] DETAIL:  Streaming transactions
> > > committing after 0/6C000060, reading WAL from 0/6C000028.
> > > 2019-03-13 20:34:50.274 IST [26817] STATEMENT:  select * from
> > > pg_logical_slot_get_changes('t',null,null);
> > > 2019-03-13 20:34:50.275 IST [26817] LOG:  logical decoding found
> > > consistent point at 0/6C000028
> > > 2019-03-13 20:34:50.275 IST [26817] DETAIL:  There are no running
> > > transactions.
> > > 2019-03-13 20:34:50.275 IST [26817] STATEMENT:  select * from
> > > pg_logical_slot_get_changes('t',null,null);
> > > TRAP: FailedAssertion("!(data == tupledata + tuplelen)", File:
> > > "decode.c", Line: 977)
> > > server closed the connection unexpectedly
> > >     This probably means the server terminated abnormally
> > >     before or while processing the request.
> > > The connection to the server was lost. Attempting reset: 2019-03-13
> > > 20:34:50.276 IST [26809] LOG:  server process (PID 26817) was terminated
> > > by signal 6: Aborted
> > >
> > Andres - Do you think - this is an issue which needs to  be fixed ?
>
> Yes, it definitely needs to be fixed. I just haven't had sufficient time
> to look into it. Have you reproduced this with Amit's latest version?
>
> Amit, have you spent any time looking into it? I know that you're not
> that deeply steeped into the internals of logical decoding, but perhaps
> there's something obvious going on.

I tried to see if I could quickly understand what's going on.

Here, the master's wal_level is hot_standby, not logical, though the
slave's wal_level is logical.

On the slave, when pg_logical_slot_get_changes() is run,
DecodeMultiInsert() does not see any WAL records that have
XLH_INSERT_CONTAINS_NEW_TUPLE set. So the data pointer is never
incremented; it remains at tupledata. At the end of the function,
this assertion therefore fails:
Assert(data == tupledata + tuplelen);
because data is still at tupledata.
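
To make the pointer arithmetic concrete, here is a minimal standalone
sketch of that walk. It is not the decode.c source: the Sketch* names,
the one-field tuple header and the flag value are assumptions made up
for illustration, and it assumes the block data is non-empty even when
the flag is unset. It only shows that when the flag is never set, data
stays at tupledata and the final equality check cannot hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative bit standing in for XLH_INSERT_CONTAINS_NEW_TUPLE. */
#define SKETCH_CONTAINS_NEW_TUPLE (1 << 3)

/* Hypothetical per-tuple header: only the payload length. */
typedef struct SketchTupleHeader
{
    uint16_t    datalen;
} SketchTupleHeader;

/* Lay out ntuples (header + payload) back to back; return bytes used. */
static size_t
sketch_build_block(char *buf, size_t bufsize, const uint16_t *lens, int ntuples)
{
    size_t      off = 0;

    for (int i = 0; i < ntuples; i++)
    {
        SketchTupleHeader hdr = {lens[i]};

        assert(off + sizeof(hdr) + lens[i] <= bufsize);
        memcpy(buf + off, &hdr, sizeof(hdr));
        memset(buf + off + sizeof(hdr), 'x', lens[i]);
        off += sizeof(hdr) + lens[i];
    }
    return off;
}

/*
 * Walk the block as described above: the data pointer advances only
 * when the flag is set. Returns true when the end-of-function check
 * (data == tupledata + tuplelen) would hold.
 */
static bool
sketch_walk_multi_insert(const char *tupledata, size_t tuplelen,
                         int ntuples, uint8_t flags)
{
    const char *data = tupledata;

    assert(tupledata != NULL);
    for (int i = 0; i < ntuples; i++)
    {
        if (flags & SKETCH_CONTAINS_NEW_TUPLE)
        {
            SketchTupleHeader hdr;

            memcpy(&hdr, data, sizeof(hdr));
            data += sizeof(hdr) + hdr.datalen;
        }
    }
    return data == tupledata + tuplelen;
}
```

With a block built from two tuples, the walk returns true when
SKETCH_CONTAINS_NEW_TUPLE is passed and false when flags is 0, which
corresponds to the failing Assert in the report.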

Not sure why this is happening. On the slave, wal_level is logical, so
the WAL records should contain tuple data. Not sure what that has to
do with the wal_level of the master. Everything should be there on the
slave after it replays the inserts, and the slave's wal_level is
logical as well.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
