Re: [HACKERS] Logical Replication WIP

From Petr Jelinek
Subject Re: [HACKERS] Logical Replication WIP
Msg-id ab78e303-777d-61c8-0b43-160f2bdaf853@2ndquadrant.com
In response to Re: [HACKERS] Logical Replication WIP  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses Re: [HACKERS] Logical Replication WIP  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
List pgsql-hackers
On 20/01/17 17:23, Petr Jelinek wrote:
> On 20/01/17 15:08, Peter Eisentraut wrote:
>> On 1/19/17 5:01 PM, Petr Jelinek wrote:
>>> There were some conflicting changes committed today so I rebased the
>>> patch on top of them.
>>>
>>> Other than that nothing much has changed, I removed the separate sync
>>> commit patch, included the rename patch in the patchset and fixed the
>>> bug around pg_subscription catalog reported by Erik Rijkers.
>>
>> Committed.  I haven't reviewed the rename patch yet, so I'll get back to
>> that later.
>>
> 
> Hi,
> 
> Thanks!
> 
> Here is fix for the dependency mess.
> 

Álvaro pointed out off-list a couple of issues with how we handle
interruption of commands that connect to the walsender.

a) libpqwalreceiver.c does a blocking connect, so it's impossible to
cancel a CREATE SUBSCRIPTION which is stuck on connect. This is, btw, a
preexisting problem and applies to walreceiver as well. I rewrote the
connect function to use the asynchronous API (patch 0001).

b) We can cancel in the middle of the command (when stuck in
libpqrcv_PQexec), but the connection to the walsender stays open, which,
in case we are waiting for a snapshot, can mean that it will stay idle in
transaction. I added a PG_TRY wrapper around this which disconnects on
error (patch 0002).
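
The two fixes above can be illustrated with a small self-contained model.
This is only a sketch, not the code from patches 0001 or 0002: FakeConn,
connect_cancellable(), run_command_with_cleanup() and the other names are
hypothetical stand-ins. In real code, libpq's nonblocking connection
functions (PQconnectStart and PQconnectPoll) would take the place of the
polling loop, and PG_TRY/PG_CATCH would take the place of the plain
setjmp/longjmp used here.

```c
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a connection to walsender; not a libpq type. */
typedef struct FakeConn
{
    int  steps_left;    /* non-blocking polls needed to finish connecting */
    bool open;
} FakeConn;

typedef enum { CONNECT_OK, CONNECT_CANCELLED } ConnectResult;

/* Set asynchronously (for example by a signal handler) on user cancel. */
static volatile sig_atomic_t cancel_requested = 0;

/* Innermost active error handler, loosely modelled on PG_TRY. */
static jmp_buf *error_context = NULL;

/* (a) Connect in small non-blocking steps, checking for cancel in between. */
ConnectResult connect_cancellable(FakeConn *conn)
{
    assert(conn != NULL);
    conn->open = true;
    while (conn->steps_left > 0)
    {
        if (cancel_requested)
        {
            conn->open = false;     /* give up and close the connection */
            return CONNECT_CANCELLED;
        }
        conn->steps_left--;         /* real code would wait on the socket */
    }
    return CONNECT_OK;
}

/* Jump to the innermost handler, as an error or interrupt would. */
void raise_error(void)
{
    assert(error_context != NULL);
    longjmp(*error_context, 1);
}

/* Hypothetical command that may be interrupted while waiting. */
void run_command(FakeConn *conn, bool interrupted)
{
    assert(conn->open);
    if (interrupted)
        raise_error();
}

/* (b) Run the command; on error, disconnect instead of leaking the link. */
bool run_command_with_cleanup(FakeConn *conn, bool interrupted)
{
    jmp_buf  local;
    jmp_buf *saved = error_context;
    bool     ok;

    if (setjmp(local) == 0)
    {
        error_context = &local;
        run_command(conn, interrupted);
        ok = true;
    }
    else
    {
        conn->open = false;         /* the cleanup step */
        ok = false;                 /* real code would re-throw the error */
    }
    error_context = saved;
    return ok;
}
```

With a blocking connect there is no point at which the cancel flag could be
checked, which is why the loop has to be driven in steps. The cleanup
wrapper returns false only to keep the sketch short; a PG_CATCH block would
normally finish with PG_RE_THROW().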

And finally, while testing these two I found a bug in walsender StringInfo
initialization (or lack thereof). There are 3 static StringInfo buffers
that are initialized in WalSndLoop. The problem is that in some rare
scenarios they can be used from CreateReplicationSlot (and IMHO
StartLogicalReplication) before WalSndLoop is called, which causes a
segfault of the walsender. This is rare because it only happens when the
downstream closes the connection during logical decoding initialization.

Since it's not exactly straightforward to work out when these need to be
initialized based on the command, I decided to move the initialization code
to exec_replication_command(), since that's always called before anything
else, which makes it much less error-prone (patch 0003).
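
The idea of initializing shared static buffers at the single command entry
point can be sketched as follows. This is an illustrative model under
stated assumptions, not the walsender code or patch 0003: Buf is a
hypothetical stand-in for StringInfo, and out_buf, reply_buf, tmp_buf,
exec_command(), create_slot() and start_streaming() are made-up names. The
sketch also initializes once behind a flag, which is a simplification
chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for StringInfo: a growable NUL-terminated buffer. */
typedef struct Buf
{
    char  *data;
    size_t len;
    size_t cap;
} Buf;

/* Static buffers are zero-initialized, so data is NULL until set up. */
static Buf  out_buf;
static Buf  reply_buf;
static Buf  tmp_buf;
static bool buffers_ready = false;

void buf_init(Buf *b)
{
    b->cap = 64;
    b->len = 0;
    b->data = malloc(b->cap);
    assert(b->data != NULL);
    b->data[0] = '\0';
}

void buf_append(Buf *b, const char *s)
{
    size_t n = strlen(s);

    /* Appending to a buffer that was never initialized is the crash case. */
    assert(b->data != NULL);
    while (b->len + n + 1 > b->cap)
    {
        char *grown = realloc(b->data, b->cap * 2);

        assert(grown != NULL);
        b->data = grown;
        b->cap *= 2;
    }
    memcpy(b->data + b->len, s, n + 1);
    b->len += n;
}

/* Two commands that both write to the shared buffer. */
void create_slot(void)     { buf_append(&out_buf, "slot created;"); }
void start_streaming(void) { buf_append(&out_buf, "streaming;"); }

/* Single entry point: set up the buffers before dispatching anything. */
void exec_command(const char *cmd)
{
    if (!buffers_ready)
    {
        buf_init(&out_buf);
        buf_init(&reply_buf);
        buf_init(&tmp_buf);
        buffers_ready = true;
    }

    if (strcmp(cmd, "CREATE_SLOT") == 0)
        create_slot();
    else if (strcmp(cmd, "START") == 0)
        start_streaming();
}
```

Because every command passes through exec_command(), no command handler
needs to know whether it is the first one to touch the buffers.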

Patch 0003 should be backpatched all the way to 9.4, where multiple
commands started using those buffers.

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

