Re: WIP: Failover Slots

Поиск
Список
Период
Сортировка
От Oleksii Kliukin
Тема Re: WIP: Failover Slots
Дата
Msg-id 9B503EB5-676A-4258-9F78-27FC583713FE@hintbits.com
обсуждение исходный текст
Ответ на Re: WIP: Failover Slots  (Craig Ringer <craig@2ndquadrant.com>)
Ответы Re: WIP: Failover Slots  (Craig Ringer <craig@2ndquadrant.com>)
Список pgsql-hackers

On 23 Feb 2016, at 11:30, Craig Ringer <craig@2ndquadrant.com> wrote:

 
Updated patch 

... attached

I've split it up a bit more too, so it's easier to tell what change is for what and fixed the issues mentioned by Oleksii. I've also removed some unrelated documentation changes.

Patch 0001, timeline switches for logical decoding, is unchanged since the last post. 

Thank you, I read the user-interface part now, looks good to me.

I found the following issue when shutting down a master with a connected replica that uses a physical failover slot:

2016-02-23 20:33:42.546 CET,,,54998,,56ccb3f3.d6d6,3,,2016-02-23 20:33:07 CET,,0,DEBUG,00000,"performing replication slot checkpoint",,,,,,,,,""
2016-02-23 20:33:42.594 CET,,,55002,,56ccb3f3.d6da,4,,2016-02-23 20:33:07 CET,,0,DEBUG,00000,"archived transaction log file ""000000010000000000000003""",,,,,,,,,""
2016-02-23 20:33:42.601 CET,,,54998,,56ccb3f3.d6d6,4,,2016-02-23 20:33:07 CET,,0,PANIC,XX000,"concurrent transaction log activity while database system is shutting down",,,,,,,,,""
2016-02-23 20:33:43.537 CET,,,54995,,56ccb3f3.d6d3,5,,2016-02-23 20:33:07 CET,,0,LOG,00000,"checkpointer process (PID 54998) was terminated by signal 6: Abort trap",,,,,,,,,""
2016-02-23 20:33:43.537 CET,,,54995,,56ccb3f3.d6d3,6,,2016-02-23 20:33:07 CET,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,

Basically, the issue is that CreateCheckPoint calls CheckpointReplicationSlots, which currently produces WAL, and this violates the assumption at line xlog.c:8492

if (shutdown && checkPoint.redo != ProcLastRecPtr)
ereport(PANIC,
(errmsg("concurrent transaction log activity while database system is shutting down")));


There are a couple of incorrect comments

logical.c: 90
There's some things missing to allow this: I think it should be “There are some things missing to allow this:”

logical.c:93
"we need we would need”

slot.c:889
"and there's no latch to set, so poll” - clearly there is a latch used in the code below.

Also,  slot.c:301 emits an error message for an attempt to create a failover slot on the replica after acquiring and releasing the locks and getting the shared memory slot, even though all the data to check for this condition is available right at the beginning of the function. Shouldn’t it avoid the extra work if it’s not needed? 



--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
<0001-Allow-logical-slots-to-follow-timeline-switches.patch><0002-Allow-replication-slots-to-follow-failover.patch><0003-Retain-extra-WAL-for-failover-slots-in-base-backups.patch><0004-Add-the-UI-and-for-failover-slots.patch><0005-Document-failover-slots.patch><0006-Add-failover-to-pg_replication_slots.patch><0007-not-for-inclusion-Test-script-for-failover-slots.patch>

Kind regards,
--
Oleksii

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: proposal: schema PL session variables
Следующее
От: Jim Nasby
Дата:
Сообщение: Re: Sanity checking for ./configure options?