Re: [HACKERS] WIP: Failover Slots

Поиск
Список
Период
Сортировка
От Craig Ringer
Тема Re: [HACKERS] WIP: Failover Slots
Дата
Msg-id CAMsr+YFqhhNUR+7ZYaWpAyHyEPDQ6CoBPtrD3H0gEkWpQDvCmA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] WIP: Failover Slots  (Thom Brown <thom@linux.com>)
Ответы Re: [HACKERS] WIP: Failover Slots  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On 26 July 2017 at 00:16, Thom Brown <thom@linux.com> wrote:
On 8 April 2016 at 07:13, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 6 April 2016 at 22:17, Andres Freund <andres@anarazel.de> wrote:
>
>>
>> Quickly skimming 0001 in [4] there appear to be a number of issues:
>> * LWLockHeldByMe() is only for debugging, not functional differences
>> * ReplicationSlotPersistentData is now in an xlog related header
>> * The code and behaviour around name conflicts of slots seems pretty
>>   raw, and not discussed
>> * Taking spinlocks dependant on InRecovery() seems like a seriously bad
>>   idea
>> * I doubt that the archive based switches around StartupReplicationSlots
>>   do what they intend. Afaics that'll not work correctly for basebackups
>>   taken with -X, without recovery.conf
>>
>
> Thanks for looking at it. Most of those are my errors. I think this is
> pretty dead at least for 9.6, so I'm mostly following up in the hopes of
> learning about a couple of those mistakes.
>
> Good catch with -X without a recovery.conf. Since it wouldn't be recognised
> as a promotion and wouldn't increment the timeline, copied non-failover
> slots wouldn't get removed. I've never liked that logic at all anyway, I
> just couldn't think of anything better...
>
> LWLockHeldByMe() has a comment to the effect of: "This is meant as debug
> support only." So that's just a dumb mistake on my part, and I should've
> added "alreadyLocked" parameters. (Ugly, but works).
>
> But why would it be a bad idea to conditionally take a code path that
> acquires a spinlock based on whether RecoveryInProgress()? It's not testing
> RecoveryInProgress() more than once and doing the acquire and release based
> on separate tests, which would be a problem. I don't really get the problem
> with:
>
> if (!RecoveryInProgress())
> {
> /* first check whether there's something to write out */
> SpinLockAcquire(&slot->mutex);
> was_dirty = slot->dirty;
> slot->just_dirtied = false;
> SpinLockRelease(&slot->mutex);
>
>         /* and don't do anything if there's nothing to write */
>         if (!was_dirty)
>             return;
> }
>
> ... though I think what I really should've done there is just always dirty
> the slot in the redo functions.

Are there any plans to submit a new design/version for v11?

No. The whole approach seems to have been bounced from core. I don't agree and continue to think this functionality is desirable but I don't get to make that call.

If time permits I will attempt to update the logical decoding on standby patchset instead, and if possible add support for fast-forward logical decoding that does the minimum required to correctly maintain a slot's catalog_xmin and restart_lsn when advanced. But this won't be usable directly for failover like failover slots are, it'll require each application to keep track of standbys and maintain slots on them too. Or we'll need some kind of extension/helper to sync slot state.

In the mean time, 2ndQuadrant maintains an on-disk-compatible version of failover slots that's available for support customers.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: [HACKERS] pg_dump does not handle indirectly-granted permissions properly
Следующее
От: Craig Ringer
Дата:
Сообщение: Re: [HACKERS] TAP: allow overriding PostgresNode in get_new_node