Re: Physical replication slot advance is not persistent

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: Physical replication slot advance is not persistent
Дата
Msg-id 20191226.173349.2159296357637287391.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на Re: Physical replication slot advance is not persistent  (Alexey Kondratov <a.kondratov@postgrespro.ru>)
Ответы Re: Physical replication slot advance is not persistent  (Alexey Kondratov <a.kondratov@postgrespro.ru>)
Список pgsql-hackers
At Wed, 25 Dec 2019 20:28:04 +0300, Alexey Kondratov <a.kondratov@postgrespro.ru> wrote in 
> > Yep, it helps with physical replication slot persistence after
> > advance, but the whole validation (moveto <= endlsn) does not make
> > sense for me. The value of moveto should be >= than minlsn ==
> > confirmed_flush / restart_lsn, while endlsn == retlsn is also always
> > initialized with confirmed_flush / restart_lsn. Thus, your condition
> > seems to be true in any case, even if it was no-op one, which we were
> > intended to catch.
...
> If I get it correctly, then we already keep previous slot position in
> the minlsn, so we just have to compare endlsn with minlsn and treat
> endlsn <= minlsn as a no-op without slot state flushing.

I think you're right about the condition. (endlsn cannot be less than
minlsn, though) But I came to think that we shouldn't use locations in
that decision.

> Attached is a patch that does this, so it fixes the bug without
> affecting any user-facing behavior. Detailed comment section and DEBUG
> output are also added. What do you think now?
> 
> I have also forgotten to mention that all versions down to 11.0 should
> be affected with this bug.

pg_replication_slot_advance is the only caller of
pg_logical/physical_replication_slot_advacne so there's no apparent
determinant on who-does-what about dirtying and other housekeeping
calculation like *ComputeRequired*() functions, but the current shape
seems a kind of inconsistent between logical and physical.

I think pg_logaical/physical_replication_slot_advance should dirty the
slot if they actually changed anything. And
pg_replication_slot_advance should do the housekeeping if the slots
are dirtied.  (Otherwise both the caller function should dirty the
slot in lieu of the two.)

The attached does that.

regards.
-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 264b187aac17f0ce10e8748ea86fc967d08a0eb0 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Date: Thu, 26 Dec 2019 17:17:25 +0900
Subject: [PATCH] Make sure to save updated physical slot.

Back to 11, changes of a physical repliation slot made by
pg_replication_slot_advance doesn't survive restart. The cause was
pg_physical_replication_slot_advance forgot to dirty the slot when
updated. Fix it.
---
 src/backend/replication/slot.c      | 19 +++++++++++++++++++
 src/backend/replication/slotfuncs.c | 10 ++++++++--
 src/include/replication/slot.h      |  1 +
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 21ae8531b3..40149d0d99 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -672,6 +672,25 @@ ReplicationSlotMarkDirty(void)
     SpinLockRelease(&slot->mutex);
 }
 
+/*
+ * Return if the currently acquired slot is dirty.
+ */
+bool
+ReplicationSlotIsDirty(void)
+{
+    bool dirty;
+
+    ReplicationSlot *slot = MyReplicationSlot;
+
+    Assert(MyReplicationSlot != NULL);
+
+    SpinLockAcquire(&slot->mutex);
+    dirty = MyReplicationSlot->dirty;
+    SpinLockRelease(&slot->mutex);
+
+    return dirty;
+}
+
 /*
  * Convert a slot that's marked as RS_EPHEMERAL to a RS_PERSISTENT slot,
  * guaranteeing it will be there after an eventual crash.
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 6683fc3f9b..7b08ed691b 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -370,6 +370,13 @@ pg_physical_replication_slot_advance(XLogRecPtr moveto)
         MyReplicationSlot->data.restart_lsn = moveto;
         SpinLockRelease(&MyReplicationSlot->mutex);
         retlsn = moveto;
+
+        /*
+         * We don't need to dirty the slot only for the above change, but dirty
+         * this slot for the same reason with
+         * pg_logical_replication_slot_advance.
+         */
+        ReplicationSlotMarkDirty();
     }
 
     return retlsn;
@@ -574,9 +581,8 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
     nulls[0] = false;
 
     /* Update the on disk state when lsn was updated. */
-    if (XLogRecPtrIsInvalid(endlsn))
+    if (ReplicationSlotIsDirty())
     {
-        ReplicationSlotMarkDirty();
         ReplicationSlotsComputeRequiredXmin(false);
         ReplicationSlotsComputeRequiredLSN();
         ReplicationSlotSave();
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 3a5763fb07..f76b84571f 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -189,6 +189,7 @@ extern void ReplicationSlotRelease(void);
 extern void ReplicationSlotCleanup(void);
 extern void ReplicationSlotSave(void);
 extern void ReplicationSlotMarkDirty(void);
+extern bool ReplicationSlotIsDirty(void);
 
 /* misc stuff */
 extern bool ReplicationSlotValidateName(const char *name, int elevel);
-- 
2.23.0


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Guillaume Lelarge
Дата:
Сообщение: Re: Expose lock group leader pid in pg_stat_activity
Следующее
От: Julien Rouhaud
Дата:
Сообщение: Re: Expose lock group leader pid in pg_stat_activity