Re: [HACKERS] Logical Replication WIP

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: [HACKERS] Logical Replication WIP
Дата
Msg-id 20161213003325.npmnfx2znwnoffkf@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: [HACKERS] Logical Replication WIP  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Список pgsql-hackers
HJi,

On 2016-12-12 09:18:48 -0500, Peter Eisentraut wrote:
> On 12/8/16 4:10 PM, Petr Jelinek wrote:
> > On 08/12/16 20:16, Peter Eisentraut wrote:
> >> On 12/6/16 11:58 AM, Peter Eisentraut wrote:
> >>> On 12/5/16 6:24 PM, Petr Jelinek wrote:
> >>>> I think that the removal of changes to ReplicationSlotAcquire() that you
> >>>> did will result in making it impossible to reacquire temporary slot once
> >>>> you switched to different one in the session as the if (active_pid != 0)
> >>>> will always be true for temp slot.
> >>>
> >>> I see.  I suppose it's difficult to get a test case for this.
> >>
> >> I created a test case, saw the error of my ways, and added your code
> >> back in.  Patch attached.
> >>
> >
> > Hi,
> >
> > I am happy with this version, thanks for moving it forward.
>
> committed

Hm.
/*
+ * Cleanup all temporary slots created in current session.
+ */
+void
+ReplicationSlotCleanup()

I'd rather see a (void) there. The prototype has it, but still.


+
+    /*
+     * No need for locking as we are only interested in slots active in
+     * current process and those are not touched by other processes.

I'm a bit suspicious of this claim.  Without a memory barrier you could
actually look at outdated versions of active_pid. In practice there's
enough full memory barriers in the slot creation code that it's
guaranteed to not be the same pid from before a wraparound though.

I think that doing iterations of slots without
ReplicationSlotControlLock makes things more fragile, because suddenly
assumptions that previously held aren't true anymore.   E.g. factually/* * The slot is definitely gone.  Lock out
concurrentscans of the array * long enough to kill it.  It's OK to clear the active flag here without * grabbing the
mutexbecause nobody else can be scanning the array here, * and nobody can be attached to this slot and thus access it
without* scanning the array. */
 
is now simply not true anymore.  It's probably not harmfully broken, but
at least you've changed the locking protocol without adapting comments.

/*
- * Permanently drop the currently acquired replication slot which will be
- * released by the point this function returns.
+ * Permanently drop the currently acquired replication slot. */static voidReplicationSlotDropAcquired(void)

Isn't that actually removing interesting information? Yes, the comment's
been moved to ReplicationSlotDropPtr(), but that routine is an internal
one...


@@ -810,6 +810,9 @@ ProcKill(int code, Datum arg)    if (MyReplicationSlot != NULL)        ReplicationSlotRelease();

+    /* Also cleanup all the temporary slots. */
+    ReplicationSlotCleanup();
+

So we now have exactly this code in several places. Why does a
generically named Cleanup routine not also deal with a currently
acquired slot? Right now it'd be more appropriately named
ReplicationSlotDropTemporary() or such.


@@ -1427,13 +1427,14 @@ pg_replication_slots| SELECT l.slot_name,    l.slot_type,    l.datoid,    d.datname AS
database,
+    l.temporary,    l.active,    l.active_pid,    l.xmin,    l.catalog_xmin,    l.restart_lsn,
l.confirmed_flush_lsn
-   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, active, active_pid, xmin, catalog_xmin,
restart_lsn,confirmed_flush_lsn)
 
+   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin,
catalog_xmin,restart_lsn, confirmed_flush_lsn)     LEFT JOIN pg_database d ON ((l.datoid = d.oid)));pg_roles| SELECT
pg_authid.rolname,   pg_authid.rolsuper,
 

If we start to expose this, shouldn't we expose the persistency instead
(i.e. persistent/ephemeral/temporary)?


new file   contrib/test_decoding/sql/slot.sql
@@ -0,0 +1,20 @@
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot_p', 'test_decoding');
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot_t', 'test_decoding', true);
+
+SELECT pg_drop_replication_slot('regression_slot_p');
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot_p', 'test_decoding', false);
+
+-- reconnect to clean temp slots
+\c

Can we add multiple slots to clean up here? Can we also add a test for
the cleanup on error for temporary slots? E.g. something like in
ddl.sql (maybe we should actually move some of the relevant tests from
there to here).

It'd also be good to test this with physical slots?


+-- test switching between slots in a session
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot1', 'test_decoding', true);
+SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot2', 'test_decoding', true);
+SELECT * FROM pg_logical_slot_get_changes('regression_slot1', NULL, NULL);
+SELECT * FROM pg_logical_slot_get_changes('regression_slot2', NULL, NULL);

Can we actually output something? Right now this doesn't test that
much...


- Andres



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [HACKERS] exposing wait events for non-backends (was: Trackingwait event for latches)
Следующее
От: Nikita Glukhov
Дата:
Сообщение: [HACKERS] PATCH: recursive json_populate_record()