Re: Design for In-Core Logical Replication

From: Craig Ringer
Subject: Re: Design for In-Core Logical Replication
Date:
Msg-id: CAMsr+YHr1M7p4K9JtnLf8xwpH+d5hnJ2juU=V=EoLd+Ln0sYJw@mail.gmail.com
In reply to: Re: Design for In-Core Logical Replication  (Simon Riggs <simon@2ndquadrant.com>)
Responses: Re: Design for In-Core Logical Replication  ("Joshua D. Drake" <jd@commandprompt.com>)
Re: Design for In-Core Logical Replication  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List: pgsql-hackers
On 21 July 2016 at 01:20, Simon Riggs <simon@2ndquadrant.com> wrote:
On 20 July 2016 at 17:52, Rod Taylor <rod.taylor@gmail.com> wrote:
 
I think it's important for communication channels to be defined separately from the subscriptions.

I agree and believe it will be that way.

Craig is working on allowing Replication Slots to failover between nodes, to provide exactly that requested behaviour.
 

First, I'd like to emphasise that logical replication has been stalled for ages now: we can no longer make forward progress on the core features needed until we have in-core logical replication (they're dismissed as irrelevant, having no in-core users, etc.), yet we have also had difficulty getting logical replication into core. To break this impasse we really need logical replication in core, and we need to focus on getting the minimum viable feature in place, not on trying to make it do everything all at once. Point-to-point replication with no forwarding should be just fine for the first release. Let's not bog this down with extra "must have" features that aren't actually crucial.

That said:

I had a patch in it for 9.6 to provide the foundations for logical replication to follow physical failover, but it got pulled at the last minute. It'll be submitted for 10.0 along with some other enhancements to make it usable without hacky extensions, most notably support for using a physical replication slot and hot standby feedback to pin a master's catalog_xmin where it's needed by slots on a physical replica.
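For context, the physical side of this builds on mechanisms that already exist: a standby can stream from a named physical replication slot and send hot standby feedback, which is what the catalog_xmin pinning would extend. A minimal sketch of that existing setup (the slot name is illustrative):

```sql
-- On the master (9.4+): create a named physical replication slot
SELECT pg_create_physical_replication_slot('standby1_slot');

-- On the standby, in recovery.conf / postgresql.conf:
--   primary_slot_name    = 'standby1_slot'
--   hot_standby_feedback = on
```

The proposed enhancement is that feedback sent over such a slot would also hold down the master's catalog_xmin on behalf of logical slots that live on the replica.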

That's for when we're combining physical and logical replication though, e.g. "node A" is a master/standby pair, and "node B" is also a master/standby pair.

For non-star logical topologies, which is what I think you might've been referring to, it's necessary to have:

- Node identity
- Which nodes we want to receive data from
- How we connect to each node

all of which are separate things. Who's out there, what we want from them, and how to get it.
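As a concrete illustration of keeping those three things separate, here's roughly what the pglogical-style SQL interface looks like (node names and DSNs are illustrative; the function signatures are pglogical's, not a committed in-core API):

```sql
-- Node identity: who this node is, and how peers reach it
SELECT pglogical.create_node(
    node_name := 'node_a',
    dsn       := 'host=node-a dbname=app'
);

-- Subscription: what we want to receive, and (today) also
-- which node we connect to in order to get it
SELECT pglogical.create_subscription(
    subscription_name := 'from_b',
    provider_dsn      := 'host=node-b dbname=app',
    replication_sets  := ARRAY['default']
);
```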

pglogical doesn't really separate the latter two much at this point. Subscriptions identify both the node to connect to and the data we want to receive from a node; there's no selective data forwarding from one node to another. Though there's room for that in pglogical's hooks/filters by using filtering by replication origin, it just doesn't do it yet.

It sounds like that's what you're getting at: wanting to be able to say "node A wants to get data from node B and node C" separately from "node A connects to node B to receive data", with the replication system somehow working out that data written from C to B should be forwarded on to A.

Right?

If so, it's not always easy to figure that out. If you create connections to both B and C, we then have to automagically work out that we should stop forwarding data from C over our connection to B.

The plan with pglogical has been to allow connections to specify forwarding options, so the connection explicitly says what nodes it wants to get data from. It's users' job to ensure that they don't declare connections that overlap. This is simpler to implement, but harder to admin.

One challenge with either approach is ensuring a consistent switchover. If you have a single connection A=>B receiving data from [B,C], and then switch to two connections A=>B and A=>C with neither forwarding, you must ensure that the switchover occurs in such a way that no data is replicated twice or skipped. That's made easier by the fact that we have replication origins: we can actually safely receive from both at the same time and then discard from one of them, or even use upstream filtering to avoid sending it over the wire twice. But it does take care and caution.
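In pglogical terms the forwarding choice lives on each subscription as forward_origins, so the switchover above could be sketched roughly like this (hedged: the safe ordering of these steps is exactly the "care and caution" part, and the origin-filtering semantics shown are pglogical's, not a settled in-core design):

```sql
-- Initially: one subscription to B that also forwards changes
-- that originated on C (forward everything B has)
SELECT pglogical.create_subscription(
    subscription_name := 'from_b_all',
    provider_dsn      := 'host=node-b dbname=app',
    forward_origins   := ARRAY['all']
);

-- Later: subscribe to C directly, with the C subscription sending
-- only changes that originated on C itself (no forwarded origins),
-- and the B subscription adjusted likewise
SELECT pglogical.create_subscription(
    subscription_name := 'from_c',
    provider_dsn      := 'host=node-c dbname=app',
    forward_origins   := '{}'
);
```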

Note that none of this is actually for logical _failover_, where we lose a node. For that we need some extra help in the form of placeholder slots maintained on other peers. This can be done at the application / replication system level without the need for new core features, but it might not be something we can support in the first iteration.

I'm not sure how Petr's current design for in-core replication addresses this, if it does, or whether it's presently focused only on point-to-point replication like pglogical. As far as I'm concerned so long as it does direct point-to-point replication with no forwarding that's good enough for a first cut feature, so long as the UI, catalog and schema design leaves room for adding more later.
 
  
I also suspect multiple publications will be normal even if only 2 nodes. Old slow moving data almost always got different treatment than fast-moving data; even if only defining which set needs to hit the other node first and which set can trickle through later.

Agreed 


Yes, especially since we can currently only stream transactions one by one in commit order after commit.

Even once we have interleaved xact streaming, though, there will still be plenty of times we want to receive different sets of data from the same node at different priorities/rates. Small data we want to receive quickly, vs big data we receive when we get the chance to catch up. Of course it's necessary to define non-overlapping replication sets for this.

That's something we can already do in pglogical. I'm not sure if Petr is targeting replication set support as part of the first release of the in-core version of logical replication; they're necessary to do things like this.
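For reference, non-overlapping replication sets in today's pglogical look something like this (set and table names are illustrative):

```sql
-- Small, latency-sensitive data gets its own set
SELECT pglogical.create_replication_set('fast_set');
SELECT pglogical.replication_set_add_table('fast_set', 'public.orders');

-- Bulk data that can trickle through when the subscriber catches up
SELECT pglogical.create_replication_set('bulk_set');
SELECT pglogical.replication_set_add_table('bulk_set', 'public.audit_log');
```

A subscriber then lists which sets it wants in its subscription's replication_sets, which is what makes the per-set prioritisation possible.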

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
