Re: Applying logical replication changes by more than one process

From Craig Ringer
Subject Re: Applying logical replication changes by more than one process
Date
Msg-id CAMsr+YFEnKi0mSrskFt7QHgwQzDGDKMi6frO693PEPPmWdozdA@mail.gmail.com
In reply to Re: Applying logical replication changes by more than one process  (konstantin knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: Applying logical replication changes by more than one process
List pgsql-hackers
On 22 March 2016 at 14:32, konstantin knizhnik <k.knizhnik@postgrespro.ru> wrote:
 
Ah, you mean because with wal_log=true the origin advance is in a different WAL record than the commit? OK, yeah, you might be one transaction behind then, true.

It actually means that we cannot enforce database consistency. If we call replorigin_advance before commit and a crash then happens, we will lose some changes.
If we call replorigin_advance after commit but a crash happens before it, then some changes can be applied multiple times. For example, we can insert some record twice (if there are no unique constraints).
It looks like the only working scenario is to set up a replication session for each commit and use locking to prevent concurrent session setup for the same slot by multiple processes, doesn't it?

Yes.

How would you expect it to work if you attempted to replorigin_advance without a session? From multiple concurrent backends?

Parallel apply is complicated business. You have to make sure you apply xacts in an order that's free from deadlocks and from insert/delete anomalies - though you can at least detect those, ERROR that xact and all subsequent ones, and retry. For progress tracking to be consistent and correct you'd have to make sure you committed strictly in the same order as upstream. Just before each commit you can set the origin LSN and advance the replication origin, which will commit atomically along with the commit it confirms. I don't really see the problem.
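
To make the three orderings discussed above concrete, here is a toy Python model (not PostgreSQL code). The names Replica, apply_stream and run_with_crash are made up for illustration, LSNs are plain integers, and "durable state" is just two Python attributes. It only shows why advancing the origin in the same atomic step as the commit avoids both the lost-change and the duplicate-change cases after a crash and replay.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy downstream node; both fields stand for state that survives a crash."""
    rows: list = field(default_factory=list)  # changes that were committed
    origin_lsn: int = 0                       # last confirmed upstream LSN


class Crash(Exception):
    """Simulated crash between two steps."""


def apply_stream(replica, stream, mode, crash_at=None):
    """Apply (lsn, row) pairs that are newer than the recorded origin.

    mode is 'advance_first', 'commit_first' or 'atomic' (hypothetical names).
    crash_at is the LSN at which a crash is injected.
    """
    for lsn, row in stream:
        if lsn <= replica.origin_lsn:
            continue  # already confirmed, so skipped on replay
        if mode == "advance_first":
            replica.origin_lsn = lsn          # progress recorded first
            if lsn == crash_at:
                raise Crash
            replica.rows.append(row)
        elif mode == "commit_first":
            replica.rows.append(row)          # change committed first
            if lsn == crash_at:
                raise Crash
            replica.origin_lsn = lsn
        elif mode == "atomic":
            if lsn == crash_at:
                raise Crash                   # nothing became durable
            replica.rows.append(row)          # both updates happen together
            replica.origin_lsn = lsn
        else:
            raise ValueError(f"unknown mode: {mode}")


def run_with_crash(mode):
    """Crash while applying LSN 20, then restart and replay the same stream."""
    stream = [(10, "a"), (20, "b"), (30, "c")]
    replica = Replica()
    try:
        apply_stream(replica, stream, mode, crash_at=20)
    except Crash:
        pass
    apply_stream(replica, stream, mode)
    return replica.rows


if __name__ == "__main__":
    for mode in ("advance_first", "commit_first", "atomic"):
        print(mode, run_with_crash(mode))
```

In this model only the "atomic" mode ends with each row applied exactly once; the other two reproduce the lost and duplicated changes described in the quoted text.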
 
I have tried it; fortunately it doesn't cause any noticeable performance degradation. But unfortunately I can't consider such an approach elegant.
Why is it actually necessary to bind a replication slot to a process? Why is it not possible to have multiple concurrent sessions for the same slot?

Especially since most slot changes are LWLock- and/or spinlock-protected already.

The client would have to manage replay confirmations appropriately so that it doesn't confirm past the point where some other connection still needs it.
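
As a rough illustration of what "managing replay confirmations" could mean on the client side, here is a small Python sketch. It is an assumption about one possible bookkeeping scheme, not an existing API: ConfirmTracker and its methods are hypothetical, LSNs are plain integers, and a single dispatcher is assumed to hand transactions to the connections in LSN order.

```python
class ConfirmTracker:
    """Hypothetical client-side bookkeeping shared by all connections on a slot."""

    def __init__(self, start_lsn=0):
        self.start_lsn = start_lsn  # position already confirmed at startup
        self.in_flight = set()      # commit LSNs handed out, not yet applied
        self.applied = set()        # commit LSNs durably applied downstream

    def dispatched(self, lsn):
        """Record that some connection has started working on this LSN."""
        self.in_flight.add(lsn)

    def finished(self, lsn):
        """Record that the transaction at this LSN is durably applied."""
        self.in_flight.remove(lsn)
        self.applied.add(lsn)

    def safe_confirm_lsn(self):
        """Highest applied LSN that lies below every in-flight LSN."""
        limit = min(self.in_flight, default=None)
        candidates = [lsn for lsn in self.applied
                      if limit is None or lsn < limit]
        return max(candidates, default=self.start_lsn)


if __name__ == "__main__":
    tracker = ConfirmTracker()
    for lsn in (100, 200, 300):
        tracker.dispatched(lsn)
    tracker.finished(200)
    tracker.finished(300)
    print(tracker.safe_confirm_lsn())  # 100 is still in flight
    tracker.finished(100)
    print(tracker.safe_confirm_lsn())
```

The point of the sketch is only that the confirmed position is bounded by the oldest transaction any connection still has in flight, regardless of how far the other connections have progressed.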

We'd have to expose a "slot" column in pg_stat_replication and remove the "pid" column from pg_replication_slots to handle the 1:n relationship between slot clients and slots, and it'd be a pain to show which normal user backends were using a slot. Not really sure how to handle that.

To actually make this useful would require a lot more though. A way to request that replay start from a new LSN without a full disconnect/reconnect each time. Client-side parallel consume/apply. Inter-transaction ordering information so the client can work out a viable xact apply order (possibly using SSI information per the discussion with Kevin?). Etc.

I haven't really looked into this and I suspect there are some hairy areas involved in replaying a slot from more than one client. The reason I'm interested in it personally is for initial replica state setup as Oleksandr prototyped and described earlier. We could attach to the slot's initial snapshot then issue a new replication command that, given a table name or oid, scans the table from the snapshot and passes each tuple to a new callback (like, but not the same as, the insert callback) on the output plugin.

That way clients could parallel-copy the initial state of the DB across the same replication protocol they then consume new changes from, with no need to make normal libpq connections and COPY initial state.

I'm interested in being able to do parallel receive of new changes from the slot too, but suspect that'd be a bunch harder.

  
I am also concerned about the sequential search for the slot in replorigin_session_setup and many other functions: there is a loop through all max_replication_slots.
It seems not to be a problem when the number of slots is less than 10. For multimaster this assumption holds: even Oracle RAC rarely has a two-digit number of nodes.
But if we want to perform sharding and use logical replication to provide redundancy, then the number of nodes and slots can be substantially larger.

Sounds like premature optimisation. Deal with it if it comes up in profiles in scale testing with 100 clients. I'll be surprised if it does.


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
