Re: [HACKERS] Dealing with logical replication

Поиск
Список
Период
Сортировка
От Tatsuo Ishii
Тема Re: [HACKERS] Dealing with logical replication
Дата
Msg-id 20170720.162423.1757531165050255607.t-ishii@sraoss.co.jp
обсуждение исходный текст
Ответ на Re: [HACKERS] Dealing with logical replication  (Craig Ringer <craig@2ndquadrant.com>)
Список pgsql-hackers
> On 19 July 2017 at 16:34, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
> 
>> Now that we are going to have logical replication in PostgreSQL 10, I
>> have started thinking how Pgpool-II can deal with it. For example, the
>> logical replication does not replicate DDLs. Isn't it convenient for
>> users to do it automatically in Pgpool-II? Or even doing it for
>> TRUNCATE?
>>
>> Or are they against the design philosophy of the logical replication?
>>
> 
> (Disclaimer - Petr Jelinek and Peter Eisentraut were the main ones working
> on in in-core logical rep, and I haven't had time to play with it much).
> 
> TL;DR: a pooler can only solve a limited subset of the problem, fairly
> unreliably, and this is really something that needs further work in core.
> 
> Not replicating TRUNCATE and schema changes is more an implementation
> limitation than anything else. It's desirable to get to the point where
> logical replication can transparently replicate DDL. There are some hurdles
> for doing so, but it'll be possible to get something in place that's good
> enough when time/release progress permits.
> 
> Similarly, with TRUNCATE, AFAIK support just didn't make the cut for pg10.
> 
> A pooler could well help in the mean time, but you have to consider
> ordering with care. For example, given "U" upstream, "D" downstream:
> 
> U: CREATE TABLE x (a integer, b integer);
> D: CREATE TABLE x (a); -- user script syncs
> U: INSERT INTO x (a,b) VALUES (1,0);
> D: [applies INSERT 1,0]
> U: INSERT INTO x (a,b) VALUES (2,0);
> U: ALTER TABLE x DROP COLUMN b;
> D: ALTER TABLE x DROP COLUMN b; -- user script syncs
> U: INSERT INTO x (a) VALUES (3);
> D: [ERROR on INSERT of 2,0: no column 'b']
> 
> Because the DDL here is transported out of band vs the row data, you can
> easily create situations where the schema change is applied before the last
> committed-but-not-yet-replicated data from the upstream that was based on
> the old schema.

What I am thinking for now is, do not allow to issue DDLs on the
downstream.

> To achieve correct ordering, the simplest approach is to record DDL in a
> table when you perform it on the upstream, and replay it when you see rows
> in that table appear on the downstream.  You know it's safe to replay it
> now. This is the essence of what BDR and pglogical do with their DDL
> replication, but they handle DDL in the middle of transactions that also
> make row data changes by intercepting writes to the queue table and
> performing the DDL at the exact point in the transaction where it happened
> on the upstream. I don't think that's possible with the in-core logical
> replication yet, and certainly not something a pooler can do.
> 
> To do it externally, you have to take note of when a schema change happened
> on the upstream and apply it on the downstream at or after the point where
> the downstream has replayed and confirmed up to the upstream lsn where the
> schema change happened. Then apply the schema change.
> 
> A pooler trying to help here must also be very aware of the impact of
> multi-statements. If you send a single simple query message with a mixture
> of schema change commands and normal DML, you probably don't want to repeat
> the DML on downstream nodes or you'll get duplicate rows etc. But ... by
> unless it embeds transaction control commands, a simple query message
> executes in a single implicit transaction, so if you extract just the DDL
> you'll again have ordering issues of upstream vs downstream.

Pgpool-II will avoid the problem by rejecting such multi-statements.

> There are even a variety of difficulties to overcome with doing it in core:
> event triggers don't capture ddl command text and have no facility to turn
> the internal command representation back into SQL command text, nor do we
> have any way to turn the internal representation back into a parsenode tree
> for execution on a downstream's standard_ProcessUtility. However, we can
> capture raw command text with ProcessUtility_hook now that we have
> byte-ranges for the query-parts of a multi-part query (yay!), and that
> works well enough if you also capture the active search_path and apply with
> the same search_path. It can match the wrong object if extra objects with
> the same name are present earlier on the search_path on the downstream than
> on the upstream, so it's not ideal, but that's a weird corner case.
> 
> If we had a hook in the logical apply worker's insert or wal-message
> routines it'd be possible to write an extension to do this for pg10, but
> AFAICS we don't.
> 
> So schema changes in logical replication currently require more care than
> in physical replication.

Thank you for the explanation regarding those difficulties.

I now understand that the reason why logical replication does not
support DDLs and TRUNCATE. The reason is, not by design decision but
by the difficulties of implementation.

Probably that means those limitations will be solved in the
future. Then the motivation for Pgpool-II to implement them would not
be high.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Neha Sharma
Дата:
Сообщение: Re: [HACKERS] [TRAP: FailedAssertion] causing server to crash
Следующее
От: Craig Ringer
Дата:
Сообщение: Re: [HACKERS] [TRAP: FailedAssertion] causing server to crash