Re: [HACKERS] Slow synchronous logical replication

From	Konstantin Knizhnik
Subject	Re: [HACKERS] Slow synchronous logical replication
Date	12 October 2017, 11:24:57
Msg-id	5f5143cc-9f73-3909-3ef7-d3895cc6cc90@postgrespro.ru
In reply to	Re: [HACKERS] Slow synchronous logical replication (Craig Ringer <craig@2ndquadrant.com>)
List	pgsql-hackers

On 12.10.2017 04:23, Craig Ringer wrote:

On 12 October 2017 at 00:57, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:

The reason for such behavior is obvious: the WAL sender has to decode the huge
transaction generated by the insert although it has no relation to this
publication.

It does. Though I wouldn't expect anywhere near the kind of drop you
report, and haven't observed it here.

Is the CREATE TABLE and INSERT done in the same transaction?

No. The table was created in a separate transaction.
Moreover, the same effect takes place if the table is created before replication starts.
The problem in this case seems to be caused by ReorderBufferSerializeTXN spilling the decoded transaction to a file.
Please look at two profiles:
http://garret.ru/lr1.svg corresponds to normal operation of pgbench with synchronous replication to one replica,
http://garret.ru/lr2.svg to the same workload with concurrent execution of a huge insert statement.
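
To make the spilling behavior concrete, here is a toy model in Python. It is only a sketch and not the reorderbuffer.c code: the class, its methods and the table names are hypothetical, and the threshold of 4096 changes is an assumed value used for illustration. It shows that every change of a large transaction is buffered, written to a file and read back at commit, even when none of the changes belongs to a published table.

```python
import os
import pickle
import tempfile

MAX_CHANGES_IN_MEMORY = 4096  # assumed threshold, for illustration only


class ToyReorderBuffer:
    """Toy model: buffers changes per transaction and spills them to a file."""

    def __init__(self, spill_dir):
        self.spill_dir = spill_dir
        self.in_memory = {}    # xid -> list of (table, row) still in memory
        self.spill_files = {}  # xid -> path of that transaction's spill file
        self.spill_count = 0   # how many times changes were written to disk

    def add_change(self, xid, table, row):
        changes = self.in_memory.setdefault(xid, [])
        changes.append((table, row))
        if len(changes) >= MAX_CHANGES_IN_MEMORY:
            self._serialize_txn(xid)

    def _serialize_txn(self, xid):
        # Append the buffered changes to the spill file and free the memory.
        path = self.spill_files.setdefault(
            xid, os.path.join(self.spill_dir, f"xid-{xid}.spill"))
        with open(path, "ab") as f:
            for change in self.in_memory[xid]:
                pickle.dump(change, f)
        self.in_memory[xid] = []
        self.spill_count += 1

    def commit(self, xid, published_tables):
        # All changes are restored first; filtering by table happens only here.
        changes = []
        path = self.spill_files.pop(xid, None)
        if path is not None:
            with open(path, "rb") as f:
                while True:
                    try:
                        changes.append(pickle.load(f))
                    except EOFError:
                        break
            os.remove(path)
        changes.extend(self.in_memory.pop(xid, []))
        return [c for c in changes if c[0] in published_tables]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as spill_dir:
        rb = ToyReorderBuffer(spill_dir)
        for i in range(10_000):
            rb.add_change(1, "big_unpublished_table", i)
        sent = rb.commit(1, published_tables={"pgbench_accounts"})
        print("spills:", rb.spill_count, "changes sent:", len(sent))
```

In this model the 10,000-row insert causes two spills and sends nothing to the subscriber, so all the disk traffic is wasted from the point of view of this publication.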

And here is the output of pgbench (the insert is started at the fifth second):

progress: 1.0 s, 10020.9 tps, lat 0.791 ms stddev 0.232
progress: 2.0 s, 10184.1 tps, lat 0.786 ms stddev 0.192
progress: 3.0 s, 10058.8 tps, lat 0.795 ms stddev 0.301
progress: 4.0 s, 10230.3 tps, lat 0.782 ms stddev 0.194
progress: 5.0 s, 10335.0 tps, lat 0.774 ms stddev 0.192
progress: 6.0 s, 4535.7 tps, lat 1.591 ms stddev 9.370
progress: 7.0 s, 419.6 tps, lat 20.897 ms stddev 55.338
progress: 8.0 s, 105.1 tps, lat 56.140 ms stddev 76.309
progress: 9.0 s, 9.0 tps, lat 504.104 ms stddev 52.964
progress: 10.0 s, 14.0 tps, lat 797.535 ms stddev 156.082
progress: 11.0 s, 14.0 tps, lat 601.865 ms stddev 93.598
progress: 12.0 s, 11.0 tps, lat 658.276 ms stddev 138.503
progress: 13.0 s, 9.0 tps, lat 784.120 ms stddev 127.206
progress: 14.0 s, 7.0 tps, lat 870.944 ms stddev 156.377
progress: 15.0 s, 8.0 tps, lat 1111.578 ms stddev 140.987
progress: 16.0 s, 7.0 tps, lat 1258.750 ms stddev 75.677
progress: 17.0 s, 6.0 tps, lat 991.023 ms stddev 229.058
progress: 18.0 s, 5.0 tps, lat 1063.986 ms stddev 269.361
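
The size of the drop can be quantified with a short Python sketch that parses the progress lines above. The function names are mine, and the choice of windows (seconds 1-5 as baseline, seconds 9-18 as the degraded period) is an assumption made only for this illustration.

```python
import re

PROGRESS_RE = re.compile(
    r"progress: (?P<t>[\d.]+) s, (?P<tps>[\d.]+) tps, "
    r"lat (?P<lat>[\d.]+) ms stddev (?P<stddev>[\d.]+)"
)

# The pgbench output quoted above, copied verbatim.
PGBENCH_OUTPUT = """\
progress: 1.0 s, 10020.9 tps, lat 0.791 ms stddev 0.232
progress: 2.0 s, 10184.1 tps, lat 0.786 ms stddev 0.192
progress: 3.0 s, 10058.8 tps, lat 0.795 ms stddev 0.301
progress: 4.0 s, 10230.3 tps, lat 0.782 ms stddev 0.194
progress: 5.0 s, 10335.0 tps, lat 0.774 ms stddev 0.192
progress: 6.0 s, 4535.7 tps, lat 1.591 ms stddev 9.370
progress: 7.0 s, 419.6 tps, lat 20.897 ms stddev 55.338
progress: 8.0 s, 105.1 tps, lat 56.140 ms stddev 76.309
progress: 9.0 s, 9.0 tps, lat 504.104 ms stddev 52.964
progress: 10.0 s, 14.0 tps, lat 797.535 ms stddev 156.082
progress: 11.0 s, 14.0 tps, lat 601.865 ms stddev 93.598
progress: 12.0 s, 11.0 tps, lat 658.276 ms stddev 138.503
progress: 13.0 s, 9.0 tps, lat 784.120 ms stddev 127.206
progress: 14.0 s, 7.0 tps, lat 870.944 ms stddev 156.377
progress: 15.0 s, 8.0 tps, lat 1111.578 ms stddev 140.987
progress: 16.0 s, 7.0 tps, lat 1258.750 ms stddev 75.677
progress: 17.0 s, 6.0 tps, lat 991.023 ms stddev 229.058
progress: 18.0 s, 5.0 tps, lat 1063.986 ms stddev 269.361
"""


def parse_progress(text):
    """Return a list of (seconds, tps, latency_ms) tuples."""
    return [
        (float(m["t"]), float(m["tps"]), float(m["lat"]))
        for m in PROGRESS_RE.finditer(text)
    ]


def mean_tps(rows, start, end):
    """Mean tps over the samples with start <= seconds <= end."""
    values = [tps for t, tps, _ in rows if start <= t <= end]
    return sum(values) / len(values)


if __name__ == "__main__":
    rows = parse_progress(PGBENCH_OUTPUT)
    baseline = mean_tps(rows, 1, 5)
    degraded = mean_tps(rows, 9, 18)
    print(f"baseline {baseline:.1f} tps, degraded {degraded:.1f} tps, "
          f"ratio {baseline / degraded:.0f}x")
```

With these windows the mean goes from about 10,166 tps to 9 tps, a drop of roughly three orders of magnitude.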

This seems to be an effect of large transactions.
The presence of several synchronous logical replication channels also reduces performance, but not by nearly as much.
Below are results on another machine, with pgbench at scale 10.

Configuration                TPS
standalone                   15k
1 async logical replica      13k
1 sync logical replica       10k
3 async logical replicas     13k
3 sync logical replicas       8k
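
For comparison with the large-transaction case, the relative cost of each configuration can be computed from these figures. This is plain arithmetic on the rounded numbers in the table, so the percentages are approximate; the dictionary keys are just labels I chose.

```python
# TPS figures from the table above (rounded to thousands in the original).
TPS = {
    "standalone": 15_000,
    "1 async logical replica": 13_000,
    "1 sync logical replica": 10_000,
    "3 async logical replicas": 13_000,
    "3 sync logical replicas": 8_000,
}


def slowdown_percent(config, baseline="standalone"):
    """Throughput lost relative to the baseline configuration, in percent."""
    return 100.0 * (1.0 - TPS[config] / TPS[baseline])


if __name__ == "__main__":
    for name in TPS:
        print(f"{name:26s} {slowdown_percent(name):5.1f}% slower")
```

Even three synchronous replicas cost a bit less than half of the standalone throughput, which is far from the drop seen with the huge insert.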

Only partly true. The output plugin can register a transaction origin
filter and use that to say it's entirely uninterested in a
transaction. But this only works based on filtering by origins. Not
tables.

Yes, I know about the origin filtering mechanism (and we are using it in multimaster).
But I am speaking about the standard pgoutput.c output plugin. Its pgoutput_origin_filter
always returns false.


I imagine we could call another hook in output plugins, "do you care
about this table", and use it to skip some more work for tuples that
particular decoding session isn't interested in. Skip adding them to
the reorder buffer, etc. No such hook currently exists, but it'd be an
interesting patch for Pg11 if you feel like working on it.
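
The idea of such a hook can be sketched in Python as a toy model. Nothing here is PostgreSQL code: the origin filter below only mimics the existing callback (returning true means "not interested"), while filter_by_table is the hypothetical per-table hook, which does not exist. All class, method and table names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    origin_id: int
    table: str
    payload: str


class ToyOutputPlugin:
    """Toy output plugin with an origin filter and a hypothetical table filter."""

    def __init__(self, published_tables, skipped_origins=()):
        self.published_tables = set(published_tables)
        self.skipped_origins = set(skipped_origins)

    def filter_by_origin(self, origin_id):
        # True means the plugin is not interested in changes from this origin.
        return origin_id in self.skipped_origins

    def filter_by_table(self, table):
        # HYPOTHETICAL hook: True means "I do not care about this table".
        return table not in self.published_tables


def decode(changes, plugin):
    """Return only the changes that would be queued in the reorder buffer."""
    queued = []
    for change in changes:
        if plugin.filter_by_origin(change.origin_id):
            continue  # skipped by origin, as is possible today
        if plugin.filter_by_table(change.table):
            continue  # skipped by table, only with the hypothetical hook
        queued.append(change)
    return queued


if __name__ == "__main__":
    plugin = ToyOutputPlugin(published_tables={"pgbench_accounts"})
    wal = [
        Change(0, "pgbench_accounts", "update 1"),
        Change(0, "big_unpublished_table", "insert 1"),
        Change(0, "big_unpublished_table", "insert 2"),
    ]
    print([c.payload for c in decode(wal, plugin)])
```

With a table filter applied at this stage, the rows of the huge insert would never reach the reorder buffer and so would never be spilled to disk.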

Unfortunately it is not quite clear how to make the WAL sender smarter and let
it skip transactions that do not affect its publication.

As noted, it already can do so by origin. Mostly. We cannot totally
skip over WAL, since we need to process various invalidations etc. See
ReorderBufferSkip.

The problem is that before the end of a transaction we do not know whether it touches this publication or not.
So filtering by origin will not work in this case.

I am really not sure that it is possible to skip over WAL. But the particular problem with invalidation records etc. can be solved by always processing these records in the WAL sender.
I.e. if a backend inserts an invalidation record, or some other record which must always be processed by the WAL sender, it can always pass the LSN of this record to the WAL sender.
So the WAL sender will skip only those WAL records which are safe to skip (insert/update/delete records not affecting this publication).
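
The rule I have in mind can be written down as a small Python sketch. It describes the proposal only, not what the WAL sender does now; the record kinds, field names and table names are simplified assumptions.

```python
from typing import NamedTuple, Optional


class WalRecord(NamedTuple):
    lsn: int
    kind: str                    # e.g. "insert", "invalidation", "commit"
    table: Optional[str] = None  # set only for insert/update/delete


# Only these record kinds are candidates for skipping in this sketch.
DML_KINDS = {"insert", "update", "delete"}


def must_process(record, published_tables):
    """Proposed rule: skip only DML records on tables outside the publication."""
    if record.kind not in DML_KINDS:
        return True  # invalidations, commits etc. are always processed
    return record.table in published_tables


def records_to_process(wal, published_tables):
    """Return the records the WAL sender would still have to decode."""
    return [r for r in wal if must_process(r, published_tables)]


if __name__ == "__main__":
    wal = [
        WalRecord(100, "insert", "big_unpublished_table"),
        WalRecord(110, "invalidation"),
        WalRecord(120, "update", "pgbench_accounts"),
        WalRecord(130, "insert", "big_unpublished_table"),
        WalRecord(140, "commit"),
    ]
    kept = records_to_process(wal, {"pgbench_accounts"})
    print([r.lsn for r in kept])
```

Here the two inserts into the unpublished table are skipped, while the invalidation, the update of the published table and the commit are still processed.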

I wonder if there can be other problems with the WAL sender skipping part of a transaction.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
