Re: [HACKERS] Slow synchronous logical replication
From | Konstantin Knizhnik |
---|---|
Subject | Re: [HACKERS] Slow synchronous logical replication |
Date | |
Msg-id | 5f5143cc-9f73-3909-3ef7-d3895cc6cc90@postgrespro.ru |
In reply to | Re: [HACKERS] Slow synchronous logical replication (Craig Ringer <craig@2ndquadrant.com>) |
List | pgsql-hackers |
On 12.10.2017 04:23, Craig Ringer wrote:
> On 12 October 2017 at 00:57, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>> The reason for such behavior is obvious: the WAL sender has to decode the huge transaction generated by the insert, although it has no relation to this publication.
> It does. Though I wouldn't expect anywhere near the kind of drop you report, and haven't observed it here. Is the CREATE TABLE and INSERT done in the same transaction?
No. The table was created in a separate transaction.
Moreover, the same effect occurs if the table is created before replication starts.
The problem in this case seems to be caused by ReorderBufferSerializeTXN spilling the decoded transaction to a file.
Please look at two profiles:
http://garret.ru/lr1.svg corresponds to normal operation of pgbench with synchronous replication to one replica,
http://garret.ru/lr2.svg corresponds to the same workload with concurrent execution of a huge insert statement.
And here is the output of pgbench (the insert is started at the fifth second):
progress: 1.0 s, 10020.9 tps, lat 0.791 ms stddev 0.232
progress: 2.0 s, 10184.1 tps, lat 0.786 ms stddev 0.192
progress: 3.0 s, 10058.8 tps, lat 0.795 ms stddev 0.301
progress: 4.0 s, 10230.3 tps, lat 0.782 ms stddev 0.194
progress: 5.0 s, 10335.0 tps, lat 0.774 ms stddev 0.192
progress: 6.0 s, 4535.7 tps, lat 1.591 ms stddev 9.370
progress: 7.0 s, 419.6 tps, lat 20.897 ms stddev 55.338
progress: 8.0 s, 105.1 tps, lat 56.140 ms stddev 76.309
progress: 9.0 s, 9.0 tps, lat 504.104 ms stddev 52.964
progress: 10.0 s, 14.0 tps, lat 797.535 ms stddev 156.082
progress: 11.0 s, 14.0 tps, lat 601.865 ms stddev 93.598
progress: 12.0 s, 11.0 tps, lat 658.276 ms stddev 138.503
progress: 13.0 s, 9.0 tps, lat 784.120 ms stddev 127.206
progress: 14.0 s, 7.0 tps, lat 870.944 ms stddev 156.377
progress: 15.0 s, 8.0 tps, lat 1111.578 ms stddev 140.987
progress: 16.0 s, 7.0 tps, lat 1258.750 ms stddev 75.677
progress: 17.0 s, 6.0 tps, lat 991.023 ms stddev 229.058
progress: 18.0 s, 5.0 tps, lat 1063.986 ms stddev 269.361
This seems to be an effect of large transactions.
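To put a number on the drop, the progress lines can be parsed and averaged over two windows. This is only a rough sketch: the regular expression assumes the `progress:` line layout shown in the output above, the helper names are made up for illustration, and the sample holds just four of the reported lines.

```python
import re

# Matches pgbench progress lines such as
# "progress: 6.0 s, 4535.7 tps, lat 1.591 ms stddev 9.370"
PROGRESS_RE = re.compile(
    r"progress:\s+([\d.]+)\s+s,\s+([\d.]+)\s+tps,\s+lat\s+([\d.]+)\s+ms"
)


def parse_progress(text):
    """Return a list of (elapsed_seconds, tps, latency_ms) tuples."""
    rows = []
    for line in text.splitlines():
        m = PROGRESS_RE.search(line)
        if m:
            rows.append(tuple(float(g) for g in m.groups()))
    return rows


def mean_tps(rows, start, end):
    """Average tps over reports with start < elapsed <= end."""
    picked = [tps for elapsed, tps, _ in rows if start < elapsed <= end]
    return sum(picked) / len(picked)


# Four lines copied from the output above: two before the insert, two after.
sample = """\
progress: 4.0 s, 10230.3 tps, lat 0.782 ms stddev 0.194
progress: 5.0 s, 10335.0 tps, lat 0.774 ms stddev 0.192
progress: 9.0 s, 9.0 tps, lat 504.104 ms stddev 52.964
progress: 10.0 s, 14.0 tps, lat 797.535 ms stddev 156.082
"""

rows = parse_progress(sample)
before = mean_tps(rows, 0, 5)
after = mean_tps(rows, 8, 10)
print(f"before: {before:.1f} tps, after: {after:.1f} tps, ratio: {before / after:.0f}x")
```

Feeding the complete pgbench log instead of the four-line sample gives the same comparison over the whole run.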
The presence of several channels of synchronous logical replication reduces performance, but not by as much.
Below are results on another machine, with pgbench at scale 10.
Configuration | standalone | 1 async logical replica | 1 sync logical replica | 3 async logical replicas | 3 sync logical replicas |
---|---|---|---|---|---|
TPS | 15k | 13k | 10k | 13k | 8k |
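For comparison with the large-transaction case, the relative cost of each configuration can be worked out from these figures. A small sketch of the arithmetic, using only the rounded TPS values from the table (the function name is illustrative):

```python
# TPS figures copied from the table above, in thousands of transactions per second.
tps_k = {
    "standalone": 15,
    "1 async logical replica": 13,
    "1 sync logical replica": 10,
    "3 async logical replicas": 13,
    "3 sync logical replicas": 8,
}


def slowdown_percent(results, baseline="standalone"):
    """Throughput loss of each configuration relative to the baseline, in percent."""
    base = results[baseline]
    return {
        name: round(100.0 * (base - tps) / base, 1)
        for name, tps in results.items()
    }


for name, loss in slowdown_percent(tps_k).items():
    print(f"{name}: {loss}% below standalone")
```

Even three synchronous replicas cost less than half of the standalone throughput, while the concurrent huge insert reduces it by several orders of magnitude.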
> Only partly true. The output plugin can register a transaction origin filter and use that to say it's entirely uninterested in a transaction. But this only works based on filtering by origins. Not tables.
Yes, I know about the origin filtering mechanism (and we are using it in multimaster).
But I am speaking about the standard pgoutput.c output plugin. Its pgoutput_origin_filter always returns false.
> I imagine we could call another hook in output plugins, "do you care about this table", and use it to skip some more work for tuples that particular decoding session isn't interested in. Skip adding them to the reorder buffer, etc. No such hook currently exists, but it'd be an interesting patch for Pg11 if you feel like working on it.
>> Unfortunately it is not quite clear how to make the WAL sender smarter and let it skip transactions not affecting its publication.
> As noted, it already can do so by origin. Mostly. We cannot totally skip over WAL, since we need to process various invalidations etc. See ReorderBufferSkip.
The problem is that before the end of a transaction we do not know whether it touches this publication or not.
So filtering by origin will not work in this case.
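The limitation can be illustrated with a toy model. This is not PostgreSQL code: the `Txn` class, the helper functions, the use of 0 for the local origin and the table name `huge_table` are all assumptions made for the sketch. It only shows that a per-transaction origin filter has no way to tell a relevant local transaction from an irrelevant one.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    origin_id: int                               # 0 stands for the local origin in this toy model
    tables: list = field(default_factory=list)   # tables touched by the transaction


def origin_filter_always_false(origin_id):
    """Mimics the behaviour described above: never filter anything out."""
    return False


def origin_filter_local_only(origin_id):
    """A hypothetical stricter filter: reject everything that is not local."""
    return origin_id != 0


def txns_to_decode(txns, origin_filter):
    """Keep every transaction whose origin the filter does not reject."""
    return [t for t in txns if not origin_filter(t.origin_id)]


published = {"pgbench_accounts"}
txns = [
    Txn(origin_id=0, tables=["pgbench_accounts"]),  # relevant to the publication
    Txn(origin_id=0, tables=["huge_table"]),        # unrelated bulk insert, same origin
]

for flt in (origin_filter_always_false, origin_filter_local_only):
    decoded = txns_to_decode(txns, flt)
    wasted = [t for t in decoded if not published.intersection(t.tables)]
    print(flt.__name__, "->", len(decoded), "decoded,", len(wasted), "irrelevant")
```

Both filters let the bulk insert through, because it has the same origin as the pgbench transactions.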
I am really not sure that it is possible to skip over WAL. But the particular problem with invalidation records etc. can be solved by always having the WAL sender process these records.
I.e. if a backend inserts an invalidation record, or some other record that must always be processed by the WAL sender, it can always pass the LSN of this record on to the WAL sender.
So the WAL sender will skip only those WAL records that are safe to skip (insert/update/delete records not affecting this publication).
I wonder if there can be some other problems with the WAL sender skipping part of a transaction.
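The skipping rule proposed above can be sketched as a toy model. Again, this is not PostgreSQL code and no such mechanism exists in the server: `WalRecord`, `must_process`, the record kinds and the table names are hypothetical, chosen only to show the intended decision.

```python
from dataclasses import dataclass
from typing import Optional

# Record kinds that carry row changes; every other kind is treated as mandatory.
DML_KINDS = {"insert", "update", "delete"}


@dataclass
class WalRecord:
    lsn: int
    kind: str                     # e.g. "insert", "invalidation", "commit"
    table: Optional[str] = None   # set only for row-change records


def must_process(rec, published_tables):
    """Skip only row changes on unpublished tables; process everything else."""
    if rec.kind in DML_KINDS:
        return rec.table in published_tables
    return True


def records_for_sender(records, published_tables):
    """Return the records this toy WAL sender would still have to decode."""
    return [r for r in records if must_process(r, published_tables)]


published = {"pgbench_accounts"}
wal = [
    WalRecord(100, "insert", "huge_table"),
    WalRecord(110, "invalidation"),
    WalRecord(120, "update", "pgbench_accounts"),
    WalRecord(130, "insert", "huge_table"),
    WalRecord(140, "commit"),
]

kept = records_for_sender(wal, published)
print([r.lsn for r in kept])
```

In this model the invalidation and commit records are always kept, so only the two inserts into the unpublished table are dropped.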
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company