Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Alexey Kondratov
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Msg-id: 126e3a30-a7f5-ede4-2bd0-b217749c8b33@postgrespro.ru
In reply to: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers
On 18.12.2018 1:28, Tomas Vondra wrote:
>> 4) There was a problem with marking top-level transaction as having
>> catalog changes if one of its subtransactions has. It was causing a
>> problem with DDL statements just after subtransaction start (savepoint),
>> so data from new columns is not replicated.
>>
>> 5) Similar issue with schema send. You send schema only once per each
>> sub/transaction (IIRC), while we have to update schema on each catalog
>> change: invalidation execution, snapshot rebuild, adding new tuple cids.
>> So I ended up with adding is_schema_send flag to ReorderBufferTXN, since
>> it is easy to set it inside RB and read in the output plugin. Probably,
>> we have to choose a better place for this flag.
>>
> Hmm. Can you share an example how to trigger these issues?

Test cases inside 014_stream_tough_ddl.pl and the old ones (with the 
streaming=true option added) should reproduce all these issues. In 
general, it happens in a transaction like:

INSERT
SAVEPOINT
ALTER TABLE ... ADD COLUMN
INSERT

then the second INSERT may see an old version of the catalog.
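
For illustration, a minimal sketch of such a transaction (the table and 
column names are made up here, not taken from the actual test):

CREATE TABLE tst (id int);
BEGIN;
INSERT INTO tst (id) VALUES (1);
SAVEPOINT sp;
ALTER TABLE tst ADD COLUMN extra text;
-- without the fix, this row may be decoded against the pre-ALTER schema
INSERT INTO tst (id, extra) VALUES (2, 'new');
COMMIT;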

> Interesting. Any idea where does the extra overhead in this particular
> case come from? It's hard to deduce that from the single flame graph,
> when I don't have anything to compare it with (i.e. the flame graph for
> the "normal" case).

I guess that the bottleneck is in disk operations. You can check the 
logical_repl_worker_new_perf.svg flame graph: disk reads (~9%) and 
writes (~26%) take around 35% of CPU time in total. For comparison, 
please see the attached flame graph for the following transaction:

INSERT INTO large_text
SELECT (SELECT string_agg('x', ',')
        FROM generate_series(1, 2000))
FROM generate_series(1, 1000000);

Execution Time: 44519.816 ms
Time: 98333,642 ms (01:38,334)

where disk I/O is only ~7-8% in total. So we get very roughly the same 
~4-5x performance drop here. JFYI, I am using a machine with an SSD for 
these tests.

Therefore, it may help to write changes on the receiver side in bigger 
chunks rather than writing each change separately.

> So I'm not particularly worried, but I'll look into that. I'd be much
> more worried if there was measurable overhead in cases when there's no
> streaming happening (either because it's disabled or the memory limit
> was not hit).

What I have also just found is that if a table row is large enough to 
be TOASTed, e.g.:

INSERT INTO large_text
SELECT (SELECT string_agg('x', ',')
        FROM generate_series(1, 1000000))
FROM generate_series(1, 1000);

then the logical_work_mem limit is not hit and we neither stream nor 
spill this transaction to disk, even though it is still large. In 
contrast, the transaction above (with 1000000 smaller rows), being 
comparable in size, is streamed. I am not sure that it is easy to add 
proper accounting of TOASTable columns, but it is worth doing.
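
As a rough check (just a sketch; it assumes the large_text table from 
the example above), one can compare the heap size with the total size 
including the TOAST relation to confirm that most of the data is stored 
out of line:

SELECT pg_size_pretty(pg_relation_size('large_text')) AS heap_size,
       pg_size_pretty(pg_table_size('large_text'))    AS size_incl_toast;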

-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

