Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop

From Petr Jelinek
Subject Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop
Date
Msg-id 4a37c0e0-88a5-5d09-19c6-390b8412d3e6@2ndquadrant.com
In reply to BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
Hi,

On 14/10/2020 03:12, Alvaro Herrera wrote:
> On 2020-Oct-12, Petr Jelinek wrote:
> 
>>> However, and this is one reason why I'd welcome Petr/Peter thoughts on
>>> this, I don't really understand what happens in LogicalRepApplyLoop
>>> afterwards with a tablesync worker; are we actually doing anything
>>> useful there, considering that the actual data copy seems to have
>>> occurred in the CopyFrom() call in copy_table?  In other words, by the
>>> time we return control to ApplyWorkerMain with a slot name, isn't the
>>> work all done, and the only thing we need is to synchronize protocol and
>>> close the connection?
>>
>> There are 2 possible states at that point, either tablesync is ahead (when
>> main apply lags or nothing is happening on publication side) or it's behind
>> the main apply. When tablesync is ahead we are indeed done and just need to
>> update the state of the table (which is what the code you removed did, but
>> LogicalRepApplyLoop should do it as well, just a bit later). When it's
>> behind we need to do catchup for that table only which still happens in the
>> tablesync worker. See the explanation at the beginning of tablesync.c, it
>> probably needs some small adjustments after the changes in your first patch.
> 
> ... Ooh, things start to make some sense now.  So how about the
> attached?  There are some not really related cleanups.  (Changes to
> protocol.sgml are still pending.)
> 

It would be nice if the new sentences at the beginning of tablesync.c 
started with an uppercase letter, but that's about as nitpicky as I can be :)

> If I understand correctly, the early exit in tablesync.c is not saving *a
> lot* of time (we don't actually skip replaying any WAL), even if it's
> saving execution of a bunch of code.  So I stand by my position that
> removing the code is better because it's clearer about what is actually
> happening.
> 

I don't really have any problems with the simplification you propose. 
The time saved is probably on the order of hundreds of milliseconds, which 
for table sync is insignificant.
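
To make the two states from the quoted explanation concrete, here is a toy model of the decision a tablesync worker faces once the initial copy has finished. It is only a sketch and not the code in tablesync.c: the names `SyncState`, `parse_lsn` and `state_after_copy` are hypothetical, and the only assumption carried over from PostgreSQL is the textual LSN format (two hexadecimal halves separated by a slash).

```python
from enum import Enum


class SyncState(Enum):
    # Hypothetical labels for the two outcomes discussed in the thread.
    CATCHUP = "catchup"    # tablesync is behind and must replay changes for its table
    SYNCDONE = "syncdone"  # tablesync is at or ahead of the main apply position


def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'XXXXXXXX/XXXXXXXX' (hex) to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def state_after_copy(tablesync_lsn: str, apply_lsn: str) -> SyncState:
    """Decide what is left to do after the initial table copy.

    tablesync_lsn: position the copied snapshot corresponds to (assumed input).
    apply_lsn: position the main apply worker has reached (assumed input).
    """
    if parse_lsn(tablesync_lsn) >= parse_lsn(apply_lsn):
        # Ahead (or equal): nothing to replay, only the table state needs updating.
        return SyncState.SYNCDONE
    # Behind: changes for this one table still have to be applied up to apply_lsn.
    return SyncState.CATCHUP
```

In this model the early exit under discussion corresponds to returning `SYNCDONE` right after the copy; without it, the same conclusion would simply be reached a little later by the regular apply loop.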

-- 
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/


