Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop

From: Tom Lane
Subject: Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop
Date:
Msg-id 912614.1601502758@sss.pgh.pa.us
In response to: Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses: Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List: pgsql-bugs
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2020-Sep-30, Tom Lane wrote:
>> The question that this raises is how the heck did that get past
>> our test suites?  It seems like the error should have been obvious
>> to even the most minimal testing.

> ... yeah, that's indeed an important question.  I'm going to guess that
> the TAP suites are too forgiving :-(

One thing I noticed while trying to trace this down is that while the
initial table sync is happening, we have *both* a regular
walsender/walreceiver pair and a "sync" pair, eg

postgres  905650  0.0  0.0 186052 11888 ?        Ss   17:12   0:00 postgres: logical replication worker for subscription 16398
postgres  905651 50.1  0.0 173704 13496 ?        Ss   17:12   0:09 postgres: walsender postgres [local] idle
postgres  905652  104  0.4 186832 148608 ?       Rs   17:12   0:19 postgres: logical replication worker for subscription 16398 sync 16393
postgres  905653 12.2  0.0 174380 15524 ?        Ss   17:12   0:02 postgres: walsender postgres [local] COPY

Is it supposed to be like that?  Notice also that the regular walsender
has consumed significant CPU time; it's not pinning a CPU like the sync
walreceiver is, but it's eating maybe 20% of a CPU according to "top".
I wonder whether in cases with only small tables (which is likely all
that our tests test), the regular walreceiver manages to complete the
table sync despite repeated(?) failures of the sync worker.

            regards, tom lane
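
Editorial sketch (not part of the original message): the process listing in the message can be sorted into the "regular" and "sync" roles it describes by matching on the process titles. The patterns below are assumptions modelled only on the four titles quoted in the message, and the function name `classify` is hypothetical; other PostgreSQL versions or configurations may produce different titles.

```python
import re

# Assumed title patterns, taken from the ps output quoted in the message.
# \s* tolerates the title with or without a space before the subscription OID.
SYNC_RE = re.compile(
    r"logical replication worker for subscription\s*(\d+) sync (\d+)")
REGULAR_RE = re.compile(
    r"logical replication worker for subscription\s*(\d+)\s*$")
WALSENDER_RE = re.compile(r"walsender \S+ \S+ (.+)$")


def classify(title):
    """Return (kind, details) for one process title string."""
    # Collapse line wraps and repeated blanks before matching.
    title = " ".join(title.split())

    # Check the sync pattern first: it is the more specific of the two
    # worker patterns.
    m = SYNC_RE.search(title)
    if m:
        return ("sync worker",
                {"subscription": int(m.group(1)), "table": int(m.group(2))})

    m = REGULAR_RE.search(title)
    if m:
        return ("regular worker", {"subscription": int(m.group(1))})

    m = WALSENDER_RE.search(title)
    if m:
        # The trailing word is the walsender's current activity,
        # e.g. "idle" or "COPY" in the quoted listing.
        return ("walsender", {"activity": m.group(1)})

    return ("other", {})
```

Applied to the four quoted titles, this should pair one regular worker with an idle walsender and one sync worker with a walsender in COPY, which is the two-pair situation the message asks about.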


