Re: Skipping logical replication transactions on subscriber side

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Skipping logical replication transactions on subscriber side
Дата
Msg-id CAA4eK1K_yFm_vLyNi-unqJNYk1edNkuOrqjBRWVTF5bFaTSsQg@mail.gmail.com
обсуждение исходный текст
Ответ на RE: Skipping logical replication transactions on subscriber side  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Ответы RE: Skipping logical replication transactions on subscriber side  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Re: Skipping logical replication transactions on subscriber side  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
On Wed, Dec 1, 2021 at 8:24 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Tues, Nov 30, 2021 9:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > > Shouldn't we someway check that the error message also starts with
> > > > "duplicate key value violates ..."?
> > >
> > > Yeah, I think it's a good idea to make the checks more specific. That
> > > is, probably we can specify the prefix of the error message and
> > > subrelid in addition to the current conditions: relid and xid. That
> > > way, we can check what error was reported by which workers (tablesync
> > > or apply) for which relations. And both check queries in
> > > test_subscription_error() can have the same WHERE clause.
> >
> > I've attached a patch that fixes this issue. Please review it.
> >
>
> I have a question about the testcase (I could be wrong here).
>
> Is it possible that the race condition happen between apply worker(test_tab1)
> and table sync worker(test_tab2) ? If so, it seems the error("replication
> origin with OID") could happen randomly until we resolve the conflict.
> Based on this, for the following code:
> -----
>     # Wait for the error statistics to be updated.
>     my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
>     $node->poll_query_until(
>         'postgres', $check_sql,
> ) or die "Timed out while waiting for statistics to be updated";
>
> * [1] *
>
>     $check_sql =
>         qq[
> SELECT subname, last_error_command, last_error_relid::regclass,
> last_error_count > 0 ] . $part_sql;
>     my $result = $node->safe_psql('postgres', $check_sql);
>     is($result, $expected, $msg);
> -----
>
> Is it possible that the error("replication origin with OID") happen again at the
> place [1]. In this case, the error message we have checked could be replaced by
> another error("replication origin ...") and then the test fail ?
>

Once we get the "duplicate key violation ..." error before * [1] * via
apply_worker then we shouldn't get replication origin-specific error
because the origin set up is done before starting to apply changes.
Also, even if that or some other happens after * [1] * because of
errmsg_prefix check it should still succeed. Does that make sense?

-- 
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "houzj.fnst@fujitsu.com"
Дата:
Сообщение: RE: Skipping logical replication transactions on subscriber side
Следующее
От: "houzj.fnst@fujitsu.com"
Дата:
Сообщение: RE: Skipping logical replication transactions on subscriber side