Re: logical decoding issue with concurrent ALTER TYPE

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: logical decoding issue with concurrent ALTER TYPE
Дата
Msg-id CAA4eK1+8Kyp+0f7bvZmfW6NWNzYfgbFp-e9vGjkhx_mQ_RoT-g@mail.gmail.com
обсуждение исходный текст
Ответ на logical decoding issue with concurrent ALTER TYPE  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
On Sun, Aug 6, 2023 at 11:06 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> A colleague Drew Callahan (in CC) has discovered that the logical
> decoding doesn't handle syscache invalidation messages properly that
> are generated by other transactions. Here is example (I've attached a
> patch for isolation test),
>
> -- Setup
> CREATE TYPE comp AS (f1 int, f2 text);
> CREATE TABLE tbl3(val1 int, val2 comp);
> SELECT pg_create_logical_replication_slot('s', 'test_decoding');
>
> -- Session 1
> BEGIN;
> INSERT INTO tbl3 (val1, val2) VALUES (1, (1, '2'));
>
>     -- Session 2
>     ALTER TYPE comp ADD ATTRIBUTE f3 int;
>
> INSERT INTO tbl3 (val1, val2) VALUES (1, (1, '2', 3));
> COMMIT;
>
> pg_logical_slot_get_changes() returns:
>
> BEGIN
> table public.tbl3: INSERT: val1[integer]:1 val2[comp]:'(1,2)'
> table public.tbl3: INSERT: val1[integer]:1 val2[comp]:'(1,2)'
> COMMIT
>
> However, the logical decoding should reflect the result of ALTER TYPE
> and the val2 of the second INSERT output should be '(1,2,3)'.
>
> The root cause of this behavior is that while ALTER TYPE can be run
> concurrently to INSERT, the logical decoding doesn't handle cache
> invalidation properly, and it got a cache hit of stale data (of
> typecache in this case). Unlike snapshots that are stored in the
> transaction’s reorder buffer changes, the invalidation messages of
> other transactions are not distributed. As a result, the snapshot
> becomes moot when we get a cache hit of stale data due to not
> processing the invalidation messages again. This is not an issue for
> ALTER TABLE and the like due to 2 phase locking and taking an
> AccessExclusive lock. The issue with DMLs and ALTER TYPE has been
> discovered but there might be other similar cases.
>
> As far as I tested, this issue has existed since v9.4, where the
> logical decoding was introduced, so it seems to be a long-standing
> issue.
>
> The simplest fix would be to invalidate all caches when setting a new
> historical snapshot (i.e. applying CHANGE_INTERNAL_SNAPSHOT) but we
> end up invalidating other caches unnecessarily too.
>

I think this could be quite onerous for workloads having a mix of DDL/DMLs.

> A better fix would be that when decoding the commit of a catalog
> changing transaction, we distribute invalidation messages to other
> concurrent transactions, like we do for snapshots. But we might not
> need to distribute all types of invalidation messages, though.
>

I would prefer the solution on these lines. It would be good if we
distribute only the required type of invalidations.

--
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabien COELHO
Дата:
Сообщение: Re: pgbnech: allow to cancel queries during benchmark
Следующее
От: Amit Langote
Дата:
Сообщение: Re: initial pruning in parallel append