Extensible Rmgr for Table AMs

From: Jeff Davis
Date: 09 November 2021, 02:36:21

Motivation:

I'm working on a columnar compression AM[2]. Currently, it uses generic
xlog, which works for crash recovery and physical replication, but not
logical decoding/replication.

Extensible rmgr would enable the table AM to support its own
redo/decode hooks and WAL format, so that it could support crash
recovery, physical replication, and logical replication.

Background:

I submitted another patch[0] to add new logical records, which could be
used to support logical decoding directly, without the need for
extensible rmgr and without any assumptions about the table AM. This
was designed to be easy to use, but inefficient. Amit raised
concerns[1] about whether it could meet the needs of zheap. Andres
suggested (off-list) that it would be better to just tackle the
extensible rmgr problem.

The idea for extensible rmgr has been proposed before[3]. The biggest
argument against it seemed to be that there was no complete use
case[4], so the worry was that something would be left out. Columnar is
complete enough that I think it qualifies as a good use case.

A subsequent proposal[5] was shot down because of a (potential?) need
for catalog access[6]. The attached patch does not use the catalog;
instead, it relies on table AM authors choosing IDs that don't conflict
with each other. This seems like a reasonable answer, considering that
there will likely be very few table AMs that go far enough to fully
support WAL including decoding.

Are there any other major arguments/objections that I missed?

Proposal:

The attached patch (against v14, so it's easier to test columnar) is
somewhat like a simplified version of [3] combined with refactoring to
make decoding a part of the rmgr.

* adds a new RmgrData method rm_decode
* refactors decode.c to use the new method
* adds a layer of indirection, GetRmgr, to find an rmgr
  * fast path to find builtin rmgr in RmgrTable
  * to find a custom rmgr, traverses list of custom rmgrs that
    are currently loaded (unlikely to ever be more than a few)
  * rmgr IDs from 0-127 are reserved for builtin rmgrs
  * rmgr IDs from 128-255 are reserved for custom rmgrs
  * table AM authors need to avoid collisions between rmgr IDs
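
As a rough illustration of the lookup described in the list above (this is
not the code from the attached patch), a minimal standalone sketch might
look like the following. RmgrData, RmgrTable, rm_decode and GetRmgr are
names taken from the proposal; the struct layout is heavily simplified, and
CustomRmgrEntry, find_custom_rmgr, register_custom_rmgr and the table
entries are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t RmgrId;

#define RM_MAX_BUILTIN_ID 127	/* IDs 0-127: builtin rmgrs */
#define RM_MIN_CUSTOM_ID  128	/* IDs 128-255: custom rmgrs */

/* Simplified stand-in for RmgrData; the real struct has more callbacks. */
typedef struct RmgrData
{
	const char *rm_name;
	void		(*rm_redo) (void *record);
	void		(*rm_decode) (void *ctx, void *record);
} RmgrData;

/* Builtin rmgrs live in a fixed array indexed by ID (entries illustrative). */
static const RmgrData RmgrTable[RM_MAX_BUILTIN_ID + 1] = {
	[0] = {"builtin-0", NULL, NULL},
	[1] = {"builtin-1", NULL, NULL},
};

/* Hypothetical list node for a custom rmgr loaded by an extension. */
typedef struct CustomRmgrEntry
{
	RmgrId		rmid;
	const RmgrData *rmgr;
	struct CustomRmgrEntry *next;
} CustomRmgrEntry;

static CustomRmgrEntry *custom_rmgrs = NULL;

/* Slow path: walk the (short) list of currently loaded custom rmgrs. */
static const RmgrData *
find_custom_rmgr(RmgrId rmid)
{
	for (CustomRmgrEntry *e = custom_rmgrs; e != NULL; e = e->next)
		if (e->rmid == rmid)
			return e->rmgr;
	return NULL;
}

/* Returns 0 on success, -1 if the ID is in the builtin range or taken. */
static int
register_custom_rmgr(CustomRmgrEntry *entry)
{
	assert(entry != NULL && entry->rmgr != NULL);
	if (entry->rmid < RM_MIN_CUSTOM_ID ||
		find_custom_rmgr(entry->rmid) != NULL)
		return -1;
	entry->next = custom_rmgrs;
	custom_rmgrs = entry;
	return 0;
}

/* Fast path for builtin IDs, list traversal for custom IDs. */
static const RmgrData *
GetRmgr(RmgrId rmid)
{
	if (rmid <= RM_MAX_BUILTIN_ID)
		return &RmgrTable[rmid];
	return find_custom_rmgr(rmid);
}
```

In this sketch a collision between two loaded extensions is caught at
registration time, but nothing prevents two table AMs that are never loaded
together from picking the same ID, which is why the proposal leaves
collision avoidance to the AM authors.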

I have tested with columnar using a simple WAL format for logical
decoding only, and I'm still using generic xlog for recovery and
physical replication. I haven't tested the redo path, or how easy it
might be to do something like generic xlog.

Questions:

0. Do we want to go this route, or something simpler like my other
proposal, which introduces new logical record types[0]?

1. I am allocating the custom rmgr list in TopMemoryContext, and it
only works when loading as a part of shared_preload_libraries. This
avoids the need for shared memory in Simon's patch[3]. Is that the
right thing to do?

2. If we go this route, what do we do with generic xlog? It seems like
a half feature, since it doesn't work with logical decoding.

3. If the custom rmgr throws an error during redo, the server won't
start. Should we have a GUC to turn non-builtin redo into a no-op to
reduce the impact of bugs in the implementation of a custom rmgr?
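
To make the idea in question 3 concrete, here is a minimal standalone
sketch of a redo dispatcher that skips custom rmgrs when a setting is
enabled. It is only an illustration under stated assumptions:
skip_custom_rmgr_redo is a hypothetical name standing in for such a GUC,
and dispatch_redo and counting_redo are invented for the example; none of
them come from the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t RmgrId;

#define RM_MIN_CUSTOM_ID 128	/* IDs 128-255: custom rmgrs */

typedef void (*redo_fn) (void *record);

/* Hypothetical setting; a plain variable stands in for a GUC here. */
static bool skip_custom_rmgr_redo = false;

/*
 * Invoke the redo callback for a record, unless the record belongs to a
 * custom rmgr and skipping is enabled. Returns true if redo was invoked.
 */
static bool
dispatch_redo(RmgrId rmid, redo_fn redo, void *record)
{
	assert(redo != NULL);
	if (rmid >= RM_MIN_CUSTOM_ID && skip_custom_rmgr_redo)
		return false;			/* custom redo turned into a no-op */
	redo(record);
	return true;
}

/* Example callback that only counts how often it was called. */
static int	redo_calls = 0;

static void
counting_redo(void *record)
{
	(void) record;
	redo_calls++;
}
```

Builtin rmgrs are never skipped in this sketch. Any change recorded only in
a skipped record is not replayed, so the affected table AM's data would
presumably need to be treated as suspect afterwards.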

4. Do we want to encourage index AMs to use this mechanism as well? I
didn't really look into how suitable it is, but at a high level it
seems reasonable.

Regards,
Jeff Davis

[0] https://postgr.es/m/20ee0b0ae6958804a88fe9580157587720faf664.camel@j-davis.com
[1] https://postgr.es/m/CAA4eK1JVDnbQ80ULdZuhzQkzr_yYhfON-tg%3Dd1U7aWjK_R1ixQ%40mail.gmail.com
[2] https://github.com/citusdata/citus/tree/master/src/backend/columnar
[3] https://postgr.es/m/1229541840.4793.79.camel%40ebony.2ndQuadrant
[4] https://postgr.es/m/20992.1232667957%40sss.pgh.pa.us
[5] https://postgr.es/m/1266774840.7341.29872.camel%40ebony
[6] https://postgr.es/m/26134.1266776040%40sss.pgh.pa.us
