Re: Pluggable toaster

Поиск
Список
Период
Сортировка
От Nikita Malakhov
Тема Re: Pluggable toaster
Дата
Msg-id CAN-LCVP5zdz4s_MRn+a8f_65MY6Wjv+k_ocmS+8rzyO+y=bUSw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Pluggable toaster  (vignesh C <vignesh21@gmail.com>)
Ответы Re: Pluggable toaster  (vignesh C <vignesh21@gmail.com>)
Список pgsql-hackers
Hi!

Thank you for your attention.
I've rebased the patchset onto the latest master (from 07.01), but the second part is still
in pre-patch shape - it is working, but tests are failing due to changes in TOAST relations
logic - I haven't adapted 'em yet.

The patchset consists of 4 patches:
The first part consists of 2 patches and uses pg_attribute extension, this was an arguable
decision and it has serious downside, so I've decided to make another revision (the second part):
0001_toaster_interface_v25.patch - Pluggable TOAST API, reference TOAST left intact
0002_toaster_default_v25.patch - Reference TOAST is routed via TOAST API

The second part - TOAST API revision that does not extend pg_attribute, is more flexible
and allows a lot more extensibility to the TOAST functionality - instead of storing Toaster
OID in "atttoaster" attribute of pg_attribute - we use new special catalog table pg_toastrel,
which keeps all Toaster assignments history (pelase check my message from 27.12), and
allows to drop Toasters safely. This part is in pre-patch state, I've send it for the review
and feedback on the general approach:
0003_pg_toastrel_control_v25.patch - introduces pg_toastrel catalog relation, which stores
Toaster assignment logic;
0004_drop_toaster_v25.patch - extends SQL syntax with a safe DROP TOASTER command.



On Wed, Jan 4, 2023 at 12:52 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, 27 Dec 2022 at 02:32, Nikita Malakhov <hukutoc@gmail.com> wrote:
>
> Hi hackers!
>
>     Pluggable TOAST API with catalog control table PG_TOASTREL - pre-patch.
>
>     Pluggable TOAST - TOAST API rework - introduce PG_TOASTREL catalog
>     relation containing TOAST dependencies. NOTE: here is a pre-patch, not
>     a final version, just to introduce another approach to a Pluggable TOAST
>     idea, it needs some cleanup, tests rework and some improvements, so
> the main
>     goal of this message is to introduce this different approach. This is the
>     last patch and it is installed on top of older TOAST API patches, so here
>     are 3 patches attached:
>
>     0001_toaster_interface_v24.patch.gz
>     This patch introduces new custom TOAST pointer, Pluggable TOAST API and
>     Toaster support functions - cache, lookup, and new attribute 'atttoaster'
>     in PG_ATTRIBUTE table which stores Toaster OID;
>
>     0002_toaster_default_v24.patch.gz
>     Here the default TOAST mechanics is routed via TOAST API, but still using
>     varatt_external TOAST Pointer - so this step does not change overall TOAST
>     mechanics unless you plug in some custom Toaster;
>
>     0003_pg_toastrel_table_v24.patch.gz
>     Here Pluggable TOAST is reworked not to modify PG_ATTRIBUTE, instead this
>     patch introduces new catalog table PG_TOASTREL with its support functions.
>
>     Motivation: PG_ATTRIBUTE is already the largest catalog table. We try
>     to avoid modification of existing catalog tables, and previous solution
>     had several problems:
>     1) New field in PG_ATTRIBUTE;
>     2) No opportunity to save all Toaster assignment history;
>     3) No opportunity to have multi-TOAST tables assigned to a relation or
>     an attribute;
>     4) Toaster cannot be dropped - to drop Toaster we need to scan all tables
>     with TOASTable columns.
>
>     Instead of extending PG_ATTRIBUTE with ATTTOASTER attribute, we decided
>     to store all Table-Toaster relations in a new catalog table PG_TOASTREL.
>     This cancels the necessity to modify catalog table PG_ATTRIBUTE, allows to store
>     full history of Toasters assignments, and allows to drop unused Toasters
>     from system.
>
>     Toasters are assigned to a table column. ALTER TABLE ... SET TOASTER command
>     creates a new row in PG_TOASTREL. To distinguish sequential assignments,
>     PG_TOASTREL has special attribute - 'version'. With each new assignment
>     its 'version' attribute is increased, and the row with the biggest 'version'
>     is the current Toaster for a column.
>
>     This approach allows to provide different behavior, even for a single table
>     we can have one TOAST table for the whole relation (as it is in current TOAST
>     mechanics), or we can have separate TOAST relation(s) for each TOASTable
>     column - this requires a slight modification if current approach. The latter
>     also allows simple invariant of column-oriented storage.
>
>     Also, this approach makes PG_ATTRIBUTE attribute RELTOASTRELID obsolete -
>     current mechanics allows only 1 TOAST table for relation, which limits
>     greatly TOAST capabilities - because all TOASTed columns are stored in this
>     table, which in its turn limits overall base relation capacity.
>
>     In future, this approach allows us to have a kind of near infinite TOAST
>     storage, with ability to store large values (larger than 512 Mbytes),
>     auto-creation of TOAST table only when the first value is actually TOASTed,
>     and much more.
>
>     The approach, along with the TOAST API itself, introduces the catalog table
>     PG_TOASTREL with a set of support functions.
>
>     PG_TOASTREL definition:
>
>     postgres@postgres=# \d+ pg_toastrel;
>                                                 Table "pg_catalog.pg_toastrel"
>        Column    |   Type   | Collation | Nullable | Default | Storage | Toaster | Compression | Stats target | Description
>     -------------+----------+-----------+----------+---------+---------+---------+-------------+--------------+-------------
>     oid          | oid      |           | not null |         | plain   |         |             |              |
>     toasteroid   | oid      |           | not null |         | plain   |         |             |              |
>     relid        | oid      |           | not null |         | plain   |         |             |              |
>     toastentid   | oid      |           | not null |         | plain   |         |             |              |
>     attnum       | smallint |           | not null |         | plain   |         |             |              |
>     version      | smallint |           | not null |         | plain   |         |             |              |
>     relname      | name     |           | not null |         | plain   |         |             |              |
>     toastentname | name     |           | not null |         | plain   |         |             |              |
>     flag         | "char"   |           | not null |         | plain   |         |             |              |
>     toastoptions | "char"   |           | not null |         | plain   |         |             |              |
>     Indexes:
>     "pg_toastrel_oid_index" PRIMARY KEY, btree (oid)
>     "pg_toastrel_name_index" UNIQUE CONSTRAINT, btree (toasteroid, relid, version, attnum)
>     "pg_toastrel_rel_index" btree (relid, attnum)
>     "pg_toastrel_tsr_index" btree (toasteroid)
>     Access method: heap
>    (This is not a final definition)
>
>     Where:
>     oid - PG_TOASTREL record ID
>     toasteroid - Toaster OID from PG_TOASTER
>     relid - base relation OID
>     toastentid - TOAST entity OID (not necessary to be a table)
>     attnum - TOASTable attribute index in base relation
>     version - Toaster assignment version - sequence of assignments
>     relname - base relation name (optional)
>     toastentname - TOAST entity name (optional)
>     flag - special field to mark rows, currently only the value 'x' is used
>     to mark unused rows
>
>     PG_TOASTREL unique key consists of:
>     toasteroid, relid, attnum, version
>
>     All currently assigned Toasters are additionally stored in cache for
>     fast access. When new row is being TOASTed - Toaster, relation Oid,
>     TOAST relation Oid, column index are added into Toastrel Cache for fast
>     access.
>
>     Create table, change Toaster, change column type were changed to
>     add new rows in PG_TOASTREL, to use this table and cache instead
>     of altering pg_attribute with new column. For table creation from
>     scratch when no TOAST tables were created is used special condition
>     with version=0.
>
>     DROP TABLE drops rows in PG_TOASTREL for this table. This allows to -
>     DROP TOASTER command added. When no rows with the according Toaster are
>     present in PG_TOASTREL - it is considered unused and thus could be safely
>     dropped from the system.
>
>     Default toaster 'deftoaster' (reference TOAST mechanics) cannot be dropped.
>
> Working branch:
> https://github.com/postgrespro/postgres/tree/toastapi_with_ctl
>
> Would be glad to get any proposals and objections.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:
patching file src/backend/utils/cache/syscache.c
=== Applying patches on top of PostgreSQL commit ID
33ab0a2a527e3af5beee3a98fc07201e555d6e45 ===
=== applying patch ./0001_toaster_interface_v24.patch
patching file contrib/test_decoding/expected/ddl.out
Hunk #2 FAILED at 874.
1 out of 2 hunks FAILED -- saving rejects to file
src/backend/utils/cache/syscache.c.rej

[1] - http://cfbot.cputube.org/patch_41_3490.log

Regards,
Vignesh


--
Regards,
Nikita Malakhov
Postgres Professional 
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Joseph Koshakow
Дата:
Сообщение: Re: Infinite Interval
Следующее
От: Nikita Malakhov
Дата:
Сообщение: Re: How to define template types in PostgreSQL