[PATCH] Opclass parameters

Поиск
Список
Период
Сортировка
От Nikita Glukhov
Тема [PATCH] Opclass parameters
Дата
Msg-id d22c3a18-31c7-1879-fc11-4c1ce2f5e5af@postgrespro.ru
обсуждение исходный текст
Ответы Re: [PATCH] Opclass parameters
Список pgsql-hackers
Hi hackers.

I would like to present patch set implementing opclass parameters.

This feature was recently presented at pgconf.ru:
http://www.sai.msu.su/~megera/postgres/talks/opclass_pgconf.ru-2018.pdf

A analogous work was already done by Nikolay Shaplov two years ago:
https://www.postgresql.org/message-id/5213596.TqFRiqmCTe%40nataraj-amd64
But this patches are not based on it, although they are very similar.


Opclass parameters can give user ability to:
  * Define the values of the constants that are hardcoded now in the opclasses
    depending on the indexed data.
  * Specify what to index for non-atomic data types (arrays, json[b], tsvector).
    Partial index can only filter whole rows.
  * Specify what indexing algorithm to use depending on the indexed data.


Description of patches:

1. Infrastructure for opclass parameters.

SQL grammar is changed only for CREATE INDEX statement: parenthesized parameters
in reloptions format are added after column's opclass name.  Default opclass can
be specified with DEFAULT keyword:

CREATE INDEX idx ON tab USING am (
    {expr {opclass | DEFAULT} ({name=value} [,...])} [,...]
);

Example for contrib/intarray:

CREATE INDEX ON arrays USING gist (
   arr gist__intbig_ops (siglen = 32),
   arr DEFAULT (numranges = 100)
);

\d arrays
                 Table "public.arrays"
  Column |   Type    | Collation | Nullable | Default
--------+-----------+-----------+----------+---------
  arr    | integer[] |           |          |
Indexes:
     "arrays_arr_arr1_idx" gist (arr gist__intbig_ops (siglen='32'), arr gist__int_ops (numranges='100'))


I decided to store parameters in text[] column pg_index.indoptions near to
existing columns like indkey, indcollation, indclass, indoption. I-th element of
indoptions[] is a text array of parameters of i-th index column serialized into
a string.  Each parameter is stored as 'name=value' text string like ordinal
reloptions.  There is another way to store opclass parameters: store them in
the existing column pg_attribute.attoptions (as it was done by Nikolay Shaplov)
and there will be no need to serialize reloptions to a text array element.

Example query showing how parameters are stored:

SELECT ARRAY(
     SELECT (pg_identify_object('pg_opclass'::regclass, opcid, 0)).name
     FROM unnest(indclass::int[]) opcid
   ) indclass, indoptions
FROM pg_index
WHERE indoptions IS NOT NULL;

              indclass             |             indoptions
----------------------------------+------------------------------------
  {gist__intbig_ops,gist__int_ops} | {"{siglen=32}","{numranges=100}"}
  {jsonb_path_ops}                 | {"{projection=$.tags[*].term}"}
(2 rows)


Each access method supporting opclass parameters specifies amopclassoptions
routine for transformation of text[] parameters datum into a binary bytea
structure which will be cached in RelationData and IndexOptInfo structures:

typedef bytea *(*amopclassoptions_function) (
    Relation index, AttrNumber colnum, Datum indoptions, bool validate
);


If access method wants simply to delegate parameters processing to one of
column opclass's support functions, then it can use
index_opclass_options_generic() subroutine in its amopclassoptions
implementation:

bytea *index_opclass_options_generic(
    Relation relation, AttrNumber attnum, uint16 procnum,
    Datum indoptions, bool validate
);

This support functions must have the following signature:
internal (options internal, validate bool).
Opclass parameters are passed as a text[] reloptions datum, returned pointer to
a bytea structure with parsed parameter values.

Opclass can use new functions parseLocalRelOptions(),
parseAndFillLocalRelOptions() for reloptions parsing.  This functions differ
from the standard parseRelOptions() in that a local array of reloptions
descriptions is passed here, not a global relopt_kind.  But it seems that
reloptions processing still needs deeper refactoring like the one already done
by Nikolay Shaplov (https://www.postgresql.org/message-id/flat/2146419.veIEZdk4E4%40x200m#2146419.veIEZdk4E4@x200m).


2. Opclass parameters support in GiST indices.

Parametrized GiST opclass specifies optional 10th (GIST_OPCLASSOPT_PROC)
support function with the following signature:
internal (options internal, validate bool)

Returned parsed bytea pointer with parameters will be passed to all support
functions in the last argument.

3. Opclass parameters support in GIN indices.

Everything is the same as for GiST, except for the optional support
function number which is 7 (GIN_OPCLASSOPTIONS_PROC) here.

4. Opclass parameters for GiST tsvector_ops
5. Opclass parameters for contrib/intarray
6. Opclass parameters for contrib/ltree
7. Opclass parameters for contrib/pg_trgm
8. Opclass parameters for contrib/hstore

This 5 patches for GiST opclasses are very similar: added optional 'siglen'
parameter for specifying signature length.  Default signature length is left
equal to the hardcoded value that was here before. Also added 'numranges'
parameter for gist__int_ops.



We also have two more complex unfinished patches for GIN opclasses which
should be posted in separate threads:

  * tsvector_ops: added parameter 'weights' for specification of indexed
    lexeme's weight groups.  This parameter can reduce index size and its
    build/update time and can also eliminate recheck.  By default, all weights
    are indexed within the same group.
   
  * jsonb_ops: added jsonpath parameter 'projection' for specification of
    indexed paths in jsonb (this patch depends on SQL/JSON jsonpath patch).
    Analogically to tsvector_ops, this parameter can reduce index size and its
    build/update time, but can not eliminate recheck.


-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Kazimiers
Дата:
Сообщение: Re: Unexpected behavior with transition tables in update statementtrigger
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: [HACKERS] MERGE SQL Statement for PG11