Обсуждение: Please improve "Limitations" section for GIN indexes

Поиск
Список
Период
Сортировка

Please improve "Limitations" section for GIN indexes

От
Marcin Barczynski
Дата:
We've learned the hard way that keys are never removed in GIN indexes, leading to infinite index size growth in certain scenarios. The PostgreSQL documentation doesn't mention it even in the "Limitations" section for GIN indexes (https://www.postgresql.org/docs/15/gin-limit.html).

The only place I found info about the unusual behavior is README file in the source code: https://github.com/postgres/postgres/blob/master/src/backend/access/gin/README#L391.


We keep a tree-like structure in the database and use GIN indexes to find descendants of a node quickly. The simplified table looks as follows:

node_id | ancestor_ids
--------|-------------
 1      | []
 2      | [1]
 15     | [1, 2]

node_id is a sequence, and nodes are often added and removed from the table: today node_ids are in the range from 1 to 100K but next month it will be 500K-600K. Because GIN index never removes entry keys, it will contain keys from 1 to 600K. In our case the GIN index became 2.5 times larger than the table - that's the hard-way part.

How about sparing others the hard-way part and explicitly mentioning the unusual behavior in the "Limitations" section for GIN indexes?

Best regards,
Marcin Barczyński