Preventing free space from being reused

Поиск

Список

Период

Сортировка

От	Noah Bergbauer
Тема	Preventing free space from being reused
Дата	12 февраля 2021 г. 14:41:37
Msg-id	CABjy+RhbFu_Hs8ZEiOzaPaJSGB9jqFF0gDU5gtwCLiurG3NLjQ@mail.gmail.com обсуждение исходный текст
Ответы	Re: Preventing free space from being reused
Список	pgsql-general

Дерево обсуждения

Hello,

I am working on a system to store and analyze JSON-based log data. The idea is to store tuples with a monotonically increasing timestamp (i.e. CURRENT_TIMESTAMP) in a table and run some queries that focus on specific time windows (~one hour). It seems to me that a BRIN index is designed exactly for this type of application.

However, I find that a BRIN index with standard settings (pages_per_range = 128) with very simple filters (e.g. ts > (CURRENT_TIMESTAM - INTERVAL '1 hour')) causes ~20x more pages to be scanned than I would expect. A majority of these tuples is then discarded due to index rechecking. Inspecting the BRIN pages using the pageinspect extension reveals the following problem: it seems that if a page is left with some free space that can not be filled right away (because the tuples being inserted are too large), then this hole can be filled at some arbitrary later point in time (sometimes hours later) once a small enough tuple comes along. This substantially reduces the BRIN index's effectiveness.

I confirmed this theory by CLUSTERing the table using a temporary btree index. Suddenly the query performance exactly matched my estimates.

The JSON data structure is often similar, so the table is stored on ZFS with compression. Hence, filling these holes brings no benefit - they were taking up no space thanks to the compression. On the other hand, rewriting old pages in order to fill up holes also creates a lot of fragmentation because of ZFS's copy-on-write semantics.

In summary, all of these problems would be solved if there was some way to stop the database system from ever reusing free space.

Bonus question: what's the best TOAST configuration for this use case? Is there any point in out-of-line storage when all tuples are going to be quite large (i.e. > 1kB)? Is there any benefit in having postgresql compress the data when ZFS runs compression regardless?

Thank you,

Noah Bergbauer

В списке pgsql-general по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Preventing free space from being reused