Issue with "start another primitive scan" logic during nbtree array advancement

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Issue with "start another primitive scan" logic during nbtree array advancement
Дата
Msg-id CAH2-Wz=DyHbcg7o6zXqzyiin8WE8vzk4tvU8Lrnh-a=EAvO0TQ@mail.gmail.com
обсуждение исходный текст
Список pgsql-hackers
While working on skip scan, I stumbled upon a bug on HEAD. This is an
issue in my commit 5bf748b8, "Enhance nbtree ScalarArrayOp execution".
The attached test case (repro_wrong_prim.sql) causes an assertion
failure on HEAD. Here's the stack trace:

TRAP: failed Assert("so->keyData[opsktrig].sk_strategy !=
BTEqualStrategyNumber"), File:
"../source/src/backend/access/nbtree/nbtutils.c", Line: 2475, PID:
1765589
[0x55942a24db8f] _bt_advance_array_keys:
/mnt/nvme/postgresql/patch/build_meson_dc/../source/src/backend/access/nbtree/nbtutils.c:2475
[0x55942a24bf22] _bt_checkkeys:
/mnt/nvme/postgresql/patch/build_meson_dc/../source/src/backend/access/nbtree/nbtutils.c:3797
[0x55942a244160] _bt_readpage:
/mnt/nvme/postgresql/patch/build_meson_dc/../source/src/backend/access/nbtree/nbtsearch.c:2221
[0x55942a2434ca] _bt_first:
/mnt/nvme/postgresql/patch/build_meson_dc/../source/src/backend/access/nbtree/nbtsearch.c:1888
[0x55942a23ef88] btgettuple:
/mnt/nvme/postgresql/patch/build_meson_dc/../source/src/backend/access/nbtree/nbtree.c:259

The problem is that _bt_advance_array_keys() doesn't take sufficient
care at the point where it decides whether its call to
_bt_check_compare against finaltup (with the scan direction flipped
around) indicates that another primitive index scan is required. The
final decision is conditioned on rules about how the scan key offset
sktrig that triggered the call to _bt_advance_array_keys() relates to
the scan key offset that was set by the _bt_check_compare finaltup
comparison. This was fragile. It breaks with this test case because of
fairly subtle conditions around when and how the arrays advance, the
layout of the relevant leaf page, and the placement of inequality scan
keys.

When assertions are disabled, we do multiple primitive index scans
that land on the same leaf page, which isn't supposed to be possible
anymore. The query gives correct answers, but this behavior is
definitely wrong (it is simply supposed to be impossible now, per
5bf748b8's commit message).

Attached is a draft bug fix patch. It nails down the test by simply
testing "so->keyData[opsktrig].sk_strategy != BTEqualStrategyNumber"
directly, rather than comparing scan key offsets. This is a far
simpler and far more direct approach.

You might wonder why I didn't do it like this in the first place. It
just worked out that way. The code in question was written before I
changed the design of _bt_check_compare (in the draft patch that
became commit 5bf748b8). Up until not that long before the patch was
committed, _bt_check_compare would set "continuescan=false" for
non-required arrays. That factor made detecting whether or not the
relevant _bt_check_compare call had in fact encountered a required
inequality of the kind we need to detect (to decide on whether to
start another primitive index scan) difficult and messy. However, the
final committed patch simplified _bt_check_compare, making the
approach I've taken in the bug fix patch possible. I just never made
the connection before now.

-- 
Peter Geoghegan

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: Parallel CREATE INDEX for GIN indexes
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: pg_combinebackup --clone doesn't work