Re: index prefetching

Поиск

Список

Период

Сортировка

От	Tomas Vondra
Тема	Re: index prefetching
Дата	14 августа 02:22:28
Msg-id	ffd81af3-a2dd-4f41-b2be-7547cd61c6e3@vondra.me обсуждение исходный текст
Ответ на	Re: index prefetching (Peter Geoghegan <pg@bowt.ie>)
Список	pgsql-hackers

Дерево обсуждения

On 8/13/25 23:36, Peter Geoghegan wrote:
> On Wed, Aug 13, 2025 at 1:01 PM Tomas Vondra <tomas@vondra.me> wrote:
>> This seems rather bizarre, considering the two tables are exactly the
>> same, except that in t2 the first column is negative, and the rows are
>> fixed-length. Even heap_page_items says the tables are exactly the same.
>>
>> So why would the index get so different like this?
> 
> In the past, when I required *perfectly* deterministic results for
> INSERT INTO test_table ... SELECT * FROM source_table bulk inserts
> (which was important during the Postgres 12 and 13 nbtree work), I
> found it necessary to "set synchronize_seqscans=off". If I was writing
> a test such as this, I'd probably do that defensively, even if it
> wasn't clear that it mattered. (I'm also in the habit of using
> unlogged tables, because VACUUM tends to set their pages all-visible
> more reliably than equivalent logged tables, which I notice that
> you're also doing here.)
> 

The tables are *exactly* the same, block by block. I double checked that
by looking at a couple pages, and the only difference is the inverted
value of the "a" column.

> That said, I *think* that the "locally shuffled" heap TID pattern that
> we see with "t2"/"idx2" is mostly (perhaps entirely) caused by the way
> that you're inverting the indexed column's value when initially
> generating "t2". A given range of values such as "1 through to 4"
> becomes "-4 through to -1" as their tuples are inserted into t2.

Right.

> You're effectively inverting the order of the bigint indexed column
> "a" -- but you're *not* inverting the order of the imaginary
> tie-breaker heap column (it *remains* in ASC heap TID order in "t2").
> 

I have no idea what I'm supposed to do about that. As you say the
tie-breaker is imaginary, selected by the system on my behalf. If it
works like this, doesn't that mean it'll have this unfortunate effect on
all data sets with negative correlation?

> In general, when doing this sort of analysis, I find it useful to
> manually verify that the data that I generated matches my
> expectations. Usually a quick check with pageinspect is enough. I'll
> just randomly select 2 - 3 leaf pages, and make sure that they all
> more or less match my expectations.
> 

I did that for the heap, and that's just as I expected. But the effect
on the index surprised me.


regards

-- 
Tomas Vondra

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: index prefetching