Re: index prefetching
От | Tomas Vondra |
---|---|
Тема | Re: index prefetching |
Дата | |
Msg-id | ffd81af3-a2dd-4f41-b2be-7547cd61c6e3@vondra.me обсуждение исходный текст |
Ответ на | Re: index prefetching (Peter Geoghegan <pg@bowt.ie>) |
Список | pgsql-hackers |
On 8/13/25 23:36, Peter Geoghegan wrote: > On Wed, Aug 13, 2025 at 1:01 PM Tomas Vondra <tomas@vondra.me> wrote: >> This seems rather bizarre, considering the two tables are exactly the >> same, except that in t2 the first column is negative, and the rows are >> fixed-length. Even heap_page_items says the tables are exactly the same. >> >> So why would the index get so different like this? > > In the past, when I required *perfectly* deterministic results for > INSERT INTO test_table ... SELECT * FROM source_table bulk inserts > (which was important during the Postgres 12 and 13 nbtree work), I > found it necessary to "set synchronize_seqscans=off". If I was writing > a test such as this, I'd probably do that defensively, even if it > wasn't clear that it mattered. (I'm also in the habit of using > unlogged tables, because VACUUM tends to set their pages all-visible > more reliably than equivalent logged tables, which I notice that > you're also doing here.) > The tables are *exactly* the same, block by block. I double checked that by looking at a couple pages, and the only difference is the inverted value of the "a" column. > That said, I *think* that the "locally shuffled" heap TID pattern that > we see with "t2"/"idx2" is mostly (perhaps entirely) caused by the way > that you're inverting the indexed column's value when initially > generating "t2". A given range of values such as "1 through to 4" > becomes "-4 through to -1" as their tuples are inserted into t2. Right. > You're effectively inverting the order of the bigint indexed column > "a" -- but you're *not* inverting the order of the imaginary > tie-breaker heap column (it *remains* in ASC heap TID order in "t2"). > I have no idea what I'm supposed to do about that. As you say the tie-breaker is imaginary, selected by the system on my behalf. If it works like this, doesn't that mean it'll have this unfortunate effect on all data sets with negative correlation? > In general, when doing this sort of analysis, I find it useful to > manually verify that the data that I generated matches my > expectations. Usually a quick check with pageinspect is enough. I'll > just randomly select 2 - 3 leaf pages, and make sure that they all > more or less match my expectations. > I did that for the heap, and that's just as I expected. But the effect on the index surprised me. regards -- Tomas Vondra
В списке pgsql-hackers по дате отправления: