Re: Estimates on partial index

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема Re: Estimates on partial index
Дата
Msg-id CAMkU=1zyFp3JCFsj26J7suTfA1OJ-+VxSwf1q1FpicG=E7Eubg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Estimates on partial index  (Victor Yegorov <vyegorov@gmail.com>)
Ответы Re: Estimates on partial index  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Re: Estimates on partial index  (Victor Yegorov <vyegorov@gmail.com>)
Список pgsql-performance
On Thu, Aug 18, 2016 at 11:55 AM, Victor Yegorov <vyegorov@gmail.com> wrote:
> 2016-08-18 18:59 GMT+03:00 Jeff Janes <jeff.janes@gmail.com>:
>>
>> Both plans touch the same pages.  The index scan just touches some of
>> those pages over and over again.  A large setting of
>> effective_cache_size would tell it that the page will most likely
>> still be in cache when it comes back to touch it again, meaning the
>> cost of doing so will be small, basically free.
>>
>> > and in a typical situation those are cold.
>>
>> But they won't be, because it is heating them up itself, and
>> effective_cache_size says that stay then hot for the duration of the
>> query.
>
>
> But IndexScan means, that not only index, table is also accessed.
> And although index is small get's hot quite quickly (yes, e_c_s is 96GB on
> this dedicated box),
> table is not.

Both types of scans have to touch the same set of pages.  The bitmap
hits all of the needed index pages first and memorizes the relevant
results, then hits all the needed table pages.  The regular index scan
keeps jumping back and forth from index to table. But they are the
same set of pages either way.

With a regular index scan, if the same table page is pointed to from
40 different places in the index, then it will be touched 40 different
times.  But at least 39 of those times it is going to already be in
memory.  The bitmap scan will touch the page just one and deal with
all 40 entries.


>  And this clearly adds up to the total time.

That isn't clear at all from the info you gave.  You would have to set
track_io_timing=on in order to show something like that.  And we don't
know if you ran each query once in the order shown, and posted what
you got (with one warming the cache for the other); or if you have ran
each repeatedly and posted representative examples with a pre-warmed
cache.


> I am wondering, if heap page accesses are also accounted for during
> planning.

It does account for them, but perhaps not perfectly.  See "[PERFORM]
index fragmentation on insert-only table with non-unique column" for
some arguments on that which might be relevant to you.

If you can come up with a data generator which creates data that
others can use to reproduce this situation, we can then investigate it
in more detail.

Cheers,

Jeff


В списке pgsql-performance по дате отправления:

Предыдущее
От: Victor Yegorov
Дата:
Сообщение: Re: Estimates on partial index
Следующее
От: Jim Nasby
Дата:
Сообщение: Re: Big data INSERT optimization - ExclusiveLock on extension of the table