Re: Bitmap table scan cost per page formula

From: Jeff Janes
Subject: Re: Bitmap table scan cost per page formula
Msg-id: CAMkU=1xaLs8RnWrY_jLAc_ON--5sT0-QCEbBcQCjR5oxMHc4tg@mail.gmail.com
In reply to: Re: Bitmap table scan cost per page formula  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Bitmap table scan cost per page formula  ("Tels" <nospam-pg-abuse@bloodgate.com>)
List: pgsql-hackers
On Wed, Dec 20, 2017 at 2:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 20, 2017 at 4:20 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> It is not obvious to me that the parabola is wrong.  I've certainly seen cases where reading every 2nd or 3rd block (either stochastically, or by modulus) actually does take longer than reading every block, because it defeats read-ahead.  But it depends a lot on your kernel version, your kernel settings, your file system, and probably other things as well.

> Well, that's an interesting point, too.  Maybe we need another graph that also shows the actual runtime of a bitmap scan and a sequential scan.

I did some low-level IO benchmarking, and reading every 3rd block actually comes out 13 times slower than reading every block, on CentOS 6.9 with ext4 and the setting:
blockdev --setra 8192 /dev/sdb1
This is on some virtualized storage whose details I don't know, but it behaves as if it were RAID/JBOD with around 6 independent spindles.
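
If anyone wants to reproduce this, it is worth recording the read-ahead setting before changing it; a minimal sketch, assuming the same device name:

## check the current read-ahead window (in 512-byte sectors)
sudo blockdev --getra /dev/sdb1
## set it to 8192 sectors (4MB) for the test
sudo blockdev --setra 8192 /dev/sdb1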

If I pick the 1/3 of the blocks to read stochastically rather than by modulus, it is only 2 times slower than reading all of them.  I assume that is because occasional reads that happen to be adjacent to each other do let read-ahead kick in, while evenly spaced, never-adjacent reads do not.  This is probably a better model of how bitmap table scans actually work, as there is no general reason to think the pages they fetch would be evenly spaced and never adjacent.  So this result is in reasonable agreement with how the current cost estimation works: the parabola peaks at about twice the cost of the sequential scan.
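
For reference, the interpolation behind that parabola, as it stands in compute_bitmap_pages() in costsize.c (quoting from memory, so check the source), charges each fetched page:

cost_per_page = random_page_cost
                - (random_page_cost - seq_page_cost) * sqrt(pages_fetched / T)

where T is the total number of pages in the table, so the per-page charge slides from fully random down toward sequential as the fraction of the table fetched grows.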

I used a file of about 10GB, because I happened to have one lying around.

## read every block ($_%3>5 is never true)
sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
time perl -le 'open my $fh, "rand" or die; foreach (1..1300000) {$x=""; next if $_%3>5; sysseek $fh,$_*8*1024,0 or die $!; sysread $fh, $x,8*1024; print length $x} '|uniq -c

1295683 8192
   4317 0
real    0m38.329s

## read every 3rd block
sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
time perl -le 'open my $fh, "rand" or die; foreach (1..1300000) {$x=""; next if $_%3>0; sysseek $fh,$_*8*1024,0 or die $!; sysread $fh, $x,8*1024; print length $x} '|uniq -c
 431894 8192
   1439 0
real    8m54.698s

## read a random 1/3 of blocks (caches dropped first, as above)
sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
time perl -wle 'open my $fh, "rand" or die; foreach (1..1300000) {$x=""; next if rand()>0.3333; sysseek $fh,$_*8*1024,0 or die $!; sysread $fh, $x,8*1024; print length $x} '|uniq -c
 431710 8192
   1434 0
real    1m23.130s
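
To summarize the three runs against the ~10GB file:

read every block                0m38.3s   (baseline)
read every 3rd block, modulus   8m54.7s   (~14x baseline)
read a random 1/3 of blocks     1m23.1s   (~2.2x baseline)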

Dropping the caches is a reasonable way to do this type of benchmark, because it simulates what would happen if your data set was much larger than RAM, without needing to actually use a data set much larger than RAM.
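
For a purely sequential baseline on the same file, something like this should work (same cache-dropping step first; only the portable dd options are used):

sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
time dd if=rand of=/dev/null bs=8192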

It would be interesting to see what other people get for similar low-level tests, as well as for actual bitmap scans.
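
For the latter, something along these lines should force the bitmap path, against a hypothetical table rand_table with an index on val (the enable_* settings knock out the competing plan types; the constant just selects roughly 1/3 of the rows):

sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
psql -c "set enable_seqscan = off; set enable_indexscan = off; explain (analyze, buffers) select count(*) from rand_table where val < 433333;"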

Cheers,

Jeff
