Re: Improving N-Distinct estimation by ANALYZE

From: Greg Stark
Subject: Re: Improving N-Distinct estimation by ANALYZE
Date:
Msg-id: 874q4c230p.fsf@stark.xeocode.com
In reply to: Re: Improving N-Distinct estimation by ANALYZE  (Greg Stark <gsstark@mit.edu>)
Responses: Re: Improving N-Distinct estimation by ANALYZE  (Simon Riggs <simon@2ndquadrant.com>)
List: pgsql-hackers
Greg Stark <gsstark@MIT.EDU> writes:

> Well my theory was sort of half right. It has nothing to do with fooling Linux
> into thinking it's a sequential read. Apparently this filesystem was created
> with 32k blocks. I don't remember if that was intentional or if ext2/3 did it
> automatically based on the size of the filesystem.
> 
> So it doesn't have wide-ranging implications for Postgres's default 8k block
> size. But it is a good lesson about the importance of not using a larger
> filesystem block than Postgres's block size. The net effect is that if the
> filesystem block is N*8k then your random_page_cost goes up by a factor of N.
> That could be devastating for OLTP performance.

Hm, apparently I spoke too soon. tune2fs says the block size is in fact 4k.
Yet the performance of the block reading test program with 4k or 8k blocks
behaves as if Linux is reading 32k blocks. And in fact when I run it with 32k
blocks I get the kind of behaviour we were expecting, where the breakeven point is
around 20%.

So it's not the 8k block reading that's fooling Linux into reading ahead 32k.
It seems 32k readahead is the default for Linux, or perhaps it's the
sequential access pattern that's triggering it.
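
For anyone who wants to reproduce this, here is a rough sketch of the kind of
block-reading test described above (not the actual program; the file name,
block size, and percentage of blocks to read are hypothetical command-line
arguments). It reads the given fraction of the file's blocks at random
offsets and prints the elapsed time, for comparison against a plain
sequential read of the whole file:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>

/* usage: blockread FILE BLOCKSIZE PERCENT */
int main(int argc, char **argv)
{
    size_t          blksz;
    int             percent;
    int             fd;
    struct stat     st;
    long            nblocks, nreads, i;
    char           *buf;
    struct timeval  start, end;

    if (argc != 4)
    {
        fprintf(stderr, "usage: %s file blocksize percent\n", argv[0]);
        return 1;
    }

    blksz = (size_t) atoi(argv[2]);
    percent = atoi(argv[3]);

    fd = open(argv[1], O_RDONLY);
    if (fd < 0)
    {
        perror("open");
        return 1;
    }

    if (fstat(fd, &st) < 0)
    {
        perror("fstat");
        return 1;
    }

    nblocks = (long) (st.st_size / (off_t) blksz);
    nreads = nblocks * percent / 100;
    buf = malloc(blksz);

    gettimeofday(&start, NULL);

    /* read the requested fraction of blocks, each at a random offset */
    for (i = 0; i < nreads; i++)
    {
        off_t   blk = random() % nblocks;

        if (pread(fd, buf, blksz, blk * (off_t) blksz) < 0)
        {
            perror("pread");
            return 1;
        }
    }

    gettimeofday(&end, NULL);
    printf("%ld random %zu-byte reads in %.3f s\n",
           nreads, blksz,
           (end.tv_sec - start.tv_sec) +
           (end.tv_usec - start.tv_usec) / 1e6);

    free(buf);
    close(fd);
    return 0;
}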

I'm trying to test it with posix_fadvise() set to random access, but I'm having
trouble compiling. Do I need a special #define to get posix_fadvise from
glibc?
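
If it is the usual feature-test-macro problem, glibc only declares
posix_fadvise() when _XOPEN_SOURCE is at least 600 (or _POSIX_C_SOURCE at
least 200112L) before the system headers are included, so a sketch along
these lines ought to compile (set_random_access is just an illustrative
name):

/* glibc declares posix_fadvise() only when _XOPEN_SOURCE >= 600
 * (or _POSIX_C_SOURCE >= 200112L) is defined before any system header. */
#define _XOPEN_SOURCE 600

#include <fcntl.h>
#include <stdio.h>

/* Advise the kernel that fd will be read in random order over its whole
 * length, which should suppress the default readahead. */
static int
set_random_access(int fd)
{
    int     err = posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);

    if (err != 0)
        fprintf(stderr, "posix_fadvise failed: error %d\n", err);
    return err;
}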

-- 
greg


