Re: ANALYZE sampling is too good

From Claudio Freire
Subject Re: ANALYZE sampling is too good
Date
Msg-id CAGTBQpYOkmQaua6YxOuR3B5owc5LOCyU0tv4bhVVMLQU_6kFGA@mail.gmail.com
In response to Re: ANALYZE sampling is too good  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: ANALYZE sampling is too good
List pgsql-hackers
On Mon, Dec 9, 2013 at 6:47 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> On 12/09/2013 11:35 PM, Jim Nasby wrote:
>>
>> On 12/8/13 1:49 PM, Heikki Linnakangas wrote:
>>>
>>> On 12/08/2013 08:14 PM, Greg Stark wrote:
>>>>
>>>> The whole accounts table is 1.2GB and contains 10 million rows. As
>>>> expected with rows_per_block set to 1 it reads 240MB of that
>>>> containing nearly 2 million rows (and takes nearly 20s -- doing a full
>>>> table scan for select count(*) only takes about 5s):
>>>
>>>
>>> One simple thing we could do, without or in addition to changing the
>>> algorithm, is to issue posix_fadvise() calls for the blocks we're
>>> going to read. It should at least be possible to match the speed of a
>>> plain sequential scan that way.
>>
>>
>> Hrm... maybe it wouldn't be very hard to use async IO here either? I'm
>> thinking it wouldn't be very hard to do the stage 2 work in the callback
>> routine...
>
>
> Yeah, other than the fact we have no infrastructure to do asynchronous I/O
> anywhere in the backend. If we had that, then we could easily use it here. I
> doubt it would be much better than posix_fadvising the blocks, though.


Without patches to the kernel, async I/O is much better.

posix_fadvise interferes with read-ahead, so with posix_fadvise, say,
bitmap heap scans (or similarly sorted ANALYZE block samples) run at 1
I/O per block, i.e. horribly, whereas AIO lets the kernel coalesce reads
and do read-ahead when it thinks that will be profitable, significantly
increasing IOPS. I've seen anything from a 2x to a 10x difference.
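
For reference, the posix_fadvise approach being discussed amounts to
roughly the following: one POSIX_FADV_WILLNEED hint per sampled block,
followed by ordinary reads. This is only a minimal sketch, not backend
code. The function names, the plain file descriptor and the fixed 8 kB
BLOCK_SIZE are assumptions made for illustration, and it assumes a
Linux-like system that provides posix_fadvise.

```c
#define _XOPEN_SOURCE 600       /* expose posix_fadvise() and pread() */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* assumed block size, for illustration */

/*
 * Hypothetical helper: hint the kernel that each sampled block will be
 * needed soon.  blocks[] holds block numbers, expected in ascending
 * order.  Each block gets its own hint, which is the one-request-per-block
 * pattern described above.  Returns 0 on success, or the error number
 * reported by the first failing posix_fadvise() call.
 */
int
prefetch_sample_blocks(int fd, const unsigned *blocks, size_t nblocks)
{
    assert(blocks != NULL || nblocks == 0);

    for (size_t i = 0; i < nblocks; i++)
    {
        off_t offset = (off_t) blocks[i] * BLOCK_SIZE;
        int   rc = posix_fadvise(fd, offset, BLOCK_SIZE, POSIX_FADV_WILLNEED);

        if (rc != 0)
            return rc;          /* posix_fadvise returns the error number */
    }
    return 0;
}

/*
 * Hypothetical helper: read the sampled blocks one at a time into buf
 * (which must hold BLOCK_SIZE bytes).  Returns how many blocks were
 * read in full.
 */
size_t
read_sample_blocks(int fd, const unsigned *blocks, size_t nblocks, char *buf)
{
    size_t nread = 0;

    for (size_t i = 0; i < nblocks; i++)
    {
        off_t offset = (off_t) blocks[i] * BLOCK_SIZE;

        if (pread(fd, buf, BLOCK_SIZE, offset) == BLOCK_SIZE)
            nread++;
    }
    return nread;
}
```

A caller would sort the sampled block numbers, call
prefetch_sample_blocks() for the whole sample (or a window of it), and
then call read_sample_blocks(). Whether the kernel merges adjacent hints
into larger requests is up to the kernel, not this code.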


