Re: ice-broker scan thread

From Gavin Sherry
Subject Re: ice-broker scan thread
Date
Msg-id Pine.LNX.4.58.0511291429330.18112@linuxworld.com.au
In reply to ice-broker scan thread  (Qingqing Zhou <zhouqq@cs.toronto.edu>)
Responses Re: ice-broker scan thread  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: ice-broker scan thread  (David Boreham <david_list@boreham.org>)
Re: ice-broker scan thread  (Qingqing Zhou <zhouqq@cs.toronto.edu>)
Re: ice-broker scan thread  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-hackers
On Mon, 28 Nov 2005, Qingqing Zhou wrote:

>
> I am considering add an "ice-broker scan thread" to accelerate PostgreSQL
> sequential scan IO speed. The basic idea of this thread is just like the
> "read-ahead" method, but the difference is this one does not read the data
> into shared buffer pool directly, instead, it reads the data into file
> system cache, which makes the integration easy and this is unique to
> PostgreSQL.
>

MySQL, Oracle and others implement read-ahead threads to simulate async IO
'pre-fetching'. I've been experimenting with two ideas. The first is to
increase the readahead when we're doing sequential scans (see the attached
prototype patch, which uses posix_fadvise()). I don't have any hardware at
the moment to test this patch on, but I am waiting on some dbt-3 results
which should indicate whether fadvise is a good idea or a bad one.
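
For readers without the attachment, here is a minimal sketch of what a
sequential-scan readahead hint can look like. It is not the prototype patch:
the function name scan_with_readahead_hint and the 8192-byte buffer are
assumptions made for illustration, and the effect of the hint depends
entirely on the kernel.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a whole file front to back after telling the
 * kernel that access will be sequential. Returns the number of bytes read,
 * or -1 on error.
 */
long
scan_with_readahead_hint(const char *path)
{
    char    buf[8192];      /* assumed block size, for illustration only */
    long    total = 0;
    ssize_t n;
    int     fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /*
     * offset 0 and len 0 mean "the whole file". The call is advisory, so a
     * failure is ignored here; note that posix_fadvise() reports errors
     * through its return value rather than through errno.
     */
    (void) posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;

    close(fd);
    return (n < 0) ? -1 : total;
}
```

The hint changes nothing about how the data is read, which is why it is
cheap to try: the read loop is the same with or without it.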

The second idea is using posix async IO at key points within the system
to better parallelise CPU and IO work. The areas where I think we could use
async IO are: during sequential scans, use async IO to do pre-fetching of
blocks; inside WAL, begin flushing WAL buffers to disk before we commit;
and, inside the background writer/checkpoint process, asynchronously
write out pages and, potentially, asynchronously build new checkpoint segments.
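
The sequential-scan case can be sketched with the POSIX AIO calls
aio_read(), aio_suspend(), aio_error() and aio_return(). This is only an
illustration under assumptions: prefetch_start, prefetch_finish and
BLOCK_SIZE are hypothetical names, nothing here is PostgreSQL code, and on
older glibc the program must be linked with -lrt.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 8192     /* assumed block size, for illustration only */

/*
 * Queue an asynchronous read of block blkno into buf and return at once.
 * Returns 0 if the request was queued, -1 on error. buf and cb must stay
 * valid until prefetch_finish() has been called.
 */
int
prefetch_start(struct aiocb *cb, int fd, long blkno, char *buf)
{
    assert(cb != NULL && buf != NULL);

    memset(cb, 0, sizeof(*cb));
    cb->aio_fildes = fd;
    cb->aio_buf = buf;
    cb->aio_nbytes = BLOCK_SIZE;
    cb->aio_offset = (off_t) blkno * BLOCK_SIZE;
    cb->aio_sigevent.sigev_notify = SIGEV_NONE;   /* we poll; no signal */

    return aio_read(cb);
}

/*
 * Block until the queued read has completed.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t
prefetch_finish(struct aiocb *cb)
{
    const struct aiocb *list[1];

    list[0] = cb;
    while (aio_error(cb) == EINPROGRESS)
    {
        /* sleep until the request completes; retry if interrupted */
        if (aio_suspend(list, 1, NULL) != 0 && errno != EINTR)
            return -1;
    }
    return aio_return(cb);
}
```

A scan would call prefetch_start() for block N+1, process block N on the
CPU, and only then call prefetch_finish(), so the read overlaps with the
processing instead of following it.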

The motivation for using async IO is twofold: first, the results of this
paper[1] are compelling; second, modern OSs support async IO. I know that
Linux[2], Solaris[3], AIX and Windows all have async IO and I presume that
all their rivals have it as well.

The fundamental premise of the paper mentioned above is that if the
database is busy, IO should be busy. With our current block-at-a-time
processing, this isn't always the case. This is why Qingqing's read-ahead
thread makes sense. My reason for mailing is, however, that the async IO
results are more compelling than the read-ahead thread.

I haven't had time to prototype whether we can easily implement async IO
but I am planning to work on it in December. The two main goals will be to
a) integrate and utilise async IO, at least within the executor context,
and b) build a primitive kind of scheduler so that we stop prefetching
when we know that there are a certain number of outstanding IOs for a
given device.

Thanks,

Gavin



[1] http://www.vldb2005.org/program/paper/wed/p1116-hall.pdf
[2] http://lse.sourceforge.net/io/aionotes.txt
[3] http://developers.sun.com/solaris/articles/event_completion.html - I'm
fairly sure they have a posix AIO wrapper around these routines, but I
cannot see it documented anywhere :-(
