Re: [PERFORMANCE] expanding to SAN: which portion best to move

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: [PERFORMANCE] expanding to SAN: which portion best to move
Дата
Msg-id BANLkTi=Oo4jRHpBY_74nJrcryr=BLDZR9A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PERFORMANCE] expanding to SAN: which portion best to move  (Willy-Bas Loos <willybas@gmail.com>)
Ответы Re: [PERFORMANCE] expanding to SAN: which portion best to move
Список pgsql-performance
On Wed, May 4, 2011 at 6:31 AM, Willy-Bas Loos <willybas@gmail.com> wrote:
> I'm asking them for (real) benchmarks, thanks for the advice. (fio is not
> available for us now to do it myself, grmbl)
> It just occurred to me that it is not necessarily the case that reading the
> indexes causes a lot of random I/O (on the index itself).
> I mean, maybe the index is generally read sequentially and then, when
> retrieving the data, there is a lot of random I/O.
> if it's a long story, any tips for info about this (book or web site)?

If you don't do anything special, and if the query plan says "Index
Scan" rather than "Bitmap Index Scan", then both the index I/O and the
table I/O are likely to be fairly random.  However there are a number
of cases in which you can expect the table I/O to be sequential:

- In some cases, you may happen to insert rows with an ordering that
matches the index.  For example, if you have a table with not too many
updates and deletes, and an index on a serial column, then new rows
will have a higher value in that column than old rows, and will also
typically be physically after older rows in the file.  Or you might be
inserting timestamped data from oldest to newest.
- If the planner chooses a Bitmap Index Scan, it effectively scans the
index to figure out which table blocks to read, and then reads those
table blocks in block number order, so that the I/O is sequential,
with skips.
- If you CLUSTER the table on a particular index, it will be
physically ordered to match the index's key ordering.  As the table is
further modified the degree of clustering will gradually decline;
eventually you may wish to re-CLUSTER.

It's also worth keeping in mind that the index itself won't
necessarily be accessed in physically sequential order.  The point of
the index is to emit the rows in key order, but if the table is
heavily updated, it won't necessarily be the case that a page
containing lower-valued keys physically precedes a page containing
higher-valued keys.  I'm actually somewhat fuzzy on how this works,
and to what extent it's a problem in practice, but I am fairly sure it
can happen.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Kevin Grittner"
Дата:
Сообщение: Re: reducing random_page_cost from 4 to 2 to force index scan
Следующее
От: Tom Lane
Дата:
Сообщение: Re: reducing random_page_cost from 4 to 2 to force index scan