Re: GiST index performance

От: Matthew Wakeling
Тема: Re: GiST index performance
Дата: ,
Msg-id: alpine.DEB.2.00.1003221411081.9798@aragorn.flymine.org
(см: обсуждение, исходный текст)
Ответ на: Re: GiST index performance  (Yeb Havinga)
Список: pgsql-performance

Скрыть дерево обсуждения

GiST index performance  (Matthew Wakeling, )
 Re: GiST index performance  (Robert Haas, )
  Re: GiST index performance  (Matthew Wakeling, )
   Re: GiST index performance  (Tom Lane, )
    Re: GiST index performance  (Matthew Wakeling, )
     Re: GiST index performance  (Tom Lane, )
      Re: GiST index performance  (Matthew Wakeling, )
       Re: GiST index performance  (Tom Lane, )
        Re: GiST index performance  (Matthew Wakeling, )
         Re: GiST index performance  (Matthew Wakeling, )
          Re: GiST index performance  (Matthew Wakeling, )
           Re: GiST index performance  (Tom Lane, )
           Re: GiST index performance  (Adam Gundy, )
        Re: GiST index performance  (Heikki Linnakangas, )
      Re: GiST index performance  (Greg Smith, )
       Re: GiST index performance  (Robert Haas, )
 Re: GiST index performance  (Bruce Momjian, )
  Re: GiST index performance  (Robert Haas, )
  Re: GiST index performance  (Matthew Wakeling, )
   Re: GiST index performance  (Robert Haas, )
   Re: GiST index performance  (Yeb Havinga, )
    Re: GiST index performance  (Yeb Havinga, )
     Re: GiST index performance  (Yeb Havinga, )
      Re: GiST index performance  (Yeb Havinga, )
       Re: GiST index performance  (Yeb Havinga, )
        Re: GiST index performance  (Kenneth Marshall, )
         Re: GiST index performance  (Yeb Havinga, )
          Re: GiST index performance  (Matthew Wakeling, )
           Re: GiST index performance  (Yeb Havinga, )
            Re: GiST index performance  (Matthew Wakeling, )

On Mon, 22 Mar 2010, Yeb Havinga wrote:
>> Yes, that is certainly a factor. For example, the page size for bioseg
>> which we use here is 130 entries, which is very excessive, and doesn't
>> allow very deep trees. On the other hand, it means that a single disc seek
>> performs quite a lot of work.

> Yeah, I only did in-memory fitting tests and wondered about increased io's.
> However I bet that even for bigger than ram db's, the benefit of having to
> fan out to less pages still outweighs the over-general non leaf nodes and
> might still result in less disk io's. I redid some earlier benchmarking with
> other datatypes with a 1kB block size and also multicolumn gist and the
> multicolumn variant had an ever greater benefit than the single column
> indexes, both equality and range scans. (Like execution times down to 20% of
> original). If gist is important to you, I really recommend doing a test with
> 1kB blocks.

Purely from a disc seek count point of view, assuming an infinite CPU
speed and infinite disc transfer rate, the larger the index pages the
better. The number of seeks per fetch will be equivalent to the depth of
the tree.

If you take disc transfer rate into account, the break-even point is when
you spend an equal time transferring as seeking, which places the page
size around 500kB on a modern disc, assuming RAID stripe alignment doesn't
make that into two seeks instead of one.

However, for efficient CPU usage, the ideal page size for a tree index is
much smaller - between two and ten entries, depending on the type of the
data.

There may be some mileage in reorganising indexes into a two-level system.
That is, have an index format where the page size is 512kB or similar, but
each page is internally a CPU-efficient tree itself.

However, this is beyond the scope of the problem of speeding up gist.

Matthew

--
 If you let your happiness depend upon how somebody else feels about you,
 now you have to control how somebody else feels about you. -- Abraham Hicks


В списке pgsql-performance по дате сообщения:

От: Tom Lane
Дата:
Сообщение: Re: PostgreSQL upgraded to 8.2 but forcing index scan on query produces faster
От: "Pierre C"
Дата:
Сообщение: Re: Block at a time ...