Re: GiST index performance

От: Matthew Wakeling
Тема: Re: GiST index performance
Дата: ,
Msg-id: alpine.DEB.2.00.0904171809040.22330@aragorn.flymine.org
(см: обсуждение, исходный текст)
Ответ на: Re: GiST index performance  (Tom Lane)
Ответы: Re: GiST index performance  (Matthew Wakeling)
Список: pgsql-performance

Скрыть дерево обсуждения

GiST index performance  (Matthew Wakeling, )
 Re: GiST index performance  ("Kevin Grittner", )
  Re: GiST index performance  (Matthew Wakeling, )
  Re: GiST index performance  (Tom Lane, )
   Re: GiST index performance  (Matthew Wakeling, )
    Re: GiST index performance  (Tom Lane, )
     Re: GiST index performance  (Matthew Wakeling, )
   Re: GiST index performance  (Matthew Wakeling, )
    Re: GiST index performance  (Matthew Wakeling, )
     Re: GiST index performance  (Tom Lane, )
 Re: GiST index performance  (Matthew Wakeling, )
 Re: GiST index performance  (dforum, )
  Re: GiST index performance  (Tom Lane, )
  Re: GiST index performance  (Craig Ringer, )
 Re: GiST index performance  (Matthew Wakeling, )
  Re: GiST index performance  (Matthew Wakeling, )
   Re: GiST index performance  (Matthew Wakeling, )
    Re: GiST index performance  (Tom Lane, )
     Re: GiST index performance  (Oleg Bartunov, )
 Re: GiST index performance  (Matthew Wakeling, )
  Re: GiST index performance  (Bruce Momjian, )
   Re: GiST index performance  (Robert Haas, )
    Re: GiST index performance  (Bruce Momjian, )

On Thu, 16 Apr 2009, Tom Lane wrote:
> Matthew, can you put together a self-contained test case with a similar
> slowdown?

I have done a bit of investigation, and I think I might have found the
smoking gun I was looking for. I just added a load of debug to the gist
consistent function on the bioseg type, and did a single overlap lookup in
the index.

The index contains values ranging from 1 to 28,000,000 or so.
The range I looked up was 23503297..23504738 (so a very small proportion).
The index contains 375154 entries.
The index returned 59 rows.
The consistent method was called 54022 times - 5828 times for branch
     (internal) index entries, and 48194 times for leaf entries.

Obviously this is a really bad index layout - scanning that many entries
for such a small output. In fact, I saw lots of overlapping branch index
entries, so the index isn't actually differentiating between the different
branches of the tree very well. This indicates a failure of the picksplit
or the penalty functions. I shall investigate this further next week.

I shall also investigate whether this is the exact same problem that I had
with the int4 gist system.

Matthew

--
So, given 'D' is undeclared too, with a default of zero, C++ is equal to D.
  mnw21, commenting on the "Surely the value of C++ is zero, but C is now 1"
  response to "No, C++ isn't equal to D. 'C' is undeclared [...] C++ should
  really be called 1" response to "C++ -- shouldn't it be called D?"


В списке pgsql-performance по дате сообщения:

От: Grzegorz Jaśkiewicz
Дата:
Сообщение: stats are way off on 8.4 b1
От: Tom Lane
Дата:
Сообщение: Re: stats are way off on 8.4 b1