Re: index / sequential scan problem

От: Tom Lane
Тема: Re: index / sequential scan problem
Дата: ,
Msg-id: 5586.1058465575@sss.pgh.pa.us
(см: обсуждение, исходный текст)
Ответ на: Re: index / sequential scan problem  (Fabian Kreitner)
Ответы: Re: index / sequential scan problem  (Fabian Kreitner)
Список: pgsql-performance

Скрыть дерево обсуждения

index / sequential scan problem  (Fabian Kreitner, )
 Re: index / sequential scan problem  ("Shridhar Daithankar", )
  Re: index / sequential scan problem  (Fabian Kreitner, )
   Re: index / sequential scan problem  ("Shridhar Daithankar", )
 Re: index / sequential scan problem  (Paul Thomas, )
  Re: index / sequential scan problem  (Fabian Kreitner, )
   Re: index / sequential scan problem  (Paul Thomas, )
    Re: index / sequential scan problem  (Fabian Kreitner, )
     Re: index / sequential scan problem  ("Shridhar Daithankar", )
     Re: index / sequential scan problem  (Jord Tanner, )
     Re: index / sequential scan problem  (Paul Thomas, )
      Re: index / sequential scan problem  (Tom Lane, )
   Re: index / sequential scan problem  (Tom Lane, )
    Re: index / sequential scan problem  (Fabian Kreitner, )
     Re: index / sequential scan problem  (Fabian Kreitner, )
      Re: index / sequential scan problem  (Dennis Björklund, )
       Re: index / sequential scan problem  (Tom Lane, )
        Re: index / sequential scan problem  ("scott.marlowe", )
        Re: index / sequential scan problem  (Dennis Björklund, )

Fabian Kreitner <> writes:
> That is what I read too and is why Im confused that the index is indeed
> executing faster. Can this be a problem with the hardware and/or postgress
> installation?

I think the actual issue here is that you are executing the EXISTS
subplan over and over, once for each outer row.  The planner's cost
estimate for EXISTS is based on the assumption that you do it once
... in which scenario the seqscan very possibly is cheaper.  However,
when you do the EXISTS subplan over and over for many outer rows, you
get a savings from the fact that the index and table pages soon get
cached in memory.  The seqscan plan gets a savings too, since the table
is small enough to fit in memory, but once everything is in memory the
indexscan plan is faster.

There's been some discussion on pghackers about how to teach the planner
to account for repeated executions of subplans, but we have not come up
with a good solution yet.

For the moment, what people tend to do if they know their database is
small enough to mostly stay in memory is to reduce random_page_cost to
make the planner favor indexscans.  If you know the database is entirely
cached then the theoretically correct value of random_page_cost is 1.0
(since fetching any page will cost the same, if it's all in RAM).  I'd
recommend against adopting that as a default, but a lot of people find
that setting it to 2.0 or so seems to model their situation better than
the out-of-the-box 4.0.

            regards, tom lane


В списке pgsql-performance по дате сообщения:

От: Josh Berkus
Дата:
Сообщение: Table clustering -- useful, or not?
От: Robert Creager
Дата:
Сообщение: Re: Hardware performance