Re: Serious performance problem

Поиск
Список
Период
Сортировка
От Hannu Krosing
Тема Re: Serious performance problem
Дата
Msg-id 3BDE7F09.12A1CC4C@tm.ee
обсуждение исходный текст
Ответ на Re: Serious performance problem  ("Tille, Andreas" <TilleA@rki.de>)
Список pgsql-hackers
"Tille, Andreas" wrote:
> 
> On Mon, 29 Oct 2001, Ross J. Reedstrom wrote:
> 
> > Here's what I see: Your example touches on what can be an achilles
> > heel for pgsql's current statistical analyzer: selection on data fields
> > that have a few common values. Often, the indices don't get used, since
> > a large fraction of the table needs to be scanned, in any case. In
> > your example, fully 68% of the table fits the where condition.
> >  ...
> >
> > I think we have a winner. No it's not sub-second, but I improved the time
> > by 3x just by trying some indices. Note that I _still_ had to force the
> > use of indices for this one. It's also the first time I've personally seen
> > a query/dataset that benefits this much from a two-key index.
> This is true for this example and I also played with indices as you.  I also
> enforced the index scan and compared with forbidding the index scan.  The
> result was on my more realistic examples that both versions performed quite
> the same.  There was no *real* difference. For sure in this simple query there
> is a difference but the real examples showed only 2% - 5% speed increase
> (if not slower with enforcing index scans!).

I studied his dataset and found that a simple count(*) on whole table 
took 1.3 sec on my Celeron 375 so I'm sure that the more complex query, 
which has to visit 2/3 of tuples will not be able to execute under 1 sec

My playing with indexes / subqueries and query rewriting got the example 
query (actually a functional equivalent) to run in ~5 sec with simple 
aggregate(group(indexscan))) plan and I suspect that this is how fast 
it will be on my hardware

It could probably be soon possible to make it run in ~ 1.5 by using an
aggregate 
function that does a sequential scan and returns a rowset.

> > As another poster replied to you, there is limitation with postgresql's
> > use of indices that arises from MVCC: even if the only data requested is
> > that stored in the index itself, the backend must visit the actual tuple
> > in the table to ensure that it is 'visible' to the current transaction.
> Any possibility to switch of this temporarily for certain queries like this
> if the programmer could make sure that it is not necessary?  Just a stupid
> idea from a bloody uneducated man in database-engeniering.

There have been plans to set aside a bit in index that would mark the
deleted 
tuple. Unfortunately this helps only in cases when there are many
deleted tuples
and all live tuples have to be checked anyway ;(

--------------
Hannu


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dave Page
Дата:
Сообщение: Re: DROP/CREATE
Следующее
От: "Tille, Andreas"
Дата:
Сообщение: Re: Serious performance problem