Re: Serious performance problem
От | Hannu Krosing |
---|---|
Тема | Re: Serious performance problem |
Дата | |
Msg-id | 3BDE7F09.12A1CC4C@tm.ee обсуждение исходный текст |
Ответ на | Re: Serious performance problem ("Tille, Andreas" <TilleA@rki.de>) |
Список | pgsql-hackers |
"Tille, Andreas" wrote: > > On Mon, 29 Oct 2001, Ross J. Reedstrom wrote: > > > Here's what I see: Your example touches on what can be an achilles > > heel for pgsql's current statistical analyzer: selection on data fields > > that have a few common values. Often, the indices don't get used, since > > a large fraction of the table needs to be scanned, in any case. In > > your example, fully 68% of the table fits the where condition. > > ... > > > > I think we have a winner. No it's not sub-second, but I improved the time > > by 3x just by trying some indices. Note that I _still_ had to force the > > use of indices for this one. It's also the first time I've personally seen > > a query/dataset that benefits this much from a two-key index. > This is true for this example and I also played with indices as you. I also > enforced the index scan and compared with forbidding the index scan. The > result was on my more realistic examples that both versions performed quite > the same. There was no *real* difference. For sure in this simple query there > is a difference but the real examples showed only 2% - 5% speed increase > (if not slower with enforcing index scans!). I studied his dataset and found that a simple count(*) on whole table took 1.3 sec on my Celeron 375 so I'm sure that the more complex query, which has to visit 2/3 of tuples will not be able to execute under 1 sec My playing with indexes / subqueries and query rewriting got the example query (actually a functional equivalent) to run in ~5 sec with simple aggregate(group(indexscan))) plan and I suspect that this is how fast it will be on my hardware It could probably be soon possible to make it run in ~ 1.5 by using an aggregate function that does a sequential scan and returns a rowset. > > As another poster replied to you, there is limitation with postgresql's > > use of indices that arises from MVCC: even if the only data requested is > > that stored in the index itself, the backend must visit the actual tuple > > in the table to ensure that it is 'visible' to the current transaction. > Any possibility to switch of this temporarily for certain queries like this > if the programmer could make sure that it is not necessary? Just a stupid > idea from a bloody uneducated man in database-engeniering. There have been plans to set aside a bit in index that would mark the deleted tuple. Unfortunately this helps only in cases when there are many deleted tuples and all live tuples have to be checked anyway ;( -------------- Hannu
В списке pgsql-hackers по дате отправления: