Re: Serious performance problem
От | Ross J. Reedstrom |
---|---|
Тема | Re: Serious performance problem |
Дата | |
Msg-id | 20011029113154.B21886@rice.edu обсуждение исходный текст |
Ответ на | Serious performance problem ("Tille, Andreas" <TilleA@rki.de>) |
Ответы |
Re: Serious performance problem
("Tille, Andreas" <TilleA@rki.de>)
|
Список | pgsql-hackers |
Andreas - I took a look at your problem, since I'm sort of in the field, and would liek to see free solutions spread, as well. Here's what I see: Your example touches on what can be an achilles heel for pgsql's current statistical analyzer: selection on data fields that have a few common values. Often, the indices don't get used, since a large fraction of the table needs to be scanned, in any case. In your example, fully 68% of the table fits the where condition. Here's some timing results on my machine: Your dataset and query, as written: real 0m25.272s user 0m0.090s sys 0m0.050s Creating an index on meldekategorie, and forcing it's use with "set enable_seqscan = off" real 0m14.743s user 0m0.070s sys 0m0.050s Same, with index on istaktuell: real 0m26.511s user 0m0.050s sys 0m0.060s Now, with an index on both meldekategorie and istaktuell: real 0m7.179s user 0m0.060s sys 0m0.030s I think we have a winner. No it's not sub-second, but I improved the time by 3x just by trying some indices. Note that I _still_ had to force the use of indices for this one. It's also the first time I've personally seen a query/dataset that benefits this much from a two-key index. As another poster replied to you, there is limitation with postgresql's use of indices that arises from MVCC: even if the only data requested is that stored in the index itself, the backend must visit the actual tuple in the table to ensure that it is 'visible' to the current transaction. How realistic a representation of your real workload is this query? Realize that more selective, complex queries are where pgsql shines compared to other RDBMS: the 'fast table scanner' type query that you proposed as your test don't really let pgsql stretch it's legs. Do you have example timings from MS-SQL or others? Ross On Mon, Oct 29, 2001 at 01:43:37PM +0100, Tille, Andreas wrote: > Hello, > > I discussed a problem concerning the speed of PostgreSQL compared to > MS SQL server heavily on postgres-general list. The thread starts with > message > > http://fts.postgresql.org/db/mw/msg.html?mid=1035557 > > Now I tried a snapshot of version 7.2 and got an increase of speed of > about factor 2. But sorry this is really not enough. The very simple > test I pointed to in my mail is even much to slow and the issue would > probably spoil down the whole project which should be a complete open > source solution and would perhaps and in any M$ stuff. I?ve got under > heavy preasur from my employer who was talking about the nice world > of MS .net (while he is using MS-SQL exclusively). To make the thing > clear the issue is the gal database of infectious diseases in Germany > runned by the Robert Koch-Institute. So the beast could be of some > importance for increasing the acceptance of PostgreSQL and Open Source > in the field of medicine which is generally known for the money which > is involved in. So I really hope that some skilled programmers would > be able to find a good way to solve the performance issue perhaps by > just profiling the simple query > > SELECT Hauptdaten_Fall.MeldeKategorie, Count(Hauptdaten_Fall.ID) AS Anz FROM Hauptdaten_Fall WHERE (((Hauptdaten_Fall.IstAktuell)=20)) > GROUP BY Hauptdaten_Fall.MeldeKategorie ORDER BY Hauptdaten_Fall.MeldeKategorie; > > to the data set I put on > > http://www.foodborne-net.de/~tillea/ifsg/ifsgtest.dump.bz2 > > If this should take less than half a second on a modern PC I could > continue to try mo realistic queries. > > I really hope that I could readjust the image of PostgreSQL in the > eyes of my M$-centered colleagues. > > Kind regards > > Andreas. > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html -- Ross Reedstrom, Ph.D. reedstrm@rice.edu Executive Director phone: 713-348-6166 Gulf Coast Consortium for Bioinformatics fax: 713-348-6182 Rice University MS-39 Houston, TX 77005
В списке pgsql-hackers по дате отправления: