Re: Odd sorting behaviour
От | Rod Taylor |
---|---|
Тема | Re: Odd sorting behaviour |
Дата | |
Msg-id | 1090376297.12565.9.camel@jester обсуждение исходный текст |
Ответ на | Re: Odd sorting behaviour ("Steinar H. Gunderson" <sgunderson@bigfoot.com>) |
Ответы |
Re: Odd sorting behaviour
|
Список | pgsql-performance |
> I could of course post the updated query plan if anybody is interested; let > me know. (The data is still available if anybody needs it as well, of > course.) I've taken a look and managed to cut out quite a bit of used time. You'll need to confirm it's the same results though (I didn't -- it is the same number of results (query below) First off, "DROP INDEX prodid_index;". It doesn't help anything since the primary key is just as usable, but it does take enough space that it causes thrashing in the buffer_cache. Any queries based on prodid will use the index for the PRIMARY KEY instead. Secondly, I had no luck getting the hashjoin but this probably doesn't matter. I've assumed that the number of users will climb faster than the product set offered, and generated additional data via the below command run 4 times: INSERT INTO opinions SELECT prodid, uid + (SELECT max(uid) FROM opinions), opinion FROM opinions; I found that by this point, the hashjoin and mergejoin have essentially the same performance -- in otherwords, as you grow you'll want the mergejoin eventually so I wouldn't worry about it too much. New Query cuts about 1/3rd the time, forcing hashjoin gets another 1/3rd but see the above note: SELECT o3.prodid , SUM(o3.opinion*o12.correlation) AS total_correlation FROM opinions o3 -- Plain join okay since o12.correlation <> 0 -- eliminates any NULLs anyway. -- Was RIGHT JOIN JOIN (SELECT o2.uid , SUM(o1.opinion*o2.opinion)/SQRT(count(*)::numeric) AS correlation FROM opinions AS o1 JOIN opinions AS o2 USING (prodid) WHERE o1.uid = 1355 GROUP BY o2.uid ) AS o12 USING (uid) -- Was old Left join WHERE o3.prodid NOT IN (SELECT prodid FROM opinions AS o4 WHERE uid = 1355) AND o3.opinion <> 0 AND o12.correlation <> 0 GROUP BY o3.prodid ORDER BY total_correlation desc;
В списке pgsql-performance по дате отправления: