Обсуждение: join question
GJ
"=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?=" <gryzman@gmail.com> writes: > that's 20s query, and now I got it down to 10s , by using something - which > in my eyes would be always wrong - and against all logic. So if someone > could please explain to me why is it faster: [ shrug... ] If you aren't going to show us EXPLAIN ANALYZE output, we're even more in the dark than you are. regards, tom lane
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=37807.04..37807.05 rows=1 width=50) (actual time=9788.642..9805.832 rows=20000 loops=1)
Sort Key: s.nodeid
Sort Method: external sort Disk: 1320kB
-> Nested Loop (cost=376.60..37807.03 rows=1 width=50) (actual time=15.454..9629.198 rows=20000 loops=1)
-> Nested Loop Anti Join (cost=376.60..37800.27 rows=1 width=50) (actual time=15.347..9077.445 rows=20000 loops=1)
-> Nested Loop (cost=376.60..37797.99 rows=1 width=50) (actual time=15.308..8927.428 rows=20000 loops=1)
-> Hash Anti Join (cost=376.60..37791.22 rows=1 width=8) (actual time=15.195..8216.448 rows=20000 loops=1)
Hash Cond: (e.accountid = account.id)
-> Bitmap Heap Scan on efoo e (cost=368.23..37709.63 rows=19523 width=8) (actual time=14.981..8166.262 rows=20000 loops=1)
Recheck Cond: (packageid = 497)
Filter: ((startdate <= now()) AND (enddate > now()))
-> Bitmap Index Scan on efoo_packageid_idx (cost=0.00..363.35 rows=19523 width=0) (actual time=9.694..9.694 rows=20000 loops=1)
Index Cond: (packageid = 497)
-> Hash (cost=8.35..8.35 rows=1 width=8) (actual time=0.136..0.136 rows=1 loops=1)
-> Index Scan using account_banned_idx on account (cost=0.00..8.35 rows=1 width=8) (actual time=0.129..0.131 rows=1 loops=1)
Index Cond: (banned = true)
Filter: banned
-> Index Scan using bbaccididx on bb s (cost=0.00..6.76 rows=1 width=42) (actual time=0.030..0.032 rows=1 loops=20000)
Index Cond: (s.accountid = e.accountid)
-> Index Scan using bbar_bbid_key on bbar b (cost=0.00..2.27 rows=1 width=11) (actual time=0.005..0.005 rows=0 loops=20000)
Index Cond: ((b.bbid)::text = (s.id)::text)
-> Index Scan using acct_ididx on account a (cost=0.00..6.75 rows=1 width=24) (actual time=0.024..0.025 rows=1 loops=20000)
Index Cond: (a.id = e.accountid)
Total runtime: 9815.280 ms
and before:
QUERY PLAN before
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=130129.98..130178.78 rows=19521 width=50) (actual time=16156.145..16170.234 rows=20000 loops=1)
Sort Key: s.nodeid
Sort Method: external merge Disk: 1312kB
-> Hash Anti Join (cost=78755.00..128101.84 rows=19521 width=50) (actual time=12836.008..16071.668 rows=20000 loops=1)
Hash Cond: ((s.id)::text = (b.bbid)::text)
-> Hash Join (cost=78752.17..127830.60 rows=19523 width=50) (actual time=12825.755..16043.271 rows=20000 loops=1)
Hash Cond: (e.accountid = s.accountid)
-> Merge Join (cost=39100.97..79171.13 rows=19523 width=32) (actual time=11496.544..12614.860 rows=20000 loops=1)
Merge Cond: (a.id = e.accountid)
-> Index Scan using acct_ididx on account a (cost=0.00..37277.39 rows=1000002 width=24) (actual time=0.183..859.610 rows=999950 loops=1)
Filter: (banned <> true)
-> Sort (cost=39100.93..39149.73 rows=19523 width=8) (actual time=11496.268..11507.031 rows=20000 loops=1)
Sort Key: e.accountid
Sort Method: external sort Disk: 472kB
-> Bitmap Heap Scan on efoo e (cost=368.23..37709.63 rows=19523 width=8) (actual time=14.640..11395.226 rows=20000 loops=1)
Recheck Cond: (packageid = 497)
Filter: ((startdate <= now()) AND (enddate > now()))
-> Bitmap Index Scan on efoo_packageid_idx (cost=0.00..363.35 rows=19523 width=0) (actual time=9.377..9.377 rows=20000 loops=1)
Index Cond: (packageid = 497)
-> Hash (cost=18850.09..18850.09 rows=1000009 width=42) (actual time=1326.158..1326.158 rows=1000009 loops=1)
-> Seq Scan on bb s (cost=0.00..18850.09 rows=1000009 width=42) (actual time=0.032..424.731 rows=1000009 loops=1)
-> Hash (cost=1.81..1.81 rows=81 width=11) (actual time=10.111..10.111 rows=81 loops=1)
-> Seq Scan on bbar b (cost=0.00..1.81 rows=81 width=11) (actual time=9.971..10.013 rows=81 loops=1)
Total runtime: 16206.639 ms
(24 rows)
Time: 16217,107 ms
--
GJ
--
GJ
"=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?=" <gryzman@gmail.com> writes: > so here are the plans, that's the real table run. Hmm, well this rowcount estimate is way off: > -> Hash Anti Join (cost=376.60..37791.22 rows=1 > width=8) (actual time=15.195..8216.448 rows=20000 loops=1) The fact that it's getting a faster plan despite being completely wrong about the rowcount means that the cost parameters are way off for your situation. It looks like you are testing a case where the tables all fit in memory. Do you expect that to be the reality for your production use? If so, you might want to reduce random_page_cost to something close to 1 to reflect it. If not, it'd be a good idea to test with more realistically-sized tables before deciding what's "faster". I'm not sure why the rowcount estimate is so far off, but the antijoin code is all new and probably there's an estimation bug in there somewhere. (You didn't get this plan, or anything at all like it, from 8.1 ;-)) regards, tom lane
"Grzegorz Jaśkiewicz" <gryzman@gmail.com> writes:> so here are the plans, that's the real table run.Hmm, well this rowcount estimate is way off:The fact that it's getting a faster plan despite being completely wrong
> -> Hash Anti Join (cost=376.60..37791.22 rows=1
> width=8) (actual time=15.195..8216.448 rows=20000 loops=1)
about the rowcount means that the cost parameters are way off for your
situation. It looks like you are testing a case where the tables all
fit in memory. Do you expect that to be the reality for your production
use? If so, you might want to reduce random_page_cost to something
close to 1 to reflect it. If not, it'd be a good idea to test with more
realistically-sized tables before deciding what's "faster".
I'm not sure why the rowcount estimate is so far off, but the antijoin
code is all new and probably there's an estimation bug in there
somewhere. (You didn't get this plan, or anything at all like it,
from 8.1 ;-))
--
GJ
> Sort Method: external sort Disk: 1320kB One simple speedup could be upping Your work_mem to 2M for this query, so the sorts are in memory. btw: Last time I used Postgres, it did not show the sort method. Cool. Greetings Marcin Mank
It looks like you are testing a case where the tables all
fit in memory. Do you expect that to be the reality for your production
use? If so, you might want to reduce random_page_cost to something
close to 1 to reflect it. If not, it'd be a good idea to test with more
realistically-sized tables before deciding what's "faster".
--
GJ
"=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?=" <gryzman@gmail.com> writes: > On Thu, Oct 23, 2008 at 12:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm not sure why the rowcount estimate is so far off, but the antijoin >> code is all new and probably there's an estimation bug in there >> somewhere. (You didn't get this plan, or anything at all like it, >> from 8.1 ;-)) >> > nope, that's up2date cvs head. I always test stuff on cvs head first, I just committed a patch that might help a bit with that. regards, tom lane