Re: BUG #15007: LIMIT not respected in sub-queries
От | Will Storey |
---|---|
Тема | Re: BUG #15007: LIMIT not respected in sub-queries |
Дата | |
Msg-id | 20180119015506.wdohq7r7h4rj2jfq@dev.null обсуждение исходный текст |
Ответ на | Re: BUG #15007: LIMIT not respected in sub-queries (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Ответы |
Re: BUG #15007: LIMIT not respected in sub-queries
|
Список | pgsql-bugs |
On Thu 2018-01-18 00:08:31 +0100, Tomas Vondra wrote: <snip> > I went through the test case, and I think I know what's going on. The > script did not reproduce the issue for me, but I think I've been able to > tweak the query to use a plan producing more issues. > > The key is to force a particular join order, and disable operations that > would materialize intermediate results of the t1 scan. That is, we want > something like this: > > -> nestloop > -> nestloop t2 + t3 > -> seq scan t1 > > Firstly, I've replaced the dynamic random() condition with a static one: > > random() <= 0.5 > > this is not strictly necessary, but it simplifies the plan. We need the > random() call though, as I'll explain later. > > Then I've disabled a bunch of plan nodes that would materialize results > of the t1 seq scan: > > set enable_material = off; > set enable_sort = off; > set enable_hashjoin = off; > > And finally, I've disabled join reordering by setting > > set join_collapse_limit = 1; > > Now, if you rewrite the query like this (which essentially just forces a > particular join order, when combined with join_collapse_limit=1, nothing > else): > > explain analyze SELECT * FROM > (t3 JOIN t2 USING (t2_id)) JOIN > (SELECT * FROM t1 > WHERE t1_id IS NOT NULL AND > t1_id < 100 AND > t1_val LIKE 'h%' AND > random() <= 0.5 > LIMIT 5 > ) AS t1 > ON t3.t1_id = t1.t1_id > WHERE t2.t2_val LIKE 'he%'; > > you will get plans like this: > > > QUERY PLAN > ------------------------------------------------------------------------ > Nested Loop (cost=0.00..31.41 rows=5 width=21) > (actual time=0.052..1.825 rows=7 loops=1) > Join Filter: (t3.t1_id = t1.t1_id) > Rows Removed by Join Filter: 73 > -> Nested Loop (cost=0.00..16.10 rows=16 width=14) > (actual time=0.030..0.917 rows=16 loops=1) > Join Filter: (t3.t2_id = t2.t2_id) > Rows Removed by Join Filter: 160 > -> Seq Scan on t2 (cost=0.00..1.14 rows=11 width=10) > (actual time=0.015..0.041 rows=11 loops=1) > Filter: ((t2_val)::text ~~ 'he%'::text) > -> Seq Scan on t3 (cost=0.00..1.16 rows=16 width=8) > (actual time=0.003..0.037 rows=16 loops=11) > -> Limit (cost=0.00..0.84 rows=5 width=7) > (actual time=0.008..0.037 rows=5 loops=16) > -> Seq Scan on t1 (cost=0.00..1.52 rows=9 width=7) > (actual time=0.004..0.016 rows=5 loops=16) > Filter: ((t1_id IS NOT NULL) AND (t1_id < 100) AND > ((t1_val)::text ~~ 'h%'::text) AND > (random() <= '0.5'::double precision)) > Rows Removed by Filter: 5 > Planning time: 0.625 ms > Execution time: 1.911 ms > (15 rows) > > > This should be equivalent to the original query, and should produce the > same results (modulo random() of course). > > But notice it actually does produce 7 rows! > > Nested Loop (cost=0.00..31.41 rows=5 width=21) > (actual time=0.052..1.825 rows=7 loops=1) > ^ > > The problem is that it ends up executing the sequential scan on t1 > repeatedly (because it's the inner relation in a nested loop), and > because random() is volatile, the results of the scan are likely different. > > Each rescan individually still respects the LIMIT, but when combined > result may be larger - there may be more unique IDs, matching additional > rows from the other tables. > > > > > I believe a better way to write this query would be to use a CTE. > > > > Yes, that will stabilize the output of the random() function, > eliminating the nondeterminism during rescans. Wow, thank you for the great explanation of what is going on! It sounds like this is not really a bug then and is just something to be expected when using LIMIT with such volatile joins. I suppose I expected that if there is a LIMIT then that would be the maximum number of rows the subquery would ever provide. The planner would have to force materializing/stabilizing in such cases it sounds like. Maybe that is not possible or not a good idea. It does seem like a pretty edge case. Thanks again!
В списке pgsql-bugs по дате отправления:
Предыдущее
От: Kyotaro HORIGUCHIДата:
Сообщение: Re: Index-only scan returns incorrect results when using acomposite GIST index with a gist_trgm_ops column.
Следующее
От: Sandeep ThakkarДата:
Сообщение: Re: BUG #15015: Zilib1.dll library missing if only command line tools installed