Re: BUG #15007: LIMIT not respected in sub-queries

From  Tomas Vondra
Subject  Re: BUG #15007: LIMIT not respected in sub-queries
Date
Msg-id  f74052d0-828c-428d-efe6-0d32d029fa21@2ndquadrant.com
In reply to  BUG #15007: LIMIT not respected in sub-queries  (PG Bug reporting form <noreply@postgresql.org>)
Responses  Re: BUG #15007: LIMIT not respected in sub-queries  (Will Storey <will@summercat.com>)
List  pgsql-bugs
Hi,

On 01/11/2018 10:16 PM, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      15007
> Logged by:          Will Storey
> Email address:      will@summercat.com
> PostgreSQL version: 10.1
> Operating system:   Ubuntu 16.04
> Description:        
> 
> Hello,
> 
> I am not sure this is a bug. But it is surprising to me and seems to
> contradict the documentation in terms of join nesting.
> 
> I have a SELECT query with a sub-SELECT in it. The sub-SELECT has a LIMIT
> clause. I've found that sometimes I receive more rows (at most one extra in
> my testing) than the LIMIT, where I expected only as many rows as the LIMIT.
> This depends on the query plan. With some plans it never happens, and with
> others it happens frequently.
> 
> In looking into this behaviour, I came across hints that this is a known
> quirk. I found bug reports related specifically to UPDATE/DELETE that sound
> similar to this, but no mention that the behaviour can happen with SELECT:
> 
> https://dba.stackexchange.com/questions/69471/postgres-update-limit-1?noredirect=1&lq=1
> (note the comments on the accepted answer)
> https://www.postgresql.org/message-id/1399649764731-5803406.post%40n5.nabble.com
> (and the thread)
> https://www.postgresql.org/message-id/1385918761589-5781081.post%40n5.nabble.com
> 
> This happens with both PostgreSQL 10.1 on Ubuntu 16.04 (from the PostgreSQL
> repos: PostgreSQL 10.1 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
> 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit) as well as on PostgreSQL
> 9.6.5 (where I initially encountered the behaviour).
> 

I think you're right that those issues are related - a change of plan
may affect the results in a somewhat non-deterministic way. In the cases
you linked it's an UPDATE touching additional rows; in your case it's
more about random() during rescans. I'll get to that in a minute.

> Unfortunately my test case is not very clean and it is somewhat long, so
> I've put it in a gist on GitHub:
> 
> https://gist.github.com/horgh/f3e8ede81d866844e7d162d677968bf0
> 
> The SELECT query (run by the Perl program) quickly prints out that it
> receives 6 rows.
> 
> As you can see in the EXPLAIN ANALYZE output, the innermost Nested Loop has
> loops > 1. I believe this is the cause of the behaviour. If I tweak the test
> to have a plan where that node runs before the Seq Scan, there are never
> more than 5 rows.
>

I went through the test case, and I think I know what's going on. The
script did not reproduce the issue for me, but I've been able to tweak
the query to use a plan that reproduces it much more reliably.

The key is to force a particular join order, and disable operations that
would materialize intermediate results of the t1 scan. That is, we want
something like this:

   -> nestloop
       -> nestloop t2 + t3
       -> seq scan t1

Firstly, I've replaced the dynamic random() condition with a static one:

    random() <= 0.5

this is not strictly necessary, but it simplifies the plan. We need the
random() call though, as I'll explain later.

Then I've disabled a bunch of plan nodes that would materialize results
of the t1 seq scan:

    set enable_material = off;
    set enable_sort = off;
    set enable_hashjoin = off;

And finally, I've disabled join reordering by setting

    set join_collapse_limit = 1;

Now, if you rewrite the query like this (which, combined with
join_collapse_limit=1, essentially just forces a particular join order
and nothing else):

explain analyze SELECT * FROM
(t3 JOIN t2 USING (t2_id)) JOIN
(SELECT * FROM t1
 WHERE t1_id IS NOT NULL AND
 t1_id < 100 AND
 t1_val LIKE 'h%' AND
 random() <= 0.5
 LIMIT 5
) AS t1
ON t3.t1_id = t1.t1_id
WHERE t2.t2_val LIKE 'he%';

you will get plans like this:


                              QUERY PLAN
------------------------------------------------------------------------
 Nested Loop  (cost=0.00..31.41 rows=5 width=21)
              (actual time=0.052..1.825 rows=7 loops=1)
   Join Filter: (t3.t1_id = t1.t1_id)
   Rows Removed by Join Filter: 73
   ->  Nested Loop  (cost=0.00..16.10 rows=16 width=14)
                    (actual time=0.030..0.917 rows=16 loops=1)
         Join Filter: (t3.t2_id = t2.t2_id)
         Rows Removed by Join Filter: 160
         ->  Seq Scan on t2  (cost=0.00..1.14 rows=11 width=10)
                             (actual time=0.015..0.041 rows=11 loops=1)
               Filter: ((t2_val)::text ~~ 'he%'::text)
         ->  Seq Scan on t3  (cost=0.00..1.16 rows=16 width=8)
                             (actual time=0.003..0.037 rows=16 loops=11)
   ->  Limit  (cost=0.00..0.84 rows=5 width=7)
              (actual time=0.008..0.037 rows=5 loops=16)
         ->  Seq Scan on t1  (cost=0.00..1.52 rows=9 width=7)
                             (actual time=0.004..0.016 rows=5 loops=16)
               Filter: ((t1_id IS NOT NULL) AND (t1_id < 100) AND
                        ((t1_val)::text ~~ 'h%'::text) AND
                         (random() <= '0.5'::double precision))
               Rows Removed by Filter: 5
 Planning time: 0.625 ms
 Execution time: 1.911 ms
(15 rows)


This should be equivalent to the original query, and should produce the
same results (modulo random() of course).

But notice it actually does produce 7 rows!

 Nested Loop  (cost=0.00..31.41 rows=5 width=21)
              (actual time=0.052..1.825 rows=7 loops=1)
                                             ^

The problem is that it ends up executing the sequential scan on t1
repeatedly (because it's the inner relation in a nested loop), and
because random() is volatile, the results of the scan are likely different.

Each rescan individually still respects the LIMIT, but the combined
result may be larger - each rescan can return a different set of rows,
so there may be more unique IDs, matching additional rows from the
other tables.
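
You can see the same effect in isolation by running the inner subquery
twice by hand - a minimal sketch using the t1 table from the test case
(the exact ids returned will of course vary):

    SELECT t1_id FROM t1
     WHERE t1_id IS NOT NULL AND t1_id < 100 AND
           t1_val LIKE 'h%' AND random() <= 0.5
     LIMIT 5;

    -- each execution still returns at most 5 rows, but most likely a
    -- different set of them, because random() is re-evaluated for
    -- every row on every execution
    SELECT t1_id FROM t1
     WHERE t1_id IS NOT NULL AND t1_id < 100 AND
           t1_val LIKE 'h%' AND random() <= 0.5
     LIMIT 5;

The union of the two result sets can easily contain more than 5
distinct ids, which is exactly what the nested loop ends up joining
against across its 16 rescans.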

> 
> I believe a better way to write this query would be to use a CTE.
> 

Yes - a CTE is evaluated only once and its result materialized, which
stabilizes the output of the random() function and eliminates the
nondeterminism during rescans.
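
For completeness, here's a sketch of what that rewrite might look like
(the CTE name is arbitrary; on 9.6 and 10 a CTE is always materialized,
so the LIMIT-ed scan of t1 runs exactly once):

    WITH t1_limited AS (
        SELECT * FROM t1
         WHERE t1_id IS NOT NULL AND
               t1_id < 100 AND
               t1_val LIKE 'h%' AND
               random() <= 0.5
         LIMIT 5
    )
    SELECT * FROM
      (t3 JOIN t2 USING (t2_id)) JOIN
      t1_limited AS t1 ON t3.t1_id = t1.t1_id
    WHERE t2.t2_val LIKE 'he%';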


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

