Обсуждение: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?

Поиск
Список
Период
Сортировка
Something goes wrong that this query plan thinks there is only gonna be
1 row from (SELECT uid FROM alog ... ) so chooses such query plan, and
thus it runs forever (at least so long that I didn't bother to wait,
like 10 minutes):


miernik=> EXPLAIN UPDATE cnts SET p0 = FALSE WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1);
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Nested Loop IN Join  (cost=0.00..3317.34 rows=1 width=44)
   ->  Seq Scan on cnts  (cost=0.00..36.00 rows=2000 width=44)
   ->  Index Scan using alog_uid_idx on alog  (cost=0.00..296.95 rows=1 width=4)
         Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
         Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric))
(5 rows)


But if I give him only the inner part, it makes reasonable assumptions
and runs OK:



miernik=> EXPLAIN SELECT uid FROM alog WHERE pid = 3452654 AND o = 1;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Bitmap Heap Scan on alog  (cost=100.21..9559.64 rows=3457 width=4)
   Recheck Cond: ((pid = 3452654::numeric) AND (o = 1::numeric))
   ->  Bitmap Index Scan on alog_pid_o_idx  (cost=0.00..99.35 rows=3457 width=0)
         Index Cond: ((pid = 3452654::numeric) AND (o = 1::numeric))
(4 rows)



Can't show you EXPLAIN ANALYZE for the first one, as it also runds
forver. For the second one, its consistent with the EXPLAIN.



Before it was running OK, but I recently disabled autovacuum and now run
VACUUM manually serveal times a day, and run ANALYZE manually on alog
and cnts tables before runnign the above. How may I fix this to work?

shared_buffers = 5MB
work_mem = 1MB
Machine is a 48 MB RAM Xen.

But not the disabling autovacuum broke it, but running ANALYZE manually
on the tables broke it, and I don't know why, I thougt ANALYZE would
improve the guesses?

--
Miernik
http://miernik.name/

Miernik <public@public.miernik.name> writes:
> miernik=> EXPLAIN UPDATE cnts SET p0 = FALSE WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1);
>                                           QUERY PLAN
> -----------------------------------------------------------------------------------------------
>  Nested Loop IN Join  (cost=0.00..3317.34 rows=1 width=44)
>    ->  Seq Scan on cnts  (cost=0.00..36.00 rows=2000 width=44)
>    ->  Index Scan using alog_uid_idx on alog  (cost=0.00..296.95 rows=1 width=4)
>          Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
>          Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric))
> (5 rows)

> But if I give him only the inner part, it makes reasonable assumptions
> and runs OK:

What's the results for

explain select * from cnts, alog where alog.uid = cnts.uid

?

If necessary, turn off enable_hashjoin and enable_mergejoin so we can
see a comparable plan.  I'm suspecting it thinks the condition on
uid is more selective than the one on the other index.

            regards, tom lane

Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Miernik <public@public.miernik.name> writes:
>> miernik=> EXPLAIN UPDATE cnts SET p0 = FALSE WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1);
>>                                           QUERY PLAN
>> -----------------------------------------------------------------------------------------------
>>  Nested Loop IN Join  (cost=0.00..3317.34 rows=1 width=44)
>>    ->  Seq Scan on cnts  (cost=0.00..36.00 rows=2000 width=44)
>>    ->  Index Scan using alog_uid_idx on alog  (cost=0.00..296.95 rows=1 width=4)
>>          Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
>>          Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric))
>> (5 rows)
>
>> But if I give him only the inner part, it makes reasonable assumptions
>> and runs OK:
>
> What's the results for
>
> explain select * from cnts, alog where alog.uid = cnts.uid

miernik=> explain select * from cnts, alog where alog.uid = cnts.uid;
                                 QUERY PLAN
----------------------------------------------------------------------------
 Hash Join  (cost=61.00..71810.41 rows=159220 width=76)
   Hash Cond: ((alog.uid)::integer = (cnts.uid)::integer)
   ->  Seq Scan on alog  (cost=0.00..54951.81 rows=3041081 width=37)
   ->  Hash  (cost=36.00..36.00 rows=2000 width=39)
         ->  Seq Scan on cnts  (cost=0.00..36.00 rows=2000 width=39)
(5 rows)

> If necessary, turn off enable_hashjoin and enable_mergejoin so we can
> see a comparable plan.

After doing that it thinks like this:

miernik=> explain select * from cnts, alog where alog.uid = cnts.uid;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Nested Loop  (cost=4.95..573640.43 rows=159220 width=76)
   ->  Seq Scan on cnts  (cost=0.00..36.00 rows=2000 width=39)
   ->  Bitmap Heap Scan on alog  (cost=4.95..285.80 rows=80 width=37)
         Recheck Cond: ((alog.uid)::integer = (cnts.uid)::integer)
         ->  Bitmap Index Scan on alog_uid_idx  (cost=0.00..4.93 rows=80 width=0)
               Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
(6 rows)

Trying EXPLAIN ANALZYE now on this makes it run forever...

How can I bring it back to working? Like un-run ANALYZE on that table or
something? All was running reasonably well before I changed from
autovacuum to running ANALYZE manually, and I thought I would improve
performance... ;(

--
Miernik
http://miernik.name/

Miernik <public@public.miernik.name> wrote:
> How can I bring it back to working? Like un-run ANALYZE on that table or
> something? All was running reasonably well before I changed from
> autovacuum to running ANALYZE manually, and I thought I would improve
> performance... ;(

I now removed all manual ANALYZE commands from the scripts, set
enable_hashjoin = on
enable_mergejoin = on
and set on autovacuum, but it didn't bring back the performance of that query :(

--
Miernik
http://miernik.name/

Miernik <public@public.miernik.name> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If necessary, turn off enable_hashjoin and enable_mergejoin so we can
>> see a comparable plan.

> After doing that it thinks like this:

> miernik=> explain select * from cnts, alog where alog.uid = cnts.uid;
>                                        QUERY PLAN
> -----------------------------------------------------------------------------------------
>  Nested Loop  (cost=4.95..573640.43 rows=159220 width=76)
>    ->  Seq Scan on cnts  (cost=0.00..36.00 rows=2000 width=39)
>    ->  Bitmap Heap Scan on alog  (cost=4.95..285.80 rows=80 width=37)
>          Recheck Cond: ((alog.uid)::integer = (cnts.uid)::integer)
>          ->  Bitmap Index Scan on alog_uid_idx  (cost=0.00..4.93 rows=80 width=0)
>                Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
> (6 rows)

> Trying EXPLAIN ANALZYE now on this makes it run forever...

It couldn't run very long if those rowcounts were accurate.  How many
rows in "cnts" really?  How big is "alog", and how many of its rows join
to "cnts"?

While I'm looking at this, what's the real datatypes of the uid columns?
Those explicit coercions seem a bit fishy.

> How can I bring it back to working?

It's premature to ask for a solution when we don't understand the
problem.

            regards, tom lane

On Sat, Aug 09, 2008 at 05:37:29PM -0400, Tom Lane wrote:
> > miernik=> explain select * from cnts, alog where alog.uid = cnts.uid;
> >                                        QUERY PLAN
> > -----------------------------------------------------------------------------------------
> >  Nested Loop  (cost=4.95..573640.43 rows=159220 width=76)
> >    ->  Seq Scan on cnts  (cost=0.00..36.00 rows=2000 width=39)
> >    ->  Bitmap Heap Scan on alog  (cost=4.95..285.80 rows=80 width=37)
> >          Recheck Cond: ((alog.uid)::integer = (cnts.uid)::integer)
> >          ->  Bitmap Index Scan on alog_uid_idx  (cost=0.00..4.93 rows=80 width=0)
> >                Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
> > (6 rows)
>
> > Trying EXPLAIN ANALZYE now on this makes it run forever...
>
> It couldn't run very long if those rowcounts were accurate.

Well, count "over 5 minutes" as "forever".

> How many
> rows in "cnts" really?  How big is "alog", and how many of its rows join
> to "cnts"?

cnts is exactly 1000 rows.
alog as a whole is now 3041833 rows

"SELECT uid FROM alog WHERE pid = 3452654 AND o = 1" gives 870 rows (202
of them NULL), but if I would first try to JOIN alog to cnts, that would
be really huge, like roughly 100000 rows, so to have this work
reasonably well, it MUST first filter alog on pid and o, and then JOIN
the result to cnts, not the other way around. Trying to fist JOIN alog to
cnts, and then filter the whole thing on pid and o is excessively stupid
in this situation, isn't it?

>
> While I'm looking at this, what's the real datatypes of the uid columns?
> Those explicit coercions seem a bit fishy.

uid is of DOMAIN uid which is defined as:

CREATE DOMAIN uid AS integer CHECK (VALUE > 0);

But I don't think its a problem. It was working perfectly for serveral
months until yesterday I decided to mess with autovacuum and manual
ANALYZE.

--
Miernik
http://miernik.name/

Miernik <public@public.miernik.name> wrote:
> Something goes wrong that this query plan thinks there is only gonna be
> 1 row from (SELECT uid FROM alog ... ) so chooses such query plan, and
> thus it runs forever (at least so long that I didn't bother to wait,
> like 10 minutes):
>
>
> miernik=> EXPLAIN UPDATE cnts SET p0 = FALSE WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1);
>                                          QUERY PLAN
> -----------------------------------------------------------------------------------------------
> Nested Loop IN Join  (cost=0.00..3317.34 rows=1 width=44)
>   ->  Seq Scan on cnts  (cost=0.00..36.00 rows=2000 width=44)
>   ->  Index Scan using alog_uid_idx on alog  (cost=0.00..296.95 rows=1 width=4)
>         Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
>         Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric))
> (5 rows)

Well, in fact its not only the autovacuum/manual VACUUM ANALYZE that
changed, its a new copy of the cnts table with only 1000 rows, and
before it was a 61729 row table. The new, smaller, 1000 row table is
recreated, but I have a copy of the old 61729 row table, and guess what?
It runs correctly! And the query plan of the exactly the same query, on
a table of the exactly same structure and indexes, differing only by
having 61729 rows instead of 1000 rows, is like this:

I've done a SELECT uid plan, instead of an UPDATE plan, but it should
be no difference. This is a plan that is quick:

miernik=> EXPLAIN SELECT uid FROM cnts_old WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1);
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Nested Loop  (cost=9077.07..9238.61 rows=12 width=4)
   ->  HashAggregate  (cost=9077.07..9077.29 rows=22 width=4)
         ->  Bitmap Heap Scan on alog  (cost=93.88..9069.00 rows=3229 width=4)
               Recheck Cond: ((pid = 3452654::numeric) AND (o = 1::numeric))
               ->  Bitmap Index Scan on alog_pid_o  (cost=0.00..93.07 rows=3229 width=0)
                     Index Cond: ((pid = 3452654::numeric) AND (o = 1::numeric))
   ->  Index Scan using cnts_old_pkey on cnts_old  (cost=0.00..7.32 rows=1 width=4)
         Index Cond: ((cnts_old.uid)::integer = (alog.uid)::integer)
(8 rows)


I present a SELECT uid plan with the 1000 table also below, just to be
sure, this is the "bad" plan, that takes forever:

miernik=> EXPLAIN SELECT uid FROM cnts WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1);
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Nested Loop IN Join  (cost=0.00..3532.70 rows=1 width=4)
   ->  Seq Scan on cnts  (cost=0.00..26.26 rows=1026 width=4)
   ->  Index Scan using alog_uid_idx on alog  (cost=0.00..297.32 rows=1 width=4)
         Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
         Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric))
(5 rows)


I've also got a version of the cnts table with only 14 rows, called
cnts_small, and the query plan on that one is below:

miernik=> EXPLAIN SELECT uid FROM cnts_small WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1);
                                                                          QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop IN Join  (cost=99.05..1444.29 rows=1 width=4)
   ->  Seq Scan on cnts_small  (cost=0.00..1.14 rows=14 width=4)
   ->  Bitmap Heap Scan on alog  (cost=99.05..103.07 rows=1 width=4)
         Recheck Cond: (((alog.uid)::integer = (cnts_small.uid)::integer) AND (alog.pid = 3452654::numeric) AND (alog.o
=1::numeric)) 
         ->  BitmapAnd  (cost=99.05..99.05 rows=1 width=0)
               ->  Bitmap Index Scan on alog_uid_idx  (cost=0.00..5.21 rows=80 width=0)
                     Index Cond: ((alog.uid)::integer = (cnts_small.uid)::integer)
               ->  Bitmap Index Scan on alog_pid_o  (cost=0.00..92.78 rows=3229 width=0)
                     Index Cond: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric))
(9 rows)

That one is fast too. And the structure and indexes of cnts_small is
exactly the same as of cnts and cnts_old. So it works OK if I use a 14
row table and if I use a 61729 row table, but breaks when I use a 1000
row table. Any ideas?

--
Miernik
http://miernik.name/

Miernik <public@public.miernik.name> wrote:
> I present a SELECT uid plan with the 1000 table also below, just to be
> sure, this is the "bad" plan, that takes forever:
>
> miernik=> EXPLAIN SELECT uid FROM cnts WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1);
>                                          QUERY PLAN
> -----------------------------------------------------------------------------------------------
> Nested Loop IN Join  (cost=0.00..3532.70 rows=1 width=4)
>   ->  Seq Scan on cnts  (cost=0.00..26.26 rows=1026 width=4)
>   ->  Index Scan using alog_uid_idx on alog  (cost=0.00..297.32 rows=1 width=4)
>         Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
>         Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric))
> (5 rows)

If I reduce the number of rows in cnts to 100, I can actually make an
EXPLAIN ANALYZE with this query plan finish in reasonable time:

miernik=> EXPLAIN ANALYZE SELECT uid FROM cnts WHERE uid IN (SELECT uid FROM alog WHERE pid = 555949 AND odp = 1);
                                                                   QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop IN Join  (cost=0.00..3585.54 rows=1 width=4) (actual time=51831.430..267844.815 rows=7 loops=1)
   ->  Seq Scan on cnts  (cost=0.00..14.00 rows=700 width=4) (actual time=0.005..148.464 rows=100 loops=1)
   ->  Index Scan using alog_uid_idx on alog  (cost=0.00..301.02 rows=1 width=4) (actual time=2676.959..2676.959 rows=0
loops=100)
         Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
         Filter: ((alog.pid = 555949::numeric) AND (alog.odp = 1::numeric))
 Total runtime: 267844.942 ms
(6 rows)

The real running times are about 10 times more than the estimates. Is
that normal?

--
Miernik
http://miernik.name/

Re: why query plan for the inner SELECT of WHERE x IN is wrong, but when run the inner query alone is OK?

От
"Scott Carey"
Дата:
It is hardware dependent.   The estimates are not time estimates, but on an arbitrary scale.

On the server I work with, the estimates are almost always 10x larger than the run times, and sometimes more than 50x.

(many GBs RAM, 8 CPU cores, more than 10 disks, standard optimizer settings other than statistics sample sizes and increased common values for columns).

-Scott

On Sat, Aug 9, 2008 at 7:20 PM, Miernik <public@public.miernik.name> wrote:
> I present a SELECT uid plan with the 1000 table also below, just to be
> sure, this is the "bad" plan, that takes forever:
>
> miernik=> EXPLAIN SELECT uid FROM cnts WHERE uid IN (SELECT uid FROM alog WHERE pid = 3452654 AND o = 1);
>                                          QUERY PLAN
> -----------------------------------------------------------------------------------------------
> Nested Loop IN Join  (cost=0.00..3532.70 rows=1 width=4)
>   ->  Seq Scan on cnts  (cost=0.00..26.26 rows=1026 width=4)
>   ->  Index Scan using alog_uid_idx on alog  (cost=0.00..297.32 rows=1 width=4)
>         Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
>         Filter: ((alog.pid = 3452654::numeric) AND (alog.o = 1::numeric))
> (5 rows)

If I reduce the number of rows in cnts to 100, I can actually make an
EXPLAIN ANALYZE with this query plan finish in reasonable time:

miernik=> EXPLAIN ANALYZE SELECT uid FROM cnts WHERE uid IN (SELECT uid FROM alog WHERE pid = 555949 AND odp = 1);
                                                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop IN Join  (cost=0.00..3585.54 rows=1 width=4) (actual time=51831.430..267844.815 rows=7 loops=1)
  ->  Seq Scan on cnts  (cost=0.00..14.00 rows=700 width=4) (actual time=0.005..148.464 rows=100 loops=1)
  ->  Index Scan using alog_uid_idx on alog  (cost=0.00..301.02 rows=1 width=4) (actual time=2676.959..2676.959 rows=0 loops=100)
        Index Cond: ((alog.uid)::integer = (cnts.uid)::integer)
        Filter: ((alog.pid = 555949::numeric) AND (alog.odp = 1::numeric))
 Total runtime: 267844.942 ms
(6 rows)

The real running times are about 10 times more than the estimates. Is
that normal?

--
Miernik
http://miernik.name/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance