Discussion: Is this planner choice easily explained?


Is this planner choice easily explained?

From:
Philip Warner
Date:
This is a summary from a thread on the dbmail mailing list.

I am trying to understand some apparently odd behaviour with 7.2.1 and
7.2.3. Adding a LIMIT statement causes what seems like a very bad choice of
strategy. The first statement is fine:

     explain SELECT messageblk FROM messageblks WHERE message_idnr = 100::bigint
     ORDER BY messageblk_idnr;

gives:

     Sort  (cost=5793.33..5793.33 rows=1453 width=40)
       ->  Index Scan using messageblks_msg_idx on messageblks  (cost=0.00..5716.99 rows=1453 width=40)

and returns almost instantly, whereas, just adding a limit:

     explain SELECT messageblk FROM messageblks WHERE message_idnr = 100::bigint
     ORDER BY messageblk_idnr
     limit 1;

gives:

     Limit  (cost=0.00..777.50 rows=1 width=40)
       ->  Index Scan using messageblks_id_idx on messageblks  (cost=0.00..1129984.15 rows=1453 width=40)


which takes several minutes to run.

The relevant metadata is:

                                Table "messageblks"
      Column      |  Type  |                       Modifiers
------------------+--------+--------------------------------------------------------
 messageblk_idnr  | bigint | not null default nextval('messageblk_idnr_seq'::text)
 message_idnr     | bigint | not null default '0'
 messageblk       | text   | not null
 blocksize        | bigint | not null default '0'
Indexes: messageblks_msg_idx
Primary key: messageblks_pkey
Unique keys: messageblks_id_idx


Index "messageblks_id_idx"
      Column      |  Type
-----------------+--------
  messageblk_idnr | bigint
unique btree


Index "messageblks_msg_idx"
     Column    |  Type
--------------+--------
  message_idnr | bigint
btree


If anyone could explain the likely reasons for the choice, I would be very
interested. Even given the planner estimates, they don't look like sensible
choices.




----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 03 5330 3172          |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                  |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Is this planner choice easily explained?

From:
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
> I am trying to understand some apparently odd behaviour with 7.2.1 and
> 7.2.3. Adding a LIMIT statement causes what seems like a very bad choice of
> strategy. The first statement is fine:

Could we see the pg_stats rows for the columns used in these queries?
(You have done a vacuum analyze recently, I trust...)

> If anyone could explain the likely reasons for the choice, I would be very
> interested. Even given the planner estimates, they don't look like sensible
> choices.

Why do you say that?  It's taking a plan estimated at 777 cost units
over one that is estimated at (presumably) 5793 units.

            regards, tom lane

Re: Is this planner choice easily explained?

From:
Philip Warner
Date:
>
>Could we see the pg_stats rows for the columns used in these queries?
>(You have done a vacuum analyze recently, I trust...)

Strange you should ask; it is actually taking some effort to persuade them
it's a good idea. They also don't believe that a LIMIT clause affects
strategy choice, so it's an uphill battle. As a result, my strong suspicion
is that there will be no results in pg_stats - but I have asked for the data.

I am mainly interested in understanding the output from this point of view
-- and in understanding the choice that has been made in such a 'boundary
case' (when no analyze has been done).

Am I correct in my interpretation that:

     explain SELECT messageblk FROM messageblks WHERE message_idnr = 100::bigint
     ORDER BY messageblk_idnr;

     Sort  (cost=5793.33..5793.33 rows=1453 width=40)
       ->  Index Scan using messageblks_msg_idx on messageblks  (cost=0.00..5716.99 rows=1453 width=40)

means it expects to get 1453 rows based on a search for a specific key
(hence why it has a high cost)? Based on other tests I have done, I have
concluded that it assumes a selectivity of 0.5% for non-unique indexes - is
that right?
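That guess can be sanity-checked with a little arithmetic. Assuming the fallback selectivity really is 0.5% (an assumption here, not confirmed from the source code), the 1453-row estimate implies a table of roughly 290,600 tuples:

```python
# Back-of-envelope check: infer the table size the planner must be assuming,
# given an assumed 0.5% default selectivity for "col = const" with no stats.
DEFAULT_EQ_SEL = 0.005           # assumed fallback selectivity
estimated_rows = 1453            # from the EXPLAIN output above
implied_table_rows = estimated_rows / DEFAULT_EQ_SEL
print(round(implied_table_rows))  # 290600 tuples implied
```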

Whereas with:

     explain SELECT messageblk FROM messageblks WHERE message_idnr = 100::bigint
     ORDER BY messageblk_idnr
     limit 1;

     Limit  (cost=0.00..777.50 rows=1 width=40)
       ->  Index Scan using messageblks_id_idx on messageblks  (cost=0.00..1129984.15 rows=1453 width=40)

it looks like the high cost on the last line here is based on the number of
pages/tuples in the file, and that the limit is causing 1/1453rd of the
cost to be applied. It looks like it gets the 1453 from the basic default
selectivity again.
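The 777.50 figure is consistent with that reading: interpolating one row's worth of the child scan's total cost lands very close to the number EXPLAIN shows. This is a sketch of the presumed interpolation, not the planner's actual code:

```python
# Presumed LIMIT costing: take k/rows of the child plan's run cost.
startup, total, rows, k = 0.00, 1129984.15, 1453, 1
limit_cost = startup + (k / rows) * (total - startup)
print(round(limit_cost, 2))  # ~777.69, close to the 777.50 EXPLAIN shows
```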

The problem I have with this is that it seems that the estimate for the
LIMIT is assuming that the first row returned will be the right one; in
fact it has no guarantee that the specified criteria will be satisfied by
the row. Is this assumption likely to be wrong when we do have stats? ISTM
that a LIMIT with a WHERE clause that is not satisfied by any index used in
the search should not be deemed to reduce the result set size as
drastically. In fact, it may actually have to scan the entire result set
to get the row. Err...I think 8-).





Re: Is this planner choice easily explained?

From:
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
> Strange you should ask; it is actually taking some effort to persuade them
> it's a good idea. They also don't believe that a LIMIT clause affects
> strategy choice, so it's an uphill battle.

If they don't believe that, they are wrong (and pretty muleheaded, to
continue disbelieving it in the face of indisputable evidence to the
contrary...).  The entire reason that the planner estimates both startup
and total cost is so that it can make an informed choice about what to do
in the presence of LIMIT.

> Am I correct in my interpretation that:

>      explain SELECT messageblk FROM messageblks WHERE message_idnr = 100::bigint
>      ORDER BY messageblk_idnr;

>      Sort  (cost=5793.33..5793.33 rows=1453 width=40)
>        ->  Index Scan using messageblks_msg_idx on messageblks  (cost=0.00..5716.99 rows=1453 width=40)

> means it expects to get 1453 rows based on a search for a specific key
> (hence why it has a high cost)?

It looks that way.  It's always tricky to be sure what the index
condition actually is (which is why 7.3's EXPLAIN displays conditions
explicitly) --- but I'd interpret the above as an indexed search for
message_idnr = 100 (estimated to yield 1453 rows) followed by a sort
on messageblk_idnr.

> Based on other tests I have done, I have
> concluded that it assumes a selectivity of 0.5% for non-unique indexes - is
> that right?

Yeah, I think that's the default selectivity in the absence of pg_stats
data, at least in recent releases.  Look in src/backend/utils/adt/selfuncs.c.

> Whereas with:

>      explain SELECT messageblk FROM messageblks WHERE message_idnr = 100::bigint
>      ORDER BY messageblk_idnr
>      limit 1;

>      Limit  (cost=0.00..777.50 rows=1 width=40)
>        ->  Index Scan using messageblks_id_idx on messageblks  (cost=0.00..1129984.15 rows=1453 width=40)

> it looks like the high cost on the last line here is based on the number of
> pages/tuples in the file, and that the limit is causing 1/1453th of the
> cost to be applied. It looks like it gets the 1453 as a basic default
> selectivity again.

What is presumably happening here is that the index is being used only
to retrieve rows in messageblk_idnr order.  There is no index condition
per se (ie, the whole index is potentially scanned) --- but there is a
filter condition ("qpqual") for message_idnr = 100 on the indexscan's
output, so the indexscan node is still estimated to yield the same 1453
rows.  But notice the very high cost to run the indexscan to completion;
under the hood it'd be visiting all the rows.

The reason this plan is chosen is that because of the LIMIT 1, it's not
going to be run to completion.  Essentially, the system is betting that
scanning the messageblk_idnr index in order till it hits the first row
with message_idnr = 100 will be quicker than finding and sorting all the
rows with message_idnr = 100 to get the one with lowest messageblk_idnr.
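Under the planner's uniform-scatter assumption that bet looks reasonable on paper: with 1453 matches spread evenly through roughly 290,600 tuples (a table size assumed here, inferred from the 0.5% default selectivity), the first match should turn up after only a couple of hundred index entries:

```python
# Expected index entries visited before the first match, assuming matches
# are spread uniformly through the index (table size is an assumption).
total_tuples, matches = 290_600, 1453
expected_visited = total_tuples / matches
print(round(expected_visited))  # 200 entries: a cheap-looking scan
```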

This bet is evidently wrong, but it's hard to tell whether it's wrong
because no statistics are available, or because the system isn't making
the right deductions from the stats it has, or because the stats aren't
adequate to model the situation.  (For example, we currently do not have
any cross-column correlation stats.  If message_idnr and messageblk_idnr
are strongly correlated, which I'm suspecting is likely, the rows with
message_idnr = 100 would not be randomly scattered in the
messageblk_idnr index --- but the system is assuming they will be in
order to estimate how long it will take to find the first one.)
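That assumption is easy to break in a quick simulation (hypothetical numbers, matching the estimates above): when the matching rows are clustered at one end of the index, as strongly correlated id columns would produce, the ordered scan must walk nearly the whole index before its first hit:

```python
import random

random.seed(0)
N, M = 290_600, 1453  # illustrative table size and match count

def steps_to_first_hit(match_positions):
    """Index entries an ordered scan visits before reaching the first match."""
    return min(match_positions) + 1

# The planner's implicit assumption: matches scattered uniformly.
uniform = random.sample(range(N), M)
print(steps_to_first_hit(uniform))    # typically a few hundred steps

# Correlated columns: matches clustered at the far end of the index.
clustered = range(N - M, N)
print(steps_to_first_hit(clustered))  # 289148 steps: almost the whole index
```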

            regards, tom lane

Re: Is this planner choice easily explained?

From:
Philip Warner
Date:
At 10:38 AM 22/11/2002 -0500, Tom Lane wrote:
>If they don't believe that, they are wrong (and pretty muleheaded, to
>continue disbelieving it in the face of indisputable evidence to the
>contrary...).

I am not at all sure what their mindset is -- I may have just misunderstood
their responses. Very odd. Up until now I thought I held the record for
mule-headedness on this sort of thing.


>This bet is evidently wrong, but it's hard to tell whether it's wrong
>because no statistics are available, or because the system isn't making
>the right deductions from the stats it has, or because the stats aren't
>adequate to model the situation.

Based on my own database with about 30GB of mail, the planner is easily
capable of coming up with the right strategy here -- so long as an analyze
is done. I have managed to construct trivial examples that exactly mirror
the behaviour above, but which work properly after an analyze.


>(For example, we currently do not have
>any cross-column correlation stats.  If message_idnr and messageblk_idnr
>are strongly correlated, which I'm suspecting is likely, the rows with
>message_idnr = 100 would not be randomly scattered in the
>messageblk_idnr index --- but the system is assuming they will be in
>order to estimate how long it will take to find the first one.)

Indeed; I think this is exactly what we are seeing.

Thanks very much for the insights; I have relayed this to the other list --
I'll let you know if anything unexpected happens.



