Обсуждение: planner index choice

Поиск
Список
Период
Сортировка

planner index choice

От
Chris
Дата:
Hi there,

I have a simple query where I don't understand the planner's choice to
use a particular index.

The main table looks like this:

# \d sq_ast_attr_val
                    Table "public.sq_ast_attr_val"
    Column    |         Type          |          Modifiers
-------------+-----------------------+------------------------------
  assetid     | character varying(15) | not null
  attrid      | integer               | not null
  contextid   | integer               | not null default 0
  custom_val  | text                  |
  use_default | character(1)          | not null default '1'::bpchar
Indexes:
     "ast_attr_val_pk" PRIMARY KEY, btree (assetid, attrid, contextid)
     "sq_ast_attr_val_assetid" btree (assetid)
     "sq_ast_attr_val_attrid" btree (attrid)
     "sq_ast_attr_val_concat" btree (((assetid::text || '~'::text) ||
attrid))
     "sq_ast_attr_val_contextid" btree (contextid)


The query:

SELECT
   assetid, custom_val
FROM
   sq_ast_attr_val
WHERE
   attrid IN (SELECT attrid FROM sq_ast_attr WHERE name =
'is_contextable' AND (type_code = 'metadata_field_select' OR
owning_type_code = 'metadata_field'))
   AND contextid = 0
INTERSECT
SELECT
   assetid, custom_val
FROM
   sq_ast_attr_val
WHERE
   assetid = '62321'
   AND contextid = 0;


The explain analyze plan:

http://explain.depesz.com/s/nWs

I'm not sure why it's picking the sq_ast_attr_val_contextid index to do
the contextid = 0 check, the other parts (attrid/assetid) are much more
selective.

If I drop that particular index:

http://explain.depesz.com/s/zp


All (I hope) relevant postgres info:

Centos 5.5 x86_64 running pg8.4.4.

Server has 8gig memory.

# select name, setting, source from pg_settings where name in
('shared_buffers', 'effective_cache_size', 'work_mem');
          name         | setting
----------------------+--------
shared_buffers        | 262144
effective_cache_size  | 655360
work_mem              | 32768

All planner options are enabled:

# select name, setting, source from pg_settings where name like 'enable_%';
        name        | setting | source
-------------------+---------+---------
  enable_bitmapscan | on      | default
  enable_hashagg    | on      | default
  enable_hashjoin   | on      | default
  enable_indexscan  | on      | default
  enable_mergejoin  | on      | default
  enable_nestloop   | on      | default
  enable_seqscan    | on      | default
  enable_sort       | on      | default
  enable_tidscan    | on      | default

Any insights welcome - thanks!

--
Postgresql & php tutorials
http://www.designmagick.com/


Re: planner index choice

От
Tom Lane
Дата:
Chris <dmagick@gmail.com> writes:
> The query:

> SELECT
>    assetid, custom_val
> FROM
>    sq_ast_attr_val
> WHERE
>    attrid IN (SELECT attrid FROM sq_ast_attr WHERE name =
> 'is_contextable' AND (type_code = 'metadata_field_select' OR
> owning_type_code = 'metadata_field'))
>    AND contextid = 0
> INTERSECT
> SELECT
>    assetid, custom_val
> FROM
>    sq_ast_attr_val
> WHERE
>    assetid = '62321'
>    AND contextid = 0;

> The explain analyze plan:
> http://explain.depesz.com/s/nWs

Hrm ... are you *certain* that's an 8.4 server?  Because the bit with

    Index Cond: (sq_ast_attr_val.attrid = "outer".attrid)

is a locution that EXPLAIN hasn't used since 8.1, according to a quick
check.  More recent versions don't say "outer".

The actual problem seems to be that choose_bitmap_and() is choosing to
add an indexscan on sq_ast_attr_val_contextid, even though this index
is a lot less selective than the sq_ast_attr_val_attrid scan it had
already picked.  I've seen that behavior before, and there were a series
of patches back in 2006-2007 that seem to have pretty much fixed it.
So that's another reason for suspecting you've got an old server version
there...

            regards, tom lane

Re: planner index choice

От
Chris
Дата:
Hi,

> Hrm ... are you *certain* that's an 8.4 server?

Yep.

# psql -U postgres -d db
psql (8.4.4)

db=# select version();
                                                      version

------------------------------------------------------------------------------------------------------------------
  PostgreSQL 8.4.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC)
4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
(1 row)


> The actual problem seems to be that choose_bitmap_and() is choosing to
> add an indexscan on sq_ast_attr_val_contextid, even though this index
> is a lot less selective than the sq_ast_attr_val_attrid scan it had
> already picked.  I've seen that behavior before, and there were a series
> of patches back in 2006-2007 that seem to have pretty much fixed it.
> So that's another reason for suspecting you've got an old server version
> there...

I just recreated the index and re-ran the explain analyze and it doesn't
give the "outer" bit any more - not sure how I got that before.

db=# begin;
BEGIN
db=# create index attr_val_contextid on sq_ast_attr_val(contextid);
CREATE INDEX
db=# analyze sq_ast_attr_val;
ANALYZE
db=# explain analyze SELECT
db-#     assetid, custom_val
db-# FROM
db-#     sq_ast_attr_val
db-# WHERE
db-#     attrid IN (SELECT attrid FROM sq_ast_attr WHERE name =
db(# 'is_contextable' AND (type_code = 'metadata_field_select' OR
db(# owning_type_code = 'metadata_field'))
db-#    AND contextid = 0
db-# INTERSECT
db-# SELECT
db-#    assetid, custom_val
db-# FROM
db-#    sq_ast_attr_val
db-# WHERE
db-#    assetid = '62321'
db-#    AND contextid = 0;

http://explain.depesz.com/s/br9

Without that index (again with an analyze after doing a rollback):

http://explain.depesz.com/s/gxH

--
Postgresql & php tutorials
http://www.designmagick.com/


Re: planner index choice

От
tv@fuzzy.cz
Дата:
> http://explain.depesz.com/s/br9
> http://explain.depesz.com/s/gxH

Well, I don't have time to do a thorough analysis right now, but in all
the plans you've posted there are quite high values in the "Rows x" column
(e.g. the 5727.5 value).

That means a significant difference in estimated and actual row number,
which may lead to poor choice of indexes etc. The planner may simply think
the index is better due to imprecise statistics etc.

Try to increase te statistics target for the columns, e.g.

ALTER TABLE table ALTER COLUMN column SET STATISTICS integer

where "integer" is between 0 and 1000 (the default value is 10 so use 100
or maybe 1000), run analyze and try to run the query again.

Tomas


Re: planner index choice

От
Tom Lane
Дата:
tv@fuzzy.cz writes:
>> http://explain.depesz.com/s/br9
>> http://explain.depesz.com/s/gxH

> Well, I don't have time to do a thorough analysis right now, but in all
> the plans you've posted there are quite high values in the "Rows x" column
> (e.g. the 5727.5 value).

> That means a significant difference in estimated and actual row number,
> which may lead to poor choice of indexes etc. The planner may simply think
> the index is better due to imprecise statistics etc.

Yeah.  The sq_ast_attr_val_attrid scan is a lot more selective than the
planner is guessing (3378 rows estimated vs an average of 15 actual),
and I think that is making the difference.  If you look at the estimated
row counts and costs, it's expecting that adding the second index will
cut the number of heap fetches about 7x, hence saving somewhere around
4800 cost units in the heapscan step, more than it thinks the indexscan
will cost.  But taking 15 row fetches down to 2 isn't nearly enough to
pay for the extra indexscan.

> Try to increase te statistics target for the columns, e.g.
> ALTER TABLE table ALTER COLUMN column SET STATISTICS integer

It's worth a try but I'm not sure how much it'll help.  A different line
of attack is to play with the planner cost parameters.  In particular,
reducing random_page_cost would reduce the estimated cost of the heap
fetches and thus discourage it from using the extra index.  If you're
working with mostly-cached tables then this would probably improve
behavior overall, too.

            regards, tom lane