Обсуждение: Index usage

Поиск
Список
Период
Сортировка

Index usage

От
Scott Matseas
Дата:
We're having a problem with one of our queries being slow. It appears to be due
to the index being used to go from tableA to tableB.

Here are the tables:
CREATE TABLE tableA
(
  table_idA int8 NOT NULL DEFAULT nextval('tableA_id_seq'::regclass),
  CONSTRAINT table_idA_pk PRIMARY KEY (table_idA),
)
WITHOUT OIDS;

CREATE TABLE tableB
(
  table_idB int8 NOT NULL DEFAULT nextval('tableB_id_seq'::regclass),
  table_idA int8 NOT NULL,
  direction char NOT NULL,
  CONSTRAINT tableB_pk PRIMARY KEY (table_idB),
  CONSTRAINT tableB_unq UNIQUE (table_idA, direction),
)
WITHOUT OIDS;

CREATE TABLE last_summarized
(
  summary_name varchar(64) NOT NULL,
  summarized_id int8,
  max_session_id int8,
  CONSTRAINT last_summarized_pk PRIMARY KEY (summary_name)
)
WITHOUT OIDS;

Here is the query:
explain
 SELECT * FROM
   last_summarized ls
   JOIN tableA s ON s.table_idA > ls.summarized_id AND s.table_idA
 <= ls.max_session_id
   LEFT JOIN tableB sf ON s.table_idA = sf.table_idA AND sf.direction = 'a'::"char"
   LEFT JOIN tableB sfb ON s.table_idA = sfb.table_idA AND sfb.direction = 'b'::"char"
  WHERE ls.summary_name::text = 'summary'::text

Size of tables in # of rows
tableA: 9,244,816
tableB: 15,398,497
last_summarized: 1


Explain of the above query:
"Hash Left Join  (cost=1811349.31..18546527.89 rows=1029087 width=294)"
"  Hash Cond: ("outer".table_idA = "inner".table_idA)"
"  ->  Hash Left Join  (cost=915760.88..7519203.61 rows=1029087 width=219)"
"        Hash Cond: ("outer".table_idA = "inner".table_idA)"
"        ->  Nested Loop  (cost=0.00..126328.57 rows=1029087 width=144)"
"              ->  Index Scan using last_summarized_pk on last_summarized ls  (cost=0.00..5.98 rows=1 width=82)"
"                    Index Cond: ((summary_name)::text = 'summary'::text)"
"              ->  Index Scan using table_idA_pk on tableA s  (cost=0.00..110886.29 rows=1029087 width=62)"
"                    Index Cond: ((s.table_idA > "outer".summarized_id) AND (s.table_idA <= "outer".max_session_id))"
"        ->  Hash  (cost=784763.16..784763.16 rows=8100289 width=75)"
"              ->  Bitmap Heap Scan on tableB sf  (cost=216418.55..784763.16 rows=8100289 width=75)"
"                    Recheck Cond: (direction = 'a'::"char")"
"                    ->  Bitmap Index Scan on tableB_unq  (cost=0.00..216418.55 rows=8100289 width=0)"
"                          Index Cond: (direction = 'a'::"char")"     <------ USING part of Index
"  ->  Hash  (cost=775968.61..775968.61 rows=7396725 width=75)"
"        ->  Bitmap Heap Scan on tableB sfb  (cost=216418.55..775968.61 rows=7396725 width=75)"
"              Recheck Cond: (direction = 'b'::"char")"
"              ->  Bitmap Index Scan on tableB_unq  (cost=0.00..216418.55 rows=7396725 width=0)"
"                    Index Cond: (direction = 'b'::"char")"     <------ USING part of Index

From the above explain see inline comment("<------ USING part of Index"). The table_idA column
looks like it is being ignored in the index Cond. If I enable sequential scan the Index Cond in
question gets replaced with a Seq scan.

Also if I disable enable_bitmapscan sometimes both columns of the index(tableB_unq) will be
used.

Does anyone know why we're experiencing this behavior?




Re: Index usage

От
Tom Lane
Дата:
Scott Matseas <smatseas@intrusic.com> writes:
> If I enable sequential scan the Index Cond in
> question gets replaced with a Seq scan.

What other planner parameters have you been fooling with?

With no data in the tables, I get a reasonably sane-looking plan,
so I'm thinking you've chosen bad values for something or other
(starting with enable_seqscan = off ;-))

explain
 SELECT * FROM
   last_summarized ls
   JOIN tableA s ON s.table_idA > ls.summarized_id AND s.table_idA
 <= ls.max_session_id
   LEFT JOIN tableB sf ON s.table_idA = sf.table_idA AND sf.direction = 'a'::char
   LEFT JOIN tableB sfb ON s.table_idA = sfb.table_idA AND sfb.direction = 'b'::char
  WHERE ls.summary_name::text = 'summary'::text ;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=6.16..54.51 rows=216 width=116)
   ->  Nested Loop Left Join  (cost=6.16..42.05 rows=216 width=95)
         ->  Nested Loop  (cost=6.16..29.58 rows=216 width=74)
               ->  Index Scan using last_summarized_pk on last_summarized ls  (cost=0.00..8.02 rows=1 width=66)
                     Index Cond: ((summary_name)::text = 'summary'::text)
               ->  Bitmap Heap Scan on tablea s  (cost=6.16..18.32 rows=216 width=8)
                     Recheck Cond: ((s.table_ida > ls.summarized_id) AND (s.table_ida <= ls.max_session_id))
                     ->  Bitmap Index Scan on table_ida_pk  (cost=0.00..6.16 rows=216 width=0)
                           Index Cond: ((s.table_ida > ls.summarized_id) AND (s.table_ida <= ls.max_session_id))
         ->  Index Scan using tableb_unq on tableb sfb  (cost=0.00..0.05 rows=1 width=21)
               Index Cond: ((s.table_ida = sfb.table_ida) AND (sfb.direction = 'b'::bpchar))
   ->  Index Scan using tableb_unq on tableb sf  (cost=0.00..0.05 rows=1 width=21)
         Index Cond: ((s.table_ida = sf.table_ida) AND (sf.direction = 'a'::bpchar))
(13 rows)


            regards, tom lane

Re: Index usage

От
Scott Matseas
Дата:
Tom Lane wrote:
> What other planner parameters have you been fooling with?
Hi Tom,
The other parameters that have been changed are:
set join_collapse_limit to 1
set enable_sort to off

We are using version 8.1.3. We've noticed the query plan changing depending
on the amount of data in the tables especially when the query looks at
more rows
in tableA. The parameter work_mem is set to 262,144.

Thanks,
Scott