functional index not used, looping simpler query just faster

Поиск
Список
Период
Сортировка
От Ivan Sergio Borgonovo
Тема functional index not used, looping simpler query just faster
Дата
Msg-id 20080710114040.33f8c07c@dawn.webthatworks.it
обсуждение исходный текст
Ответы Re: functional index not used, looping simpler query just faster  (Martijn van Oosterhout <kleptog@svana.org>)
Re: functional index not used, looping simpler query just faster  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
I've this:

CREATE TABLE catalog_brands
(
  brandid serial NOT NULL,
  "name" character varying(64) NOT NULL,
  delivery smallint NOT NULL DEFAULT (24 * 15),
  deliverymessage character varying(64),
  brandtypeid integer,
  brandgroupid integer,
  CONSTRAINT catalog_brands_pkey PRIMARY KEY (brandid),
  CONSTRAINT catalog_brands_brandgroupid_fkey FOREIGN KEY
(brandgroupid) REFERENCES catalog_brandgroup (brandgroupid) MATCH
SIMPLE ON UPDATE NO ACTION ON DELETE SET NULL,
  CONSTRAINT catalog_brands_brandtypeid_fkey FOREIGN KEY
(brandtypeid) REFERENCES catalog_brandtype (brandtypeid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE SET NULL
);

CREATE INDEX catalog_brands_name_index
  ON catalog_brands
  USING btree
  (upper(name::text));

CREATE TABLE catalog_items
(
  itemid bigint NOT NULL,
  brand integer NOT NULL,
  name character varying(256) NOT NULL,
/* snip */
  datainserimento timestamp without time zone,
  dapub smallint,
  CONSTRAINT catalog_items_statusid_fkey FOREIGN KEY (statusid)
      REFERENCES catalog_item_status (statusid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
);

CREATE INDEX catalog_items_brands_index
  ON catalog_items
  USING btree
  (upper(brands::text));

CREATE UNIQUE INDEX catalog_items_itemsid_index
  ON catalog_items
  USING btree
  (itemid);
ALTER TABLE catalog_items CLUSTER ON catalog_items_itemsid_index;

catalog_items contains ~ 650K records
catalog_brands 44 records

Now I try this:

explain select i1.brands, i1.name, i1.dataPub, i1.datainserimento
  from catalog_items i1
  inner join catalog_brands b1 on upper(i1.brands)=upper(b1.name)
  where i1.ItemID in (
  select i2.ItemID from catalog_items i2
    inner join catalog_brands b2 on upper(i2.brands)=upper(b2.name)
    where i1.brands=i2.brands
    and i2.dataPub>(now() - interval '8 month') and
i2.datainserimento>(now() - interval '6 month') order by
i2.datainserimento desc limit 3);


And I got this:

"Nested Loop  (cost=0.00..6383568361.87 rows=74378 width=82)"
"  ->  Seq Scan on catalog_brands b1  (cost=0.00..1.44 rows=44 width=18)"
"  ->  Index Scan using catalog_items_brands_index on catalog_items i1  (cost=0.00..145081069.53 rows=1690 width=82)"
"        Index Cond: (upper((i1.brands)::text) = upper(("outer".name)::text))"
"        Filter: (subplan)"
"        SubPlan"
"          ->  Limit  (cost=42906.81..42906.82 rows=1 width=16)"
"                ->  Sort  (cost=42906.81..42906.82 rows=1 width=16)"
"                      Sort Key: i2.datainserimento"
"                      ->  Nested Loop  (cost=0.00..42906.80 rows=1 width=16)"
"                            Join Filter: (upper(("outer".brands)::text) = upper(("inner".name)::text))"
"                            ->  Seq Scan on catalog_items i2  (cost=0.00..42904.59 rows=1 width=34)"
"                                  Filter: ((($0)::text = (brands)::text) AND (datapub > (now() - '8 mons'::interval))
AND(datainserimento > (now() - '6 mons'::interval)))" 
"                            ->  Seq Scan on catalog_brands b2  (cost=0.00..1.44 rows=44 width=18)"

I never waited enough to see results from the above... several
minutes over a 2xXeon 4Gb ram.


A simpler
select name, brands from catalog_items where
upper(brands)=upper('LARGEST GROUP') order by datainserimento desc
limit 3;

finishes in few seconds. Iterating over 44 groups does look to be
much faster than the more complicated query.

"Limit  (cost=9503.62..9503.63 rows=3 width=74)"
"  ->  Sort  (cost=9503.62..9512.08 rows=3381 width=74)"
"        Sort Key: datainserimento"
"        ->  Bitmap Heap Scan on catalog_items  (cost=29.84..9305.44 rows=3381 width=74)"
"              Recheck Cond: (upper((brands)::text) = 'CAMBRIDGE UNIVERSITY PRESS'::text)"
"              ->  Bitmap Index Scan on catalog_items_brands_index  (cost=0.00..29.84 rows=3381 width=0)"
"                    Index Cond: (upper((brands)::text) = 'CAMBRIDGE UNIVERSITY PRESS'::text)"

Even
select count(*), i1.brands from catalog_items i1
    inner join catalog_brands b1 on
upper(b1.name)=upper(i1.brands)
    group by i1.brands order by count(*)

takes from few seconds to less than 1 sec.

I could actually loop inside plpgsql but... well I'd like to
understand how things work.


--
Ivan Sergio Borgonovo
http://www.webthatworks.it


В списке pgsql-general по дате отправления:

Предыдущее
От: "Leif B. Kristensen"
Дата:
Сообщение: Re: User-Defined Variables
Следующее
От: Martijn van Oosterhout
Дата:
Сообщение: Re: functional index not used, looping simpler query just faster