Query optimization problem

От: Zotov
Тема: Query optimization problem
Дата: ,
Msg-id: 4C3EA6CB.4070502@oe-it.ru
(см: обсуждение, исходный текст)
Ответы: Re: Query optimization problem  (Yeb Havinga)
Список: pgsql-performance

I have a query:

  SELECT d1.ID, d2.ID
  FROM DocPrimary d1
    JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
  WHERE (d1.ID=234409763) or (d2.ID=234409763)

i think what QO(Query Optimizer) can make it faster (now it seq scan and on
million records works 7 sec)

  SELECT d1.ID, d2.ID
  FROM DocPrimary d1
    JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
  WHERE (d2.BasedOn=234409763) or (d2.ID=234409763)


  ----------------------
  Slow Query
  ----------------------
  test=# EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS off )SELECT d1.ID,
  d2.ID
  test-# FROM DocPrimary d1
  test-#   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
  test-# WHERE (d1.ID=234409763) or (d2.ID=234409763);
                                                          QUERY PLAN

------------------------------------------------------------------------------------------------------------------------
    Hash Join  (cost=58.15..132.35 rows=2 width=8) (actual time=0.007..0.007
  rows=0 loops=1)
      Output: d1.id, d2.id
      Hash Cond: (d2.basedon = d1.id)
      Join Filter: ((d1.id = 234409763) OR (d2.id = 234409763))
      ->   Seq Scan on public.docprimary d2  (cost=0.00..31.40 rows=2140
  width=8) (actual time=0.002..0.002 rows=0 loops=1)
            Output: d2.id, d2.basedon
      ->   Hash  (cost=31.40..31.40 rows=2140 width=4) (never executed)
            Output: d1.id
            ->   Seq Scan on public.docprimary d1  (cost=0.00..31.40 rows=2140
  width=4) (never executed)
                  Output: d1.id

  ------------------
  Fast Query
  ------------------
  test=# EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS off )SELECT d1.ID,
  d2.ID
  test-# FROM DocPrimary d1
  test-#   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
  test-# WHERE (d2.BasedOn=234409763) or (d2.ID=234409763);
                                                              QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------
    Nested Loop  (cost=8.60..58.67 rows=12 width=8) (actual time=0.026..0.026
  rows=0 loops=1)
      Output: d1.id, d2.id
      ->   Bitmap Heap Scan on public.docprimary d2  (cost=8.60..19.31 rows=12
  width=8) (actual time=0.023..0.023 rows=0 loops=1)
            Output: d2.id, d2.basedon
            Recheck Cond: ((d2.basedon = 234409763) OR (d2.id = 234409763))
            ->   BitmapOr  (cost=8.60..8.60 rows=12 width=0) (actual
  time=0.018..0.018 rows=0 loops=1)
                  ->   Bitmap Index Scan on basedon_idx  (cost=0.00..4.33
  rows=11 width=0) (actual time=0.008..0.008 rows=0 loops=1)
                        Index Cond: (d2.basedon = 234409763)
                  ->   Bitmap Index Scan on id_pk  (cost=0.00..4.26 rows=1
  width=0) (actual time=0.003..0.003 rows=0 loops=1)
                        Index Cond: (d2.id = 234409763)
      ->   Index Scan using id_pk on public.docprimary d1  (cost=0.00..3.27
  rows=1 width=4) (never executed)
            Output: d1.id, d1.basedon
            Index Cond: (d1.id = d2.basedon)


--------------------------------------------
PGver: PostgreSQL 9.0b x86
OS: Win7 x64

---------------------
Create table query:
---------------------

CREATE TABLE docprimary
(
   id integer NOT NULL,
   basedon integer,
   CONSTRAINT id_pk PRIMARY KEY (id)
);
CREATE INDEX basedon_idx
   ON docprimary
   USING btree
   (basedon);



В списке pgsql-performance по дате сообщения:

От: Yeb Havinga
Дата:
Сообщение: Re: Query optimization problem
От: Patrick Donlin
Дата:
Сообщение: Identical query slower on 8.4 vs 8.3