Обсуждение: How can I speed up this search?

Поиск
Список
Период
Сортировка

How can I speed up this search?

От
Oliver Elphick
Дата:
I am searching on a FULL JOIN of two tables (using 7.2.1).  VACUUM
ANALYZE has been done.

If I search on either of them separately, the search uses the
appropriate index, but when they are joined, the second table uses a
sequential scan and takes far longer than (I think) it ought.  Can I
reshape the query or add additional indexes to improve the planner's
guess?

    bray=# explain analyze  SELECT '' AS van,
    bray-#        COALESCE(a.year, f.year) AS year,
    bray-#        COALESCE(a.month, f.month) AS month,
    bray-#        COALESCE(f.qty,0) AS fc_qty, COALESCE(a.qty,0) AS
    qty,        COALESCE(a.home_inv_qty,0) AS home_inv_qty,
    bray-#        COALESCE(a.exp_inv_qty,0) AS exp_inv_qty,
    bray-#        COALESCE(f.revenue,0) AS revenue,
    bray-#        COALESCE(a.ccy, 'GBP') AS ccy,
    bray-#        COALESCE(a.val, 0.) AS val,
    bray-#        COALESCE(a.home_inv_val, 0.) AS home_inv_val,
    bray-#        COALESCE(a.exp_inv_val, 0.) AS exp_inv_val
    bray-#   FROM stock_allocation AS a
    bray-#        FULL JOIN sales_forecast AS f
    bray-#          ON a.product = f.product AND
    bray-#             a.year = f.year AND
    bray-#             a.month = f.month AND
    bray-#             a.ccy = f.ccy
    bray-#   WHERE a.product = 'CC009' OR f.product = 'CC009';
    NOTICE:  QUERY PLAN:

    Merge Join  (cost=49104.90..51498.34 rows=241423 width=101) (actual
    time=31518.32..33565.86 rows=48 loops=1)
      ->  Index Scan using sales_forecast_pkey on sales_forecast f
    (cost=0.00..1543.00 rows=27528 width=40) (actual time=0.27..278.38
    rows=27528 loops=1)
      ->  Sort  (cost=49104.90..49104.90 rows=241423 width=61) (actual
    time=29576.37..30185.22 rows=241423 loops=1)
            ->  Seq Scan on stock_allocation a  (cost=0.00..5580.23
    rows=241423 width=61) (actual time=0.07..2772.26 rows=241423
    loops=1)
    Total runtime: 38289.59 msec

    EXPLAIN
    bray=# \d stock_allocation
                     Table "stock_allocation"
        Column    |         Type          |     Modifiers
    --------------+-----------------------+--------------------
     product      | character varying(10) | not null
     year         | smallint              | not null
     month        | smallint              | not null
     ccy          | character(3)          | not null
     qty          | integer               | not null default 0
     val          | numeric(12,2)         | default 0
     home_inv_qty | integer               | default 0
     home_inv_val | numeric(12,2)         | default 0
     exp_inv_qty  | integer               | default 0
     exp_inv_val  | numeric(12,2)         | default 0
    Indexes: product_currency_index
    Primary key: stock_allocation_pkey
    Check constraints: "month in range" (("month" >= 1) AND
    (((float8("year") < date_part('year'::text,
    ('now'::text)::timestamp(6) with time zone)) AND ("month" <= 12)) OR
    ((float8("year") = date_part('year'::text,
    ('now'::text)::timestamp(6) with time zone)) AND (float8("month") <=
    date_part('month'::text, ('now'::text)::timestamp(6) with time
    zone)))))
                       "year in range" (("year" > 1987) AND
    (float8("year") <= date_part('year'::text,
    ('now'::text)::timestamp(6) with time zone)))
    Triggers: RI_ConstraintTrigger_26246616,
              RI_ConstraintTrigger_26246610

    bray=# \d stock_allocation_pkey
      Index "stock_allocation_pkey"
     Column  |         Type
    ---------+-----------------------
     product | character varying(10)
     year    | smallint
     month   | smallint
     ccy     | character(3)
    unique btree (primary key)

    bray=# \d product_currency_index
     Index "product_currency_index"
     Column  |         Type
    ---------+-----------------------
     product | character varying(10)
     ccy     | character(3)
    btree


    bray=# explain analyze select * from stock_allocation where product
    = 'CC009';
    NOTICE:  QUERY PLAN:

    Index Scan using product_currency_index on stock_allocation
    (cost=0.00..204.90 rows=68 width=61) (actual time=0.43..1.33 rows=48
    loops=1)
    Total runtime: 1.54 msec

    EXPLAIN
    bray=# \d sales_forecast
               Table "sales_forecast"
     Column  |         Type          | Modifiers
    ---------+-----------------------+-----------
     product | character varying(10) | not null
     year    | integer               | not null
     month   | integer               | not null
     ccy     | character(3)          | not null
     qty     | integer               | not null
     revenue | numeric(12,2)         | not null
    Primary key: sales_forecast_pkey
    Check constraints: "must be sterling" (ccy = 'GBP'::bpchar)
                       "month in range" (("month" >= 1) AND ("month" <=
    12))
                       "year in range" (("year" > 2001) AND
    (float8("year") < (date_part('year'::text, date('now'::text)) + 3)))
    Triggers: RI_ConstraintTrigger_26246754

    bray=# \d sales_forecast_pkey
       Index "sales_forecast_pkey"
     Column  |         Type
    ---------+-----------------------
     product | character varying(10)
     year    | integer
     month   | integer
     ccy     | character(3)
    unique btree (primary key)


    bray=# explain select * from sales_forecast where product = 'CC009';
    NOTICE:  QUERY PLAN:

    Index Scan using sales_forecast_pkey on sales_forecast
    (cost=0.00..92.57 rows=25 width=40)

    EXPLAIN


Forcing use of the index in the join does not help; it takes slightly
longer, so the planner is correct.  With ENABLE_SEQSCAN off:

    NOTICE:  QUERY PLAN:

    Merge Join  (cost=544454.29..546847.73 rows=241423 width=101)
    (actual time=31523.99..33582.48 rows=48 loops=1)
      ->  Index Scan using sales_forecast_pkey on sales_forecast f
    (cost=0.00..1543.00 rows=27528 width=40) (actual time=0.26..283.75
    rows=27528 loops=1)
      ->  Sort  (cost=544454.29..544454.29 rows=241423 width=61) (actual
    time=29593.48..30177.42 rows=241423 loops=1)
            ->  Index Scan using product_currency_index on
    stock_allocation a  (cost=0.00..500929.62 rows=241423 width=61)
    (actual time=35.60..3539.31 rows=241423 loops=1)
    Total runtime: 38485.57 msec

    EXPLAIN

It seems to me that something is being done in the wrong order, since
there are actually 25 rows to be joined with 48, but the planner is
expecting to join 27528 rows with 241423.  Should it not plan to do the
selection before doing the join?  Or what am I missing?

--
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C

     "Dearly beloved, avenge not yourselves, but rather give
      place unto wrath. For it is written, Vengeance is
      mine; I will repay, saith the Lord. Therefore if thine
      enemy hunger, feed him; if he thirst, give him drink;
      for in so doing thou shalt heap coals of fire on his
      head. Be not overcome of evil, but overcome evil with
      good."      Romans 12:19-21

Вложения

Re: How can I speed up this search?

От
Tom Lane
Дата:
Oliver Elphick <olly@lfix.co.uk> writes:
> I am searching on a FULL JOIN of two tables (using 7.2.1).  VACUUM
> ANALYZE has been done.

> If I search on either of them separately, the search uses the
> appropriate index, but when they are joined, the second table uses a
> sequential scan and takes far longer than (I think) it ought.  Can I
> reshape the query or add additional indexes to improve the planner's
> guess?

I am guessing that the problem has something to do with the fact that
the indexed columns are integer in one table and smallint in the other.
Yet in my tests both 7.2 and current sources can figure out how to do an
indexed mergejoin between integer and smallint columns.  Strange that
it's not doing that for you.

Can you try (a) remaking the tables with identical column types;
(b) removing the product_currency_index index to see if the planner
will use the other index when it has no choice?

            regards, tom lane