Stored proc and optimizer question

Поиск
Список
Период
Сортировка
От Antal Attila
Тема Stored proc and optimizer question
Дата
Msg-id 43F3251F.9080901@ritek.hu
обсуждение исходный текст
Список pgsql-performance
Hi!

I have a question about the query optimizer and the function scan. See
the next case:

CREATE TABLE a (id SERIAL PRIMARY KEY, userid INT4, col  TEXT);
CREATE TABLE b (id SERIAL PRIMARY KEY, userid INT4, a_id INT4 REFERENCES
a (id), col  TEXT);
CREATE INDEX idx_a_uid ON a(userid);
CREATE INDEX idx_b_uid ON b(userid);
CREATE INDEX idx_a_col ON a(col);
CREATE INDEX idx_b_col ON b(col);

First solution:

    CREATE VIEW ab_view AS
        SELECT a.id AS id,
                       a.userid AS userid_a, b.userid AS userid_b,
                       a.col AS col_a, b.col AS col_b
        FROM a LEFT JOIN b ON (a.id = b.a_id);

    EXPLAIN ANALYSE SELECT * FROM ab_view
        WHERE userid_a = 23 AND userid_b = 23 AND col_a LIKE 's%'
        ORDER BY col_b
        LIMIT 10 OFFSET 10;

                                                                QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=15.70..15.70 rows=1 width=76) (actual time=0.108..0.108
rows=0 loops=1)
   ->  Sort  (cost=15.69..15.70 rows=1 width=76) (actual
time=0.104..0.104 rows=0 loops=1)
         Sort Key: b.col
         ->  Nested Loop  (cost=3.32..15.68 rows=1 width=76) (actual
time=0.085..0.085 rows=0 loops=1)
               Join Filter: ("outer".id = "inner".a_id)
               ->  Bitmap Heap Scan on a  (cost=2.30..6.13 rows=1
width=40) (actual time=0.082..0.082 rows=0 loops=1)
                     Recheck Cond: (userid = 23)
                     Filter: (col ~~ 's%'::text)
                     ->  BitmapAnd  (cost=2.30..2.30 rows=1 width=0)
(actual time=0.077..0.077 rows=0 loops=1)
                           ->  Bitmap Index Scan on idx_a_uid
(cost=0.00..1.02 rows=6 width=0) (actual time=0.075..0.075 rows=0 loops=1)
                                 Index Cond: (userid = 23)
                           ->  Bitmap Index Scan on idx_a_col
(cost=0.00..1.03 rows=6 width=0) (never executed)
                                 Index Cond: ((col >= 's'::text) AND
(col < 't'::text))
               ->  Bitmap Heap Scan on b  (cost=1.02..9.49 rows=5
width=40) (never executed)
                     Recheck Cond: (userid = 23)
                     ->  Bitmap Index Scan on idx_b_uid
(cost=0.00..1.02 rows=5 width=0) (never executed)
                           Index Cond: (userid = 23)
 Total runtime: 0.311 ms


In the first solution the query optimizer can work on the view and the
full execution of the query will be optimal. But I have to use 2
condition for the userid fields (userid_a = 23 AND userid_b = 23 ). If I
have to eliminate the duplication I can try to use stored function.

Second solution:
    CREATE FUNCTION ab_select(INT4)  RETURNS setof ab_view AS $$
        SELECT a.id AS id,
                       a.userid AS userid_a, b.userid AS userid_b,
                       a.col AS col_a, b.col AS col_b
        FROM a LEFT JOIN b ON (a.id = b.a_id AND b.userid = $1)
        WHERE a.userid = $1;
    $$ LANGUAGE SQL STABLE;

    EXPLAIN ANALYSE SELECT * FROM ab_select(23)
        WHERE col_a LIKE 's%'
        ORDER BY col_b
        LIMIT 10 OFFSET 10;

                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Limit  (cost=15.07..15.07 rows=1 width=76) (actual time=1.034..1.034
rows=0 loops=1)
   ->  Sort  (cost=15.06..15.07 rows=5 width=76) (actual
time=1.030..1.030 rows=0 loops=1)
         Sort Key: col_b
         ->  Function Scan on ab_select  (cost=0.00..15.00 rows=5
width=76) (actual time=1.004..1.004 rows=0 loops=1)
               Filter: (col_a ~~ 's%'::text)
 Total runtime: 1.103 ms

The second solution have 2 advantage:
  1. The second query is more beautiful and shorter.
  2. You can rewrite easier the stored function without modify the query.

But I have heartache, because the optimizer give up the game. It cannot
optimize the query globally (inside and outside the stored function) in
spite of the STABLE keyword. It use function scan on the result of the
stored function.

How can I eliminate the function scan while I want to keep the advantages?

In my opinion the optimizer cannot replace the function scan with a more
optimal plan, but this feature may be implemented in the next versions
of  PostgreSQL. I would like to suggest this.

I built this case theoretically, but I have more stored procedure which
works with bad performance therefore.

Regards,
Antal Attila

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Gregory Maxwell"
Дата:
Сообщение: Re: [HACKERS] qsort again (was Re: Strange Create
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Index Choice Problem