Обсуждение: Interesting query plan change linked to the LIMIT parameter

Поиск
Список
Период
Сортировка

Interesting query plan change linked to the LIMIT parameter

От
"Yannick Le Guédart"
Дата:
Greetings,

I'm experiencing a strange query plan change on a "simple" request, based on the LIMIT parameter. I present here two tables, named _article and _comment. they are part of a much larger database.

It's a tree-like database (making use of the ltree to keep the coherence), and in our case, an article can have any number of comments (linked by _article.id = _comment.parent_id), and a _comment can itself have ONE comment (at most).

The _article table contains 12 millions t-uples, and the _comment table around 26 millions. The server runs a postgresql 8.3.5 in its 64bits version.

Here are the tables definition :

-- _article table

CREATE TABLE "ob2"."_article" (
    "id" bigint NOT NULL DEFAULT nextval('element_id_sequence'::regclass),
    "parent_id" bigint,
    "path" ltree,
    "data" text,
    "date_creation" timestamp without time zone NOT NULL DEFAULT now(),
    "date_publishing" timestamp without time zone NOT NULL DEFAULT now(),
    "date_modification" timestamp without time zone NOT NULL DEFAULT now(),
    "counters" hstore,
    "reference" integer NOT NULL DEFAULT nextval('_article_reference_seq'::regclass),
    "title" character varying NOT NULL,
    "text" text,
    "blog_id" bigint,
    "user_id" bigint,
    "site_id" bigint,
    "topic_id" bigint,
    "community_id" bigint,
    CONSTRAINT "_article_pkey" PRIMARY KEY (id)
) WITHOUT OIDS;
ALTER TABLE ONLY "ob2"."_article" ALTER COLUMN "path" SET STORAGE PLAIN;
ALTER TABLE ONLY "ob2"."_article" ALTER COLUMN "title" SET STORAGE PLAIN;

-- Indexes

CREATE UNIQUE INDEX _article_pkey ON _article USING btree (id);
CREATE INDEX gist_idx_article_path ON _article USING gist (path);
CREATE INDEX idx_article_blog_id ON _article USING btree (blog_id);
CREATE INDEX idx_article_community_id ON _article USING btree (community_id);
CREATE INDEX idx_article_date_creation ON _article USING btree (date_creation);
CREATE INDEX idx_article_date_modification ON _article USING btree (date_modification);
CREATE INDEX idx_article_date_publishing ON _article USING btree (date_publishing);
CREATE INDEX idx_article_parent_id ON _article USING btree (parent_id);
CREATE UNIQUE INDEX idx_article_reference_unique ON _article USING btree (reference);
CREATE INDEX idx_article_site_id ON _article USING btree (site_id);
CREATE INDEX idx_article_topic_id ON _article USING btree (topic_id);
CREATE INDEX idx_article_user_id ON _article USING btree (user_id);

-- _comment table

CREATE TABLE "ob2"."_comment" (
    "id" bigint NOT NULL DEFAULT nextval('element_id_sequence'::regclass),
    "parent_id" bigint,
    "path" ltree,
    "data" text,
    "date_creation" timestamp without time zone NOT NULL DEFAULT now(),
    "date_publishing" timestamp without time zone NOT NULL DEFAULT now(),
    "date_modification" timestamp without time zone NOT NULL DEFAULT now(),
    "counters" hstore,
    "reference" integer NOT NULL DEFAULT nextval('_comment_reference_seq'::regclass),
    "text" text,
    "article_id" bigint,
    "blog_id" bigint,
    "user_id" bigint,
    "site_id" bigint,
    CONSTRAINT "_comment_pkey" PRIMARY KEY (id)
) WITHOUT OIDS;
ALTER TABLE ONLY "ob2"."_comment" ALTER COLUMN "path" SET STORAGE PLAIN;

-- Indexes

CREATE UNIQUE INDEX _comment_pkey ON _comment USING btree (id);
CREATE INDEX gist_idx_comment_path ON _comment USING gist (path);
CREATE INDEX idx_comment_date_creation ON _comment USING btree (date_creation);
CREATE INDEX idx_comment_date_publishing ON _comment USING btree (date_publishing);
CREATE INDEX idx_comment_parent_id ON _comment USING btree (parent_id);
CREATE INDEX idx_comment_reference ON _comment USING btree (reference);

Now I created a function to get simply the comment reply to a given comment :

CREATE OR REPLACE FUNCTION get_comment_response (BIGINT) RETURNS _comment AS
$$
    SELECT * FROM _comment WHERE parent_id = $1;
$$
    STABLE
    COST 1
    LANGUAGE SQL;

Ok, now, all is set. I'd like to get with a simple query the comments of a given article, ordered by publishing date, as well as their replies if they exists. So I write this request :
SELECT
    _comment.id,

    (get_comment_response(_comment.id)).id AS r_id
FROM   _comment
INNER JOIN _article
    ON _article.id = _comment.parent_id
WHERE  _comment.parent_id = '17355952'
ORDER BY _comment.date_publishing ASC
OFFSET 0
LIMIT 10;
Results are good, quite fast, BUT, when executing tests I discovered something very strange. The query was fast for 3+ comments, but very slow with a limit of 1 or 2 ! Just because the query plan change :

EXPLAIN
SELECT _comment.id,
        (get_comment_response(_comment.id)).id AS r_id
FROM   _comment
INNER JOIN _article
        ON _article.id = _comment.parent_id
WHERE  _comment.parent_id = '17355952'
ORDER BY _comment.id ASC
OFFSET 0
LIMIT 1000;

                                                  QUERY PLAN                                                  
---------------------------------------------------------------------------------------------------------------
 Limit  (cost=10261.19..10263.69 rows=1000 width=8)
   ->  Sort  (cost=10261.19..10281.06 rows=7949 width=8)
         Sort Key: _comment.id
         ->  Nested Loop  (cost=0.00..9825.35 rows=7949 width=8)
               ->  Index Scan using _article_pkey on _article  (cost=0.00..9.55 rows=1 width=8)
                     Index Cond: (id = 17355952::bigint)
               ->  Index Scan using idx_comment_parent_id on _comment  (cost=0.00..9716.44 rows=7949 width=16)
                     Index Cond: (_comment.parent_id = 17355952::bigint)
(8 rows)

EXPLAIN
SELECT _comment.id,
        (get_comment_response(_comment.id)).id AS r_id
FROM   _comment
INNER JOIN _article
        ON _article.id = _comment.parent_id
WHERE  _comment.parent_id = '17355952'
ORDER BY _comment.id ASC
OFFSET 0
LIMIT 1;
                                             QUERY PLAN                                             
-----------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..3588.42 rows=1 width=8)
   ->  Nested Loop  (cost=0.00..28524312.40 rows=7949 width=8)
         ->  Index Scan using _comment_pkey on _comment  (cost=0.00..28448324.73 rows=7949 width=16)
               Filter: (parent_id = 17355952::bigint)
         ->  Index Scan using _article_pkey on _article  (cost=0.00..9.55 rows=1 width=8)
               Index Cond: (_article.id = 17355952::bigint)
(6 rows)

The second query scans the whole comment table which is very dangerous for production servers.

So did I do something wrong ? Is there a way to handle this issue smoothly ?


Thanks in advance


Yannick

Re: Interesting query plan change linked to the LIMIT parameter

От
"David Wilson"
Дата:
On Tue, Jan 20, 2009 at 10:45 AM, Yannick Le Guédart
<yannick@over-blog.com> wrote:

>
> The second query scans the whole comment table which is very dangerous for
> production servers.

That's not quite true. The second does an index scan- the planner
seems to be guessing that it'll fulfill the required limit early in
the index scan; only with a pathologically bad case would it actually
have to scan the entire thing. Basically, the second query is
optimized to spit out the first few rows quickly, since that's all you
asked for with the limit.

Note that your first query has a final cost estimate of "Limit
(cost=10261.19..10263.69 rows=1000 width=8)", indicating an estimated
10261.19 to emit the first row; the second has "Limit
(cost=0.00..3588.42 rows=1 width=8)" estimating 0.00 (basically,
instant) to emit the first - and only desired - row.

That all said, an explain analyze would give us a better idea of
what's going on- we can't tell if the planner is making bad estimates
without the knowledge of what the real timing and row count results of
plan stages were.


--
- David T. Wilson
david.t.wilson@gmail.com

Re: Interesting query plan change linked to the LIMIT parameter

От
Yannick Le Guédart
Дата:
Thanks for the rapid response.

I can understand the way the planner makes its guess, but as a matter of fact, he'll be nearly always wrong, just becausethe most commented articles have only around 5000 or so comments.  I ran the explain  analyze tonight and got this results :

EXPLAIN ANALYZE SELECT _comment.id,
        (get_comment_response(_comment.id)).id AS r_id
FROM   _comment
INNER JOIN _article
        ON _article.id = _comment.parent_id
WHERE  _comment.parent_id = '17355952'
ORDER BY _comment.id ASC
OFFSET 0
LIMIT 1;
                                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..3588.42 rows=1 width=8) (actual time=498597.115..498597.116 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..28524312.40 rows=7949 width=8) (actual time=498597.114..498597.114 rows=1 loops=1)
         ->  Index Scan using _comment_pkey on _comment  (cost=0.00..28448324.73 rows=7949 width=16) (actual time=498473.360..498473.360 rows=1 loops=1)
               Filter: (parent_id = 17355952::bigint)
         ->  Index Scan using _article_pkey on _article  (cost=0.00..9.55 rows=1 width=8) (actual time=63.465..63.465 rows=1 loops=1)
               Index Cond: (_article.id = 17355952::bigint)
 Total runtime: 498615.230 ms
(7 rows)

EXPLAIN ANALYZE SELECT _comment.id,
        (get_comment_response(_comment.id)).id AS r_id
FROM   _comment
INNER JOIN _article
        ON _article.id = _comment.parent_id
WHERE  _comment.parent_id = '17355952'
ORDER BY _comment.id ASC
OFFSET 0
LIMIT 1000;
                                                                         QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=10261.19..10263.69 rows=1000 width=8) (actual time=127.037..127.267 rows=1000 loops=1)
   ->  Sort  (cost=10261.19..10281.06 rows=7949 width=8) (actual time=127.036..127.128 rows=1000 loops=1)
         Sort Key: _comment.id
         Sort Method:  top-N heapsort  Memory: 95kB
         ->  Nested Loop  (cost=0.00..9825.35 rows=7949 width=8) (actual time=0.472..122.986 rows=4674 loops=1)
               ->  Index Scan using _article_pkey on _article  (cost=0.00..9.55 rows=1 width=8) (actual time=0.011..0.013 rows=1 loops=1)
                     Index Cond: (id = 17355952::bigint)
               ->  Index Scan using idx_comment_parent_id on _comment  (cost=0.00..9716.44 rows=7949 width=16) (actual time=0.235..32.869 rows=4674 loops=1)
                     Index Cond: (_comment.parent_id = 17355952::bigint)
 Total runtime: 127.410 ms
(10 rows)

As you can see, the time is dramaticaly longuer with the LIMIT 1 (or in our case, LIMIT 2).

Yannick.

2009/1/20 David Wilson <david.t.wilson@gmail.com>
On Tue, Jan 20, 2009 at 10:45 AM, Yannick Le Guédart
<yannick@over-blog.com> wrote:

>
> The second query scans the whole comment table which is very dangerous for
> production servers.

That's not quite true. The second does an index scan- the planner
seems to be guessing that it'll fulfill the required limit early in
the index scan; only with a pathologically bad case would it actually
have to scan the entire thing. Basically, the second query is
optimized to spit out the first few rows quickly, since that's all you
asked for with the limit.

Note that your first query has a final cost estimate of "Limit
(cost=10261.19..10263.69 rows=1000 width=8)", indicating an estimated
10261.19 to emit the first row; the second has "Limit
(cost=0.00..3588.42 rows=1 width=8)" estimating 0.00 (basically,
instant) to emit the first - and only desired - row.

That all said, an explain analyze would give us a better idea of
what's going on- we can't tell if the planner is making bad estimates
without the knowledge of what the real timing and row count results of
plan stages were.


--
- David T. Wilson
david.t.wilson@gmail.com