Re: bad plan and LIMIT

Поиск
Список
Период
Сортировка
От James Nelson
Тема Re: bad plan and LIMIT
Дата
Msg-id A93636A7-4607-4F92-A29A-EAB7461AC675@photoshelter.com
обсуждение исходный текст
Ответ на bad plan and LIMIT  (James Nelson <james@photoshelter.com>)
Список pgsql-performance
I had tried using exists but both the forms of the query (with limit
and without) performed much worse.

    James

On May 1, 2009, at 4:22 AM, Adam Ruth wrote:

> You could try changing the IN to an EXISTS, that may alter how the
> optimizer weighs the limit.
>
>
> SELECT ID FROM ps_image WHERE EXISTS (SELECT null FROM
> ps_gallery_image WHERE gallery_id ='G00007ejKGoWS_cY' and image_id =
> ps_image.id) ORDER BY LOWER(FILE_NAME) ASC
>
> On 30/04/2009, at 3:51 AM, James Nelson wrote:
>
>>
>> Hi, I'm hoping you guys can help with improving this query I'm
>> having a problem with. The main problem is that the query plan
>> changes depending on the value of the LIMIT clause, with small
>> values using a poor plan and running very slowly. The two times are
>> roughly 5 minutes for the bad plan and 1.5 secs for the good plan.
>>
>> I have read a little about how the query planner takes into account
>> the limit clause, and I can see the effect this has on the costs
>> shown by explain. The problem is that the estimated cost ends up
>> being wildly inaccurate. I'm not sure if this a problem with the
>> planner or if it is something I am doing wrong on my end.
>>
>> the query (without the limit clause):
>>
>> SELECT ID FROM ps_image WHERE id IN (SELECT image_id FROM
>> ps_gallery_image WHERE gallery_id='G00007ejKGoWS_cY') ORDER BY
>> LOWER(FILE_NAME) ASC
>>
>> The ps_image table has about 24 million rows, ps_gallery_image has
>> about 14 million. The query above produces roughly 50 thousand rows.
>>
>> When looking at the explain with the limit, I can see the
>> interpolation that the planner does for the limit node (arriving at
>> a final cost of 458.32 for this example) but not sure why it is
>> inaccurate compared to the actual times.
>>
>> Thanks in advance for taking a look at this, let me know if there
>> is additional information I should provide.
>>
>> Some information about the tables  and the explains follow below.
>>
>> James Nelson
>>
>> [james@db2 ~] psql --version
>> psql (PostgreSQL) 8.3.5
>> contains support for command-line editing
>>
>> photoshelter=# \d ps_image
>>                             Table "public.ps_image"
>>  Column     |           Type           |                 Modifiers
>> ---------------+--------------------------
>> +-------------------------------------------
>> id            | character varying(16)    | not null
>> user_id       | character varying(16)    |
>> album_id      | character varying(16)    | not null
>> parent_id     | character varying(16)    |
>> file_name     | character varying(200)   |
>> file_size     | bigint                   |
>> .... 20 rows snipped ....
>> Indexes:
>>  "ps_image_pkey" PRIMARY KEY, btree (id)
>>  "i_file_name_l" btree (lower(file_name::text))
>> .... indexes, fk constraints and triggers snipped ....
>>
>> photoshelter=# \d ps_gallery_image
>>                Table "public.ps_gallery_image"
>>  Column     |           Type           |       Modifiers
>> ---------------+--------------------------+------------------------
>> gallery_id    | character varying(16)    | not null
>> image_id      | character varying(16)    | not null
>> display_order | integer                  | not null default 0
>> caption       | character varying(2000)  |
>> ctime         | timestamp with time zone | not null default now()
>> mtime         | timestamp with time zone | not null default now()
>> id            | character varying(16)    | not null
>> Indexes:
>>  "ps_gallery_image_pkey" PRIMARY KEY, btree (id)
>>  "gi_gallery_id" btree (gallery_id)
>>  "gi_image_id" btree (image_id)
>> Foreign-key constraints:
>>  "ps_gallery_image_gallery_id_fkey" FOREIGN KEY (gallery_id)
>> REFERENCES ps_gallery(id) ON DELETE CASCADE
>>  "ps_gallery_image_image_id_fkey" FOREIGN KEY (image_id) REFERENCES
>> ps_image(id) ON DELETE CASCADE
>> Triggers:
>>  ps_image_gi_sync AFTER INSERT OR DELETE OR UPDATE ON
>> ps_gallery_image FOR EACH ROW EXECUTE PROCEDURE ps_image_sync()
>>
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =====================================================================
>> explain analyze for bad plan
>>
>> photoshelter=# explain  analyze SELECT ID FROM ps_image WHERE id IN
>> (SELECT image_id FROM ps_gallery_image WHERE
>> gallery_id='G00007ejKGoWS_cY') ORDER BY LOWER(FILE_NAME) ASC limit 1;
>>                                                                       QUERY
>>  PLAN
>>
-------------------------------------------------------------------------------------------------------------------------------------------------------------
>> Limit  (cost=0.00..458.32 rows=1 width=36) (actual
>> time=709831.847..709831.847 rows=1 loops=1)
>> ->  Nested Loop IN Join  (cost=0.00..17700128.78 rows=38620
>> width=36) (actual time=709831.845..709831.845 rows=1 loops=1)
>>       ->  Index Scan using i_file_name_l on ps_image
>> (cost=0.00..1023863.22 rows=24460418 width=36) (actual
>> time=0.063..271167.293 rows=8876340 loops=1)
>>       ->  Index Scan using gi_image_id on ps_gallery_image
>> (cost=0.00..0.85 rows=1 width=17) (actual time=0.048..0.048 rows=0
>> loops=8876340)
>>             Index Cond: ((ps_gallery_image.image_id)::text =
>> (ps_image.id)::text)
>>             Filter: ((ps_gallery_image.gallery_id)::text =
>> 'G00007ejKGoWS_cY'::text)
>> Total runtime: 709831.932 ms
>>
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =
>> =====================================================================
>> explain analyze for good plan
>>
>> photoshelter=# explain  analyze SELECT ID FROM ps_image WHERE id IN
>> (SELECT image_id FROM ps_gallery_image WHERE
>> gallery_id='G00007ejKGoWS_cY') ORDER BY LOWER(FILE_NAME) ASC limit
>> 600;
>>                                                                            QUERY
>>  PLAN
>>
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> Limit  (cost=154650.99..154652.49 rows=600 width=36) (actual
>> time=1886.038..1886.404 rows=600 loops=1)
>> ->  Sort  (cost=154650.99..154747.54 rows=38619 width=36) (actual
>> time=1886.038..1886.174 rows=600 loops=1)
>>       Sort Key: (lower((ps_image.file_name)::text))
>>       Sort Method:  top-N heapsort  Memory: 75kB
>>       ->  Nested Loop  (cost=42394.02..152675.86 rows=38619
>> width=36) (actual time=135.132..1838.491 rows=50237 loops=1)
>>             ->  HashAggregate  (cost=42394.02..42780.21 rows=38619
>> width=17) (actual time=135.079..172.563 rows=50237 loops=1)
>>                   ->  Index Scan using gi_gallery_id on
>> ps_gallery_image  (cost=0.00..42271.79 rows=48891 width=17) (actual
>> time=0.063..97.539 rows=50237 loops=1)
>>                         Index Cond: ((gallery_id)::text =
>> 'G00007ejKGoWS_cY'::text)
>>             ->  Index Scan using ps_image_pkey on ps_image
>> (cost=0.00..2.83 rows=1 width=36) (actual time=0.031..0.031 rows=1
>> loops=50237)
>>                   Index Cond: ((ps_image.id)::text =
>> (ps_gallery_image.image_id)::text)
>> Total runtime: 1886.950 ms
>>
>>
>>
>>
>>
>>
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org
>> )
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>


В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: bad plan and LIMIT
Следующее
От: James Nelson
Дата:
Сообщение: Re: bad plan and LIMIT