Обсуждение: Need to increase performance of a query

Поиск

Список

Период

Сортировка

Need to increase performance of a query

От

Anne Rosset

Дата:

10 июня 2010 г., 17:50:56

Hi,
I have the following query  that needs tuning:

psrdb=# explain analyze (SELECT
psrdb(#        MAX(item_rank.rank) AS maxRank
psrdb(# FROM
psrdb(#        item_rank item_rank
psrdb(# WHERE
psrdb(#        item_rank.project_id='proj2783'
psrdb(#         AND item_rank.pf_id IS NULL
psrdb(#
psrdb(# )
psrdb-# ORDER BY
psrdb-# maxRank DESC;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------


Sort  (cost=0.19..0.19 rows=1 width=0) (actual time=12.154..12.155
rows=1 loops=1)
  Sort Key: ($0)
  Sort Method:  quicksort  Memory: 17kB
  InitPlan
    ->  Limit  (cost=0.00..0.17 rows=1 width=8) (actual
time=12.129..12.130 rows=1 loops=1)
          ->  Index Scan Backward using item_rank_rank on item_rank
(cost=0.00..2933.84 rows=17558 width=8) (actual time=12.126..12.126
rows=1 loops=1)
                Filter: ((rank IS NOT NULL) AND (pf_id IS NULL) AND
((project_id)::text = 'proj2783'::text))
  ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual
time=12.140..12.142 rows=1 loops=1)
Total runtime: 12.206 ms
(9 rows)

I have been playing with indexes but it seems that it doesn't make any
difference. (I have created an index: item_rank_index" btree
(project_id) WHERE (pf_id IS NULL))


Any advice on how to make it run faster?

Thanks a lot,
Anne

Re: Need to increase performance of a query

От

Jesper Krogh

Дата:

10 июня 2010 г., 18:13:00

On 2010-06-10 19:50, Anne Rosset wrote:
> Any advice on how to make it run faster?

What timing do you get if you run it with \t (timing on) and without
explain analyze ?

I would be surprised if you can get it much faster than what is is.. I
may be that a
significant portion is "planning cost" so if you run it a lot you might
benefit from
a prepared statement.


--
Jesper

Re: Need to increase performance of a query

От

hubert depesz lubaczewski

Дата:

10 июня 2010 г., 18:34:07

On Thu, Jun 10, 2010 at 10:50:40AM -0700, Anne Rosset wrote:
> Any advice on how to make it run faster?

First, let me ask a simple question - what runtime for this query will
be satisfactory for you?

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

Re: Need to increase performance of a query

От

Anne Rosset

Дата:

10 июня 2010 г., 18:36:21

Jesper Krogh wrote:
> On 2010-06-10 19:50, Anne Rosset wrote:
>> Any advice on how to make it run faster?
>
> What timing do you get if you run it with \t (timing on) and without
> explain analyze ?
>
> I would be surprised if you can get it much faster than what is is.. I
> may be that a
> significant portion is "planning cost" so if you run it a lot you
> might benefit from
> a prepared statement.
>
>
Hi Jesper,
Thanks your response:
psrdb=# \timing
Timing is on.
psrdb=# (SELECT
psrdb(#        MAX(item_rank.rank) AS maxRank
psrdb(# FROM
psrdb(#        item_rank item_rank
psrdb(# WHERE
psrdb(#        item_rank.project_id='proj2783'
psrdb(#         AND item_rank.pf_id IS NULL
psrdb(#
psrdb(# )
psrdb-# ORDER BY
psrdb-# maxRank DESC;
   maxrank
-------------
 20200000000
(1 row)

Time: 12.947 ms

It really seems to me that it should take less time.

 Specially when I see the result  with a different where clause like
this one:
psrdb=# SELECT
psrdb-#        MAX(item_rank.rank) AS maxRank
psrdb-# FROM
psrdb-#        item_rank item_rank
psrdb-# WHERE
psrdb-#        item_rank.pf_id='plan1408'
psrdb-# ORDER BY
psrdb-# maxRank DESC;
   maxrank
-------------
 20504000000
(1 row)

Time: 2.582 ms


Thanks,
Anne

Re: Need to increase performance of a query

От

Anne Rosset

Дата:

10 июня 2010 г., 19:34:23

Jochen Erwied wrote:
> Thursday, June 10, 2010, 8:36:08 PM you wrote:
>
>
>> psrdb=# (SELECT
>> psrdb(#        MAX(item_rank.rank) AS maxRank
>> psrdb(# FROM
>> psrdb(#        item_rank item_rank
>> psrdb(# WHERE
>> psrdb(#        item_rank.project_id='proj2783'
>> psrdb(#         AND item_rank.pf_id IS NULL
>> psrdb(#
>> psrdb(# )
>> psrdb-# ORDER BY
>> psrdb-# maxRank DESC;
>>
>
> Don't think it does really matter, but why do you sort a resultset
> consisting of only one row?
>
>
Sorry, I should have removed the ORDER by (the full query has a union).
So without the ORDER by, here are the results:
psrdb=# SELECT
psrdb-#        MAX(item_rank.rank) AS maxRank
psrdb-# FROM
psrdb-#        item_rank item_rank
psrdb-# WHERE
psrdb-#        item_rank.pf_id='plan1408';
   maxrank
-------------
 20504000000
(1 row)

Time: 1.516 ms
psrdb=# SELECT
psrdb-#        MAX(item_rank.rank) AS maxRank
psrdb-# FROM
psrdb-#        item_rank item_rank
psrdb-# WHERE
psrdb-#        item_rank.project_id='proj2783'
psrdb-#         AND item_rank.pf_id IS NULL;
   maxrank
-------------
 20200000000
(1 row)

Time: 13.177 ms

Is there anything that can be done for the second one?

Thanks,
Anne

Re: Need to increase performance of a query

От

Kenneth Marshall

Дата:

10 июня 2010 г., 19:37:26

On Thu, Jun 10, 2010 at 12:34:07PM -0700, Anne Rosset wrote:
> Jochen Erwied wrote:
>> Thursday, June 10, 2010, 8:36:08 PM you wrote:
>>
>>
>>> psrdb=# (SELECT
>>> psrdb(#        MAX(item_rank.rank) AS maxRank
>>> psrdb(# FROM
>>> psrdb(#        item_rank item_rank
>>> psrdb(# WHERE
>>> psrdb(#        item_rank.project_id='proj2783'
>>> psrdb(#         AND item_rank.pf_id IS NULL
>>> psrdb(#
>>> psrdb(# )
>>> psrdb-# ORDER BY
>>> psrdb-# maxRank DESC;
>>>
>>
>> Don't think it does really matter, but why do you sort a resultset
>> consisting of only one row?
>>
>>
> Sorry, I should have removed the ORDER by (the full query has a union).
> So without the ORDER by, here are the results:
> psrdb=# SELECT
> psrdb-#        MAX(item_rank.rank) AS maxRank
> psrdb-# FROM
> psrdb-#        item_rank item_rank
> psrdb-# WHERE
> psrdb-#        item_rank.pf_id='plan1408';
>   maxrank
> -------------
> 20504000000
> (1 row)
>
> Time: 1.516 ms
> psrdb=# SELECT
> psrdb-#        MAX(item_rank.rank) AS maxRank
> psrdb-# FROM
> psrdb-#        item_rank item_rank
> psrdb-# WHERE
> psrdb-#        item_rank.project_id='proj2783'
> psrdb-#         AND item_rank.pf_id IS NULL;
>   maxrank
> -------------
> 20200000000
> (1 row)
>
> Time: 13.177 ms
>
> Is there anything that can be done for the second one?
>
> Thanks,
> Anne
>
What about an IS NULL index on pf_id?

Regards,
Ken

Re: Need to increase performance of a query

От

Jochen Erwied

Дата:

10 июня 2010 г., 19:39:43

Thursday, June 10, 2010, 9:34:07 PM you wrote:

> Time: 1.516 ms

> Time: 13.177 ms

I'd suppose the first query to scan a lot less rows than the second one.
Could you supply an explained plan for the fast query?

--
Jochen Erwied     |   home: jochen@erwied.eu     +49-208-38800-18, FAX: -19
Sauerbruchstr. 17 |   work: joe@mbs-software.de  +49-2151-7294-24, FAX: -50
D-45470 Muelheim  | mobile: jochen.erwied@vodafone.de       +49-173-5404164

Re: Need to increase performance of a query

От

Jochen Erwied

Дата:

10 июня 2010 г., 19:41:51

Thursday, June 10, 2010, 8:36:08 PM you wrote:

> psrdb=# (SELECT
> psrdb(#        MAX(item_rank.rank) AS maxRank
> psrdb(# FROM
> psrdb(#        item_rank item_rank
> psrdb(# WHERE
> psrdb(#        item_rank.project_id='proj2783'
> psrdb(#         AND item_rank.pf_id IS NULL
> psrdb(#
> psrdb(# )
> psrdb-# ORDER BY
> psrdb-# maxRank DESC;

Don't think it does really matter, but why do you sort a resultset
consisting of only one row?

--
Jochen Erwied     |   home: jochen@erwied.eu     +49-208-38800-18, FAX: -19
Sauerbruchstr. 17 |   work: joe@mbs-software.de  +49-2151-7294-24, FAX: -50
D-45470 Muelheim  | mobile: jochen.erwied@vodafone.de       +49-173-5404164

Re: Need to increase performance of a query

От

Anne Rosset

Дата:

10 июня 2010 г., 19:42:35

Kenneth Marshall wrote:
> On Thu, Jun 10, 2010 at 12:34:07PM -0700, Anne Rosset wrote:
>
>> Jochen Erwied wrote:
>>
>>> Thursday, June 10, 2010, 8:36:08 PM you wrote:
>>>
>>>
>>>
>>>> psrdb=# (SELECT
>>>> psrdb(#        MAX(item_rank.rank) AS maxRank
>>>> psrdb(# FROM
>>>> psrdb(#        item_rank item_rank
>>>> psrdb(# WHERE
>>>> psrdb(#        item_rank.project_id='proj2783'
>>>> psrdb(#         AND item_rank.pf_id IS NULL
>>>> psrdb(#
>>>> psrdb(# )
>>>> psrdb-# ORDER BY
>>>> psrdb-# maxRank DESC;
>>>>
>>>>
>>> Don't think it does really matter, but why do you sort a resultset
>>> consisting of only one row?
>>>
>>>
>>>
>> Sorry, I should have removed the ORDER by (the full query has a union).
>> So without the ORDER by, here are the results:
>> psrdb=# SELECT
>> psrdb-#        MAX(item_rank.rank) AS maxRank
>> psrdb-# FROM
>> psrdb-#        item_rank item_rank
>> psrdb-# WHERE
>> psrdb-#        item_rank.pf_id='plan1408';
>>   maxrank
>> -------------
>> 20504000000
>> (1 row)
>>
>> Time: 1.516 ms
>> psrdb=# SELECT
>> psrdb-#        MAX(item_rank.rank) AS maxRank
>> psrdb-# FROM
>> psrdb-#        item_rank item_rank
>> psrdb-# WHERE
>> psrdb-#        item_rank.project_id='proj2783'
>> psrdb-#         AND item_rank.pf_id IS NULL;
>>   maxrank
>> -------------
>> 20200000000
>> (1 row)
>>
>> Time: 13.177 ms
>>
>> Is there anything that can be done for the second one?
>>
>> Thanks,
>> Anne
>>
>>
> What about an IS NULL index on pf_id?
>
> Regards,
> Ken
>
Hi Ken,
I have the following index:
"item_rank_index2" btree (project_id) WHERE (pf_id IS NULL)

Are you suggesting something else?
Thanks,
Anne

Re: Need to increase performance of a query

От

Anne Rosset

Дата:

10 июня 2010 г., 19:45:04

Jochen Erwied wrote:
> Thursday, June 10, 2010, 9:34:07 PM you wrote:
>
>
>> Time: 1.516 ms
>>
>
>
>> Time: 13.177 ms
>>
>
> I'd suppose the first query to scan a lot less rows than the second one.
> Could you supply an explained plan for the fast query?
>
>
Hi Jochen,
Here is the explained plan for the fastest query:
psrdb=# explain analyze ELECT
psrdb-#        MAX(item_rank.rank) AS maxRank
psrdb-# FROM
psrdb-#        item_rank item_rank
psrdb-# WHERE
psrdb-#        item_rank.pf_id='plan1408';
ERROR:  syntax error at or near "ELECT" at character 17
psrdb=# explain analyze SELECT
psrdb-#        MAX(item_rank.rank) AS maxRank
psrdb-# FROM
psrdb-#        item_rank item_rank
psrdb-# WHERE
psrdb-#        item_rank.pf_id='plan1408';
                                                           QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=8.28..8.29 rows=1 width=8) (actual time=0.708..0.709
rows=1 loops=1)
   ->  Index Scan using item_rank_pf on item_rank  (cost=0.00..8.27
rows=1 width=8) (actual time=0.052..0.407 rows=303 loops=1)
         Index Cond: ((pf_id)::text = 'plan1408'::text)
 Total runtime: 0.761 ms
(4 rows)

Time: 2.140 ms

Re: Need to increase performance of a query

От

Craig James

Дата:

10 июня 2010 г., 19:47:34

On 6/10/10 12:34 PM, Anne Rosset wrote:
> Jochen Erwied wrote:
>> Thursday, June 10, 2010, 8:36:08 PM you wrote:
>>
>>> psrdb=# (SELECT
>>> psrdb(# MAX(item_rank.rank) AS maxRank
>>> psrdb(# FROM
>>> psrdb(# item_rank item_rank
>>> psrdb(# WHERE
>>> psrdb(# item_rank.project_id='proj2783'
>>> psrdb(# AND item_rank.pf_id IS NULL
>>> psrdb(#
>>> psrdb(# )
>>> psrdb-# ORDER BY
>>> psrdb-# maxRank DESC;
>>
>> Don't think it does really matter, but why do you sort a resultset
>> consisting of only one row?
>>
> Sorry, I should have removed the ORDER by (the full query has a union).
> So without the ORDER by, here are the results:
> psrdb=# SELECT
> psrdb-# MAX(item_rank.rank) AS maxRank
> psrdb-# FROM
> psrdb-# item_rank item_rank
> psrdb-# WHERE
> psrdb-# item_rank.pf_id='plan1408';
> maxrank
> -------------
> 20504000000
> (1 row)
>
> Time: 1.516 ms
> psrdb=# SELECT
> psrdb-# MAX(item_rank.rank) AS maxRank
> psrdb-# FROM
> psrdb-# item_rank item_rank
> psrdb-# WHERE
> psrdb-# item_rank.project_id='proj2783'
> psrdb-# AND item_rank.pf_id IS NULL;
> maxrank
> -------------
> 20200000000
> (1 row)
>
> Time: 13.177 ms
>
> Is there anything that can be done for the second one?

Postgres normally doesn't index NULL values even if the column is indexed, so it has to do a table scan when your query
includesan IS NULL condition.  You need to create an index that includes the "IS NULL" condition. 

   create index item_rank_null_idx on item_rank(pf_id)
    where item_rank.pf_id is null;

Craig

Re: Need to increase performance of a query

От

Anne Rosset

Дата:

10 июня 2010 г., 19:57:01

Craig James wrote:
> On 6/10/10 12:34 PM, Anne Rosset wrote:
>> Jochen Erwied wrote:
>>> Thursday, June 10, 2010, 8:36:08 PM you wrote:
>>>
>>>> psrdb=# (SELECT
>>>> psrdb(# MAX(item_rank.rank) AS maxRank
>>>> psrdb(# FROM
>>>> psrdb(# item_rank item_rank
>>>> psrdb(# WHERE
>>>> psrdb(# item_rank.project_id='proj2783'
>>>> psrdb(# AND item_rank.pf_id IS NULL
>>>> psrdb(#
>>>> psrdb(# )
>>>> psrdb-# ORDER BY
>>>> psrdb-# maxRank DESC;
>>>
>>> Don't think it does really matter, but why do you sort a resultset
>>> consisting of only one row?
>>>
>> Sorry, I should have removed the ORDER by (the full query has a union).
>> So without the ORDER by, here are the results:
>> psrdb=# SELECT
>> psrdb-# MAX(item_rank.rank) AS maxRank
>> psrdb-# FROM
>> psrdb-# item_rank item_rank
>> psrdb-# WHERE
>> psrdb-# item_rank.pf_id='plan1408';
>> maxrank
>> -------------
>> 20504000000
>> (1 row)
>>
>> Time: 1.516 ms
>> psrdb=# SELECT
>> psrdb-# MAX(item_rank.rank) AS maxRank
>> psrdb-# FROM
>> psrdb-# item_rank item_rank
>> psrdb-# WHERE
>> psrdb-# item_rank.project_id='proj2783'
>> psrdb-# AND item_rank.pf_id IS NULL;
>> maxrank
>> -------------
>> 20200000000
>> (1 row)
>>
>> Time: 13.177 ms
>>
>> Is there anything that can be done for the second one?
>
> Postgres normally doesn't index NULL values even if the column is
> indexed, so it has to do a table scan when your query includes an IS
> NULL condition.  You need to create an index that includes the "IS
> NULL" condition.
>
>   create index item_rank_null_idx on item_rank(pf_id)
>    where item_rank.pf_id is null;
>
> Craig
>
Hi Craig,
I tried again after adding your suggested index but I didn't see any
improvements: (seems that the index is not used)
psrdb=# explain analyze SELECT
psrdb-#        MAX(item_rank.rank) AS maxRank
psrdb-# FROM
psrdb-#        item_rank item_rank
psrdb-# WHERE
psrdb-#        item_rank.project_id='proj2783'
psrdb-#         AND item_rank.pf_id IS NULL;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.17..0.18 rows=1 width=0) (actual time=11.942..11.943
rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..0.17 rows=1 width=8) (actual
time=11.931..11.932 rows=1 loops=1)
           ->  Index Scan Backward using item_rank_rank on item_rank
(cost=0.00..2933.84 rows=17558 width=8) (actual time=11.926..11.926
rows=1 loops=1)
                 Filter: ((rank IS NOT NULL) AND (pf_id IS NULL) AND
((project_id)::text = 'proj2783'::text))
 Total runtime: 11.988 ms
(6 rows)

Time: 13.654 ms


Thanks,
Anne

Re: Need to increase performance of a query

От

Heikki Linnakangas

Дата:

10 июня 2010 г., 19:57:30

On 10/06/10 22:47, Craig James wrote:
> Postgres normally doesn't index NULL values even if the column is
> indexed, so it has to do a table scan when your query includes an IS
> NULL condition.

That was addressed in version 8.3. 8.3 and upwards can use an index for
IS NULL.

I believe the NULLs were stored in the index in earlier releases too,
they just couldn't be searched for.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Need to increase performance of a query

От

Anne Rosset

Дата:

10 июня 2010 г., 20:08:37

Heikki Linnakangas wrote:
> On 10/06/10 22:47, Craig James wrote:
>> Postgres normally doesn't index NULL values even if the column is
>> indexed, so it has to do a table scan when your query includes an IS
>> NULL condition.
>
> That was addressed in version 8.3. 8.3 and upwards can use an index
> for IS NULL.
>
> I believe the NULLs were stored in the index in earlier releases too,
> they just couldn't be searched for.
>
I am using postgres 8.3.6. So why doesn't it use my index?
Thanks,
Anne

Re: Need to increase performance of a query

От

Joe Conway

Дата:

10 июня 2010 г., 20:10:18

On 06/10/2010 12:56 PM, Anne Rosset wrote:
> Craig James wrote:
>>   create index item_rank_null_idx on item_rank(pf_id)
>>    where item_rank.pf_id is null;
>>
>> Craig
>>
> Hi Craig,
> I tried again after adding your suggested index but I didn't see any
> improvements: (seems that the index is not used)

>                 Filter: ((rank IS NOT NULL) AND (pf_id IS NULL) AND
> ((project_id)::text = 'proj2783'::text))
> Total runtime: 11.988 ms
> (6 rows)
>
> Time: 13.654 ms

try:

create index item_rank_null_idx on item_rank(pf_id)
where rank IS NOT NULL AND pf_id IS NULL;

Joe

Вложения

signature.asc

Re: Need to increase performance of a query

От

Joe Conway

Дата:

10 июня 2010 г., 20:13:56

On 06/10/2010 01:10 PM, Joe Conway wrote:
> try:
>
> create index item_rank_null_idx on item_rank(pf_id)
> where rank IS NOT NULL AND pf_id IS NULL;

oops -- that probably should be:

create index item_rank_null_idx on item_rank(project_id)
where rank IS NOT NULL AND pf_id IS NULL;

Joe

Вложения

signature.asc

Re: Need to increase performance of a query

От

Anne Rosset

Дата:

10 июня 2010 г., 20:21:57

Joe Conway wrote:
> On 06/10/2010 01:10 PM, Joe Conway wrote:
>
>> try:
>>
>> create index item_rank_null_idx on item_rank(pf_id)
>> where rank IS NOT NULL AND pf_id IS NULL;
>>
>
> oops -- that probably should be:
>
> create index item_rank_null_idx on item_rank(project_id)
> where rank IS NOT NULL AND pf_id IS NULL;
>
> Joe
>
>
I tried that and it didn't make any difference. Same query plan.

Anne

Re: Need to increase performance of a query

От

Heikki Linnakangas

Дата:

10 июня 2010 г., 21:00:09

On 10/06/10 23:08, Anne Rosset wrote:
> Heikki Linnakangas wrote:
>> On 10/06/10 22:47, Craig James wrote:
>>> Postgres normally doesn't index NULL values even if the column is
>>> indexed, so it has to do a table scan when your query includes an IS
>>> NULL condition.
>>
>> That was addressed in version 8.3. 8.3 and upwards can use an index
>> for IS NULL.
>>
>> I believe the NULLs were stored in the index in earlier releases too,
>> they just couldn't be searched for.
>>
> I am using postgres 8.3.6. So why doesn't it use my index?

Well, apparently the planner doesn't think it would be any cheaper.

I wonder if this helps:

CREATE INDEX item_rank_project_id ON item_rank(project_id, rank, pf_id);

And make sure you drop any of the indexes that are not being used, to
make sure the planner doesn't choose them instead.

(You should upgrade to 8.3.11, BTW. There's been a bunch of bug-fixes
in-between, though I don't know if any are related to this, but there's
other important fixes there)

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Need to increase performance of a query

От

Joe Conway

Дата:

10 июня 2010 г., 21:06:43

On 06/10/2010 01:21 PM, Anne Rosset wrote:
>>
> I tried that and it didn't make any difference. Same query plan.

A little experimentation suggests this might work:

create index item_rank_project on item_rank(project_id, rank) where
pf_id IS NULL;

Joe

Вложения

signature.asc

Re: Need to increase performance of a query

От

Anne Rosset

Дата:

10 июня 2010 г., 21:20:32

Joe Conway wrote:
> On 06/10/2010 01:21 PM, Anne Rosset wrote:
>
>>>
>>>
>> I tried that and it didn't make any difference. Same query plan.
>>
>
> A little experimentation suggests this might work:
>
> create index item_rank_project on item_rank(project_id, rank) where
> pf_id IS NULL;
>
> Joe
>
>
Yes it does. Thanks a lot!
Anne

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Need to increase performance of a query

Вложения

Вложения

Вложения