Обсуждение: Postgres does not use indexes with OR-conditions

Поиск
Список
Период
Сортировка

Postgres does not use indexes with OR-conditions

От
arhipov
Дата:
Hello,

I have just came across interesting Postgres behaviour with
OR-conditions. Are there any chances that the optimizer will handle this
situation in the future?

select *
from commons.financial_documents fd
where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
order by fd.creation_time desc
limit 200

"Limit  (cost=4.30..44.50 rows=200 width=67) (actual time=0.032..1.376
rows=200 loops=1)"
"  ->  Index Scan Backward using financial_document_creation_time_index
on financial_documents fd (cost=4.30..292076.25 rows=1453075 width=67)
(actual time=0.027..0.683 rows=200 loops=1)"
"        Index Cond: (creation_time <= '2011-11-07
11:39:07.285022+09'::timestamp with time zone)"
"Total runtime: 1.740 ms"

select *
from commons.financial_documents fd
where fd.creation_time = '2011-11-07 10:39:07.285022+08'
    or fd.creation_time < '2011-11-07 10:39:07.285022+08'
order by fd.creation_time desc
limit 200

"Limit  (cost=4.30..71.76 rows=200 width=67) (actual
time=1067.935..1069.126 rows=200 loops=1)"
"  ->  Index Scan Backward using financial_document_creation_time_index
on financial_documents fd (cost=4.30..490104.07 rows=1453075 width=67)
(actual time=1067.927..1068.532 rows=200 loops=1)"
"        Filter: ((creation_time = '2011-11-07
11:39:07.285022+09'::timestamp with time zone) OR (creation_time <
'2011-11-07 11:39:07.285022+09'::timestamp with time zone))"
"        Rows Removed by Filter: 776785"
"Total runtime: 1069.480 ms"



Re: Postgres does not use indexes with OR-conditions

От
David Rowley
Дата:
On Fri, Nov 7, 2014 at 5:16 PM, arhipov <arhipov@dc.baikal.ru> wrote:
Hello,

I have just came across interesting Postgres behaviour with OR-conditions. Are there any chances that the optimizer will handle this situation in the future?

select *
from commons.financial_documents fd
where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
order by fd.creation_time desc
limit 200

select *
from commons.financial_documents fd
where fd.creation_time = '2011-11-07 10:39:07.285022+08'
   or fd.creation_time < '2011-11-07 10:39:07.285022+08'
order by fd.creation_time desc
limit 200

 It would certainly be possible, providing the constants compare equally, but... Question: Would you really want to pay a, say 1% increase in planning time for ALL queries, so that you could have this unique case of queries perform better at execution time?

Is there a valid reason why you don't just write the query with the <= operator?

Regards

David Rowley

Re: Postgres does not use indexes with OR-conditions

От
Vlad Arkhipov
Дата:
It was just a minimal example. The real query looks like this.

select *
from commons.financial_documents fd
where fd.creation_time < '2011-11-07 10:39:07.285022+08'
  or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and fd.financial_document_id < 100)
order by fd.creation_time desc
limit 200

I need to rewrite it in the way below to make Postgres use the index.

select *
from commons.financial_documents fd
where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
  and (
    fd.creation_time < '2011-11-07 10:39:07.285022+08'
      or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and fd.financial_document_id < 100)
  )
order by fd.creation_time desc
limit 200

On 11/07/2014 12:38 PM, David Rowley wrote:
On Fri, Nov 7, 2014 at 5:16 PM, arhipov <arhipov@dc.baikal.ru> wrote:
Hello,

I have just came across interesting Postgres behaviour with OR-conditions. Are there any chances that the optimizer will handle this situation in the future?

select *
from commons.financial_documents fd
where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
order by fd.creation_time desc
limit 200

select *
from commons.financial_documents fd
where fd.creation_time = '2011-11-07 10:39:07.285022+08'
   or fd.creation_time < '2011-11-07 10:39:07.285022+08'
order by fd.creation_time desc
limit 200

 It would certainly be possible, providing the constants compare equally, but... Question: Would you really want to pay a, say 1% increase in planning time for ALL queries, so that you could have this unique case of queries perform better at execution time?

Is there a valid reason why you don't just write the query with the <= operator?

Regards

David Rowley

Re: Postgres does not use indexes with OR-conditions

От
Andrew Dunstan
Дата:
On 11/07/2014 12:06 AM, Vlad Arkhipov wrote:
> It was just a minimal example. The real query looks like this.
>
> select *
> from commons.financial_documents fd
> where fd.creation_time < '2011-11-07 10:39:07.285022+08'
>   or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and
> fd.financial_document_id < 100)
> order by fd.creation_time desc
> limit 200
>
> I need to rewrite it in the way below to make Postgres use the index.
>
> select *
> from commons.financial_documents fd
> where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
>   and (
>     fd.creation_time < '2011-11-07 10:39:07.285022+08'
>       or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and
> fd.financial_document_id < 100)
>   )
> order by fd.creation_time desc
> limit 200
>

First, please do not top-post on the PostgreSQL lists. See
<http://idallen.com/topposting.html>

Second, the last test for fd.creation_time in your query seems
redundant. Could you not rewrite it as something this?:

    where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
         and (fd.creation_time < '2011-11-07 10:39:07.285022+08'
              or fd.financial_document_id < 100)

cheers

andrew


Re: Postgres does not use indexes with OR-conditions

От
Kevin Grittner
Дата:
Andrew Dunstan <andrew@dunslane.net> wrote:
> On 11/07/2014 12:06 AM, Vlad Arkhipov wrote:

>> I need to rewrite it in the way below to make Postgres use the index.
>>
>> select *
>> from commons.financial_documents fd
>> where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
>>  and (
>>    fd.creation_time < '2011-11-07 10:39:07.285022+08'
>>      or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and
>> fd.financial_document_id < 100)
>>  )
>> order by fd.creation_time desc
>> limit 200
>
> Could you not rewrite it as something this?:
>
>     where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
>         and (fd.creation_time < '2011-11-07 10:39:07.285022+08'
>               or fd.financial_document_id < 100)

Yeah, when there are two ways to write a query that are logically
equivalent, it is better to put the AND at the higher level than
the OR.  On the other hand, why not simply write it as?:

select *
  from commons.financial_documents fd
  where (fd.creation_time, fd.financial_document_id)
      < ('2011-11-07 10:39:07.285022+08', 100)
  order by fd.creation_time desc
  limit 200

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Postgres does not use indexes with OR-conditions

От
Tom Lane
Дата:
Kevin Grittner <kgrittn@ymail.com> writes:
> On the other hand, why not simply write it as?:

> select *
>   from commons.financial_documents fd
>   where (fd.creation_time, fd.financial_document_id)
>       < ('2011-11-07 10:39:07.285022+08', 100)
>   order by fd.creation_time desc
>   limit 200

That's the way to do it, not only because it's simpler and clearer,
but because the planner will recognize the relevance of the
condition to an index on creation_time, financial_document_id ...

            regards, tom lane


Re: Postgres does not use indexes with OR-conditions

От
David G Johnston
Дата:
Kevin Grittner-5 wrote
> Andrew Dunstan <

> andrew@

> > wrote:
>> On 11/07/2014 12:06 AM, Vlad Arkhipov wrote:
>
>>> I need to rewrite it in the way below to make Postgres use the index.
>>>
>>> select *
>>> from commons.financial_documents fd
>>> where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
>>>  and (
>>>    fd.creation_time < '2011-11-07 10:39:07.285022+08'
>>>      or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and
>>> fd.financial_document_id < 100)
>>>  )
>>> order by fd.creation_time desc
>>> limit 200
>>
>> Could you not rewrite it as something this?:
>>
>>     where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
>>         and (fd.creation_time < '2011-11-07 10:39:07.285022+08'
>>               or fd.financial_document_id < 100)
>
> Yeah, when there are two ways to write a query that are logically
> equivalent, it is better to put the AND at the higher level than
> the OR.  On the other hand, why not simply write it as?:
>
> select *
>   from commons.financial_documents fd
>   where (fd.creation_time, fd.financial_document_id)
>       < ('2011-11-07 10:39:07.285022+08', 100)
>   order by fd.creation_time desc
>   limit 200

From personal experience and observation on these lists record inequality is
not particularly intuitive.  I'm also not sure someone is likely to really
"get it" until they have a problem for which the above is the solution.

That said is there a place where we supply solutions and idioms to common
queries?  This query as well as pagination-oriented queries are two that
come to mind.  I think the material would fit well in the tutorial section
but having some kind of quick synopsis and cross reference in the
performance chapter would aid someone whose looking to solve a problem and
not in general education mode.

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Postgres-does-not-use-indexes-with-OR-conditions-tp5826027p5826065.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.