Обсуждение: Postgres does not use indexes with OR-conditions
Hello, I have just came across interesting Postgres behaviour with OR-conditions. Are there any chances that the optimizer will handle this situation in the future? select * from commons.financial_documents fd where fd.creation_time <= '2011-11-07 10:39:07.285022+08' order by fd.creation_time desc limit 200 "Limit (cost=4.30..44.50 rows=200 width=67) (actual time=0.032..1.376 rows=200 loops=1)" " -> Index Scan Backward using financial_document_creation_time_index on financial_documents fd (cost=4.30..292076.25 rows=1453075 width=67) (actual time=0.027..0.683 rows=200 loops=1)" " Index Cond: (creation_time <= '2011-11-07 11:39:07.285022+09'::timestamp with time zone)" "Total runtime: 1.740 ms" select * from commons.financial_documents fd where fd.creation_time = '2011-11-07 10:39:07.285022+08' or fd.creation_time < '2011-11-07 10:39:07.285022+08' order by fd.creation_time desc limit 200 "Limit (cost=4.30..71.76 rows=200 width=67) (actual time=1067.935..1069.126 rows=200 loops=1)" " -> Index Scan Backward using financial_document_creation_time_index on financial_documents fd (cost=4.30..490104.07 rows=1453075 width=67) (actual time=1067.927..1068.532 rows=200 loops=1)" " Filter: ((creation_time = '2011-11-07 11:39:07.285022+09'::timestamp with time zone) OR (creation_time < '2011-11-07 11:39:07.285022+09'::timestamp with time zone))" " Rows Removed by Filter: 776785" "Total runtime: 1069.480 ms"
On Fri, Nov 7, 2014 at 5:16 PM, arhipov <arhipov@dc.baikal.ru> wrote:
Hello,
I have just came across interesting Postgres behaviour with OR-conditions. Are there any chances that the optimizer will handle this situation in the future?
select *
from commons.financial_documents fd
where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
order by fd.creation_time desc
limit 200
select *
from commons.financial_documents fd
where fd.creation_time = '2011-11-07 10:39:07.285022+08'
or fd.creation_time < '2011-11-07 10:39:07.285022+08'
order by fd.creation_time desc
limit 200
It would certainly be possible, providing the constants compare equally, but... Question: Would you really want to pay a, say 1% increase in planning time for ALL queries, so that you could have this unique case of queries perform better at execution time?
Is there a valid reason why you don't just write the query with the <= operator?
Regards
David Rowley
It was just a minimal example. The real query looks like this.
select *
from commons.financial_documents fd
where fd.creation_time < '2011-11-07 10:39:07.285022+08'
or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and fd.financial_document_id < 100)
order by fd.creation_time desc
limit 200
I need to rewrite it in the way below to make Postgres use the index.
select *
from commons.financial_documents fd
where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
and (
fd.creation_time < '2011-11-07 10:39:07.285022+08'
or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and fd.financial_document_id < 100)
)
order by fd.creation_time desc
limit 200
select *
from commons.financial_documents fd
where fd.creation_time < '2011-11-07 10:39:07.285022+08'
or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and fd.financial_document_id < 100)
order by fd.creation_time desc
limit 200
I need to rewrite it in the way below to make Postgres use the index.
select *
from commons.financial_documents fd
where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
and (
fd.creation_time < '2011-11-07 10:39:07.285022+08'
or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and fd.financial_document_id < 100)
)
order by fd.creation_time desc
limit 200
On 11/07/2014 12:38 PM, David Rowley wrote:
On Fri, Nov 7, 2014 at 5:16 PM, arhipov <arhipov@dc.baikal.ru> wrote:Hello,
I have just came across interesting Postgres behaviour with OR-conditions. Are there any chances that the optimizer will handle this situation in the future?
select *
from commons.financial_documents fd
where fd.creation_time <= '2011-11-07 10:39:07.285022+08'
order by fd.creation_time desc
limit 200
select *
from commons.financial_documents fd
where fd.creation_time = '2011-11-07 10:39:07.285022+08'
or fd.creation_time < '2011-11-07 10:39:07.285022+08'
order by fd.creation_time desc
limit 200It would certainly be possible, providing the constants compare equally, but... Question: Would you really want to pay a, say 1% increase in planning time for ALL queries, so that you could have this unique case of queries perform better at execution time?Is there a valid reason why you don't just write the query with the <= operator?RegardsDavid Rowley
On 11/07/2014 12:06 AM, Vlad Arkhipov wrote: > It was just a minimal example. The real query looks like this. > > select * > from commons.financial_documents fd > where fd.creation_time < '2011-11-07 10:39:07.285022+08' > or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and > fd.financial_document_id < 100) > order by fd.creation_time desc > limit 200 > > I need to rewrite it in the way below to make Postgres use the index. > > select * > from commons.financial_documents fd > where fd.creation_time <= '2011-11-07 10:39:07.285022+08' > and ( > fd.creation_time < '2011-11-07 10:39:07.285022+08' > or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and > fd.financial_document_id < 100) > ) > order by fd.creation_time desc > limit 200 > First, please do not top-post on the PostgreSQL lists. See <http://idallen.com/topposting.html> Second, the last test for fd.creation_time in your query seems redundant. Could you not rewrite it as something this?: where fd.creation_time <= '2011-11-07 10:39:07.285022+08' and (fd.creation_time < '2011-11-07 10:39:07.285022+08' or fd.financial_document_id < 100) cheers andrew
Andrew Dunstan <andrew@dunslane.net> wrote: > On 11/07/2014 12:06 AM, Vlad Arkhipov wrote: >> I need to rewrite it in the way below to make Postgres use the index. >> >> select * >> from commons.financial_documents fd >> where fd.creation_time <= '2011-11-07 10:39:07.285022+08' >> and ( >> fd.creation_time < '2011-11-07 10:39:07.285022+08' >> or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and >> fd.financial_document_id < 100) >> ) >> order by fd.creation_time desc >> limit 200 > > Could you not rewrite it as something this?: > > where fd.creation_time <= '2011-11-07 10:39:07.285022+08' > and (fd.creation_time < '2011-11-07 10:39:07.285022+08' > or fd.financial_document_id < 100) Yeah, when there are two ways to write a query that are logically equivalent, it is better to put the AND at the higher level than the OR. On the other hand, why not simply write it as?: select * from commons.financial_documents fd where (fd.creation_time, fd.financial_document_id) < ('2011-11-07 10:39:07.285022+08', 100) order by fd.creation_time desc limit 200 -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Kevin Grittner <kgrittn@ymail.com> writes: > On the other hand, why not simply write it as?: > select * > from commons.financial_documents fd > where (fd.creation_time, fd.financial_document_id) > < ('2011-11-07 10:39:07.285022+08', 100) > order by fd.creation_time desc > limit 200 That's the way to do it, not only because it's simpler and clearer, but because the planner will recognize the relevance of the condition to an index on creation_time, financial_document_id ... regards, tom lane
Kevin Grittner-5 wrote > Andrew Dunstan < > andrew@ > > wrote: >> On 11/07/2014 12:06 AM, Vlad Arkhipov wrote: > >>> I need to rewrite it in the way below to make Postgres use the index. >>> >>> select * >>> from commons.financial_documents fd >>> where fd.creation_time <= '2011-11-07 10:39:07.285022+08' >>> and ( >>> fd.creation_time < '2011-11-07 10:39:07.285022+08' >>> or (fd.creation_time = '2011-11-07 10:39:07.285022+08' and >>> fd.financial_document_id < 100) >>> ) >>> order by fd.creation_time desc >>> limit 200 >> >> Could you not rewrite it as something this?: >> >> where fd.creation_time <= '2011-11-07 10:39:07.285022+08' >> and (fd.creation_time < '2011-11-07 10:39:07.285022+08' >> or fd.financial_document_id < 100) > > Yeah, when there are two ways to write a query that are logically > equivalent, it is better to put the AND at the higher level than > the OR. On the other hand, why not simply write it as?: > > select * > from commons.financial_documents fd > where (fd.creation_time, fd.financial_document_id) > < ('2011-11-07 10:39:07.285022+08', 100) > order by fd.creation_time desc > limit 200 From personal experience and observation on these lists record inequality is not particularly intuitive. I'm also not sure someone is likely to really "get it" until they have a problem for which the above is the solution. That said is there a place where we supply solutions and idioms to common queries? This query as well as pagination-oriented queries are two that come to mind. I think the material would fit well in the tutorial section but having some kind of quick synopsis and cross reference in the performance chapter would aid someone whose looking to solve a problem and not in general education mode. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Postgres-does-not-use-indexes-with-OR-conditions-tp5826027p5826065.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.