Re: Optimization of range queries

Поиск
Список
Период
Сортировка
От Teodor Sigaev
Тема Re: Optimization of range queries
Дата
Msg-id 28d7508c-bd3d-4597-a4c6-d0b37355a137@sigaev.ru
обсуждение исходный текст
Ответ на Optimization of range queries  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Ответы Re: Optimization of range queries  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Список pgsql-hackers
Hi!

12 years ago I proposed patch to which could "union" OR clauses into one 
range clause if it's possible. In that time pgsql could not use IS NULL 
as index clause, so patch doesn't support that

https://www.postgresql.org/message-id/flat/45742C51.9020602%40sigaev.ru

option number 4), all other are already committed.


Konstantin Knizhnik wrote:
> Hi hackers,
> 
> Postgres optimizer is not able to build efficient execution plan for the 
> following query:
> 
> explain select * from  people_raw where not ("ID"<2068113880 AND "INN" 
> is not null) and "ID"<=2068629726 AND "INN" is not null;
>                                           QUERY PLAN
> -------------------------------------------------------------------------------------------- 
> 
>   Bitmap Heap Scan on people_raw  (cost=74937803.72..210449640.49 
> rows=121521030 width=336)
>     Recheck Cond: ("ID" <= 2068629726)
>     Filter: (("INN" IS NOT NULL) AND (("ID" >= 2068113880) OR ("INN" IS 
> NULL)))
>     ->  Bitmap Index Scan on "People_pkey" (cost=0.00..74907423.47 
> rows=2077021718 width=0)
>           Index Cond: ("ID" <= 2068629726)
> (5 rows)
> 
> 
> Here the table is very large, but query effects only relatively small 
> number of rows located in the range: [2068113880,2068629726]
> But unfortunately optimizer doesn't take it into the account.
> Moreover, using "is not null" and "not null" is both operands of AND is 
> not smart:
>       (("INN" IS NOT NULL) AND (("ID" >= 2068113880) OR ("INN" IS NULL)))
> 
> If I remove "is not null" condition, then plan is perfect:
> 
> explain select * from  people_raw where not ("ID"<2068113880) and 
> "ID"<=2068629726;
>                                           QUERY PLAN
> -------------------------------------------------------------------------------------------- 
> 
>   Index Scan using "People_pkey" on people_raw  (cost=0.58..196745.57 
> rows=586160 width=336)
>     Index Cond: (("ID" >= 2068113880) AND ("ID" <= 2068629726))
> (2 rows)
> 
> Before starting  investigation of the problem, I will like to know 
> opinion and may be some advise of people familiar with optimizer:
> how difficult will be to handle this case and where to look.
> 
> Thanks in advance,
> 

-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                       WWW: http://www.sigaev.ru/


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Следующее
От: Tom Lane
Дата:
Сообщение: Re: pgsql: Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.