Feature-Request: Performance-Optimization when using limit in combination with order by on querys using union
От | Karsten P |
---|---|
Тема | Feature-Request: Performance-Optimization when using limit in combination with order by on querys using union |
Дата | |
Msg-id | DB7PR08MB3081636A27C077E3BB4DA2DCDA8BA@DB7PR08MB3081.eurprd08.prod.outlook.com обсуждение исходный текст |
Ответы |
Re: Feature-Request: Performance-Optimization when using limit in combination with order by on querys using union
|
Список | pgsql-general |
Hi, i've already googled so far but didn't find anything regarding my problem.. I hope i'm here at the right place. Following situation (this is just an simplyfied example): suppose we have two tables, lets say orders - column 'order_number' -> varchar - column 'order_date' -> timestamp with index on order_date and invoices - column 'invoice_number' -> varchar - column 'invoice_date' -> timestamp with index on invoice_date and many records in both if them. now we have a view combining both of them as create view documents as ( select order_number as document_number, order_date as document_date from orders union all select invoice_number, invoice_date from invoices ) finding the last order placed in the database ist really easy: select order_number from orders order by order_date desc limit 1 will result in an index scan backward on orders same with invoices only... but when querying the view select document_number from documents order by document_date desc limit 1 seems to break down to - collect all rows from orders - combine it with all rows from invoices - sort all rows (descending) - limit to one row with many data this is quite slow. I've tested this with PG9.6 and PG14, it doesn't seem to make a difference (correct me if i'm wrong). So my question is: What about optimizing the query-planner that if - a query with unions of selects is executed - and an 'order by' in combination with 'limit' is applied on the complete query (not only on subselects) - and there is a matching index for each select the order by and limit - part of the sql is also beeing applied on each sub-select ? actually select document_number from documents order by document_date desc limit 1 is beeing processed as select order_number from orders union all select invoice_number from invoices order by document_number desc limit 1 but would it be possible to let the query-optimizer expand the query to select order_number from ( (select order_number, order_date from orders order by order_date desc limit 1) union all (select invoice_number, invoice_date from invoices order by invoice_date desc limit 1) ) as subselect order by order_date desc limit 1 as this would use two (or number of unions) index-backward-scans and than only has to reorder at maximum two rows before limiting to the first of it? this should be significantly faster. thanks a lot and greetz, Karsten
В списке pgsql-general по дате отправления: