Обсуждение: Query optimiser is not using 'not null' constraint when 'order bynulls last' clause is used
Hi,
I am using Postgres version 9.4.4 on a Mac machine. I have 2 queries that differ only in the order by clause. One of it has 'nulls last' and the other one does not have it. The performance difference between the two is considerable.
The slower of the two queries is
The slower of the two queries is
SELECT wos.notificationstatus,
wos.unrepliedcount,
wos.shownotestotech,
wos.ownerid,
wos.isfcr,
aau.user_id,
wo.workorderid AS "WOID",
wo.is_catalog_template AS "TemplateType",
wo.title AS "Title",
wo.is_catalog_template,
aau.first_name AS "Requester",
cd.categoryname AS "Category",
ti.first_name AS "Owner",
wo.duebytime AS "DueBy",
wo.fr_duetime,
wo.completedtime AS "CompletedTime",
wo.respondedtime AS "RespondedTime",
wo.resolvedtime AS "ResolvedTime",
qd.queuename AS "Group",
std.statusname AS "Status",
wo.createdtime AS "CreatedDate",
wos.isread,
wos.hasattachment,
wos.appr_statusid,
wos.priorityid,
wo.templateid AS "TemplateId",
pd.priorityid,
pd.priorityname AS "Priority",
pd.prioritycolor AS "PriorityColor",
wos.isoverdue,
wos.is_fr_overdue,
wos.linkedworkorderid,
wos.editing_status,
wos.editorid,
wos.linkedworkorderid,
wo.isparent,
sduser.isvipuser,
sduser_onbehalfof.isvipuser AS "ONBEHALFOFVIP",
wo.isparent,
wos.statusid,
sdorganization.name AS "Site",
wo.workorderid AS "RequestID"
FROM workorder wo
left join workorder_fields wof
ON wo.workorderid=wof.workorderid
left join servicecatalog_fields scf
ON wo.workorderid=scf.workorderid
left join wotoprojects wtp
ON wo.workorderid=wtp.workorderid
left join sitedefinition
ON wo.siteid=sitedefinition.siteid
left join sdorganization
ON sitedefinition.siteid=sdorganization.org_id
inner join workorderstates wos
ON wo.workorderid=wos.workorderid
left join categorydefinition cd
ON wos.categoryid=cd.categoryid
left join aaauser ti
ON wos.ownerid=ti.user_id
left join aaauser aau
ON wo.requesterid=aau.user_id
left join prioritydefinition pd
ON wos.priorityid=pd.priorityid
left join statusdefinition std
ON wos.statusid=std.statusid
left join workorder_queue wo_queue
ON wo.workorderid=wo_queue.workorderid
left join queuedefinition qd
ON wo_queue.queueid=qd.queueid
left join departmentdefinition dpt
ON wo.deptid=dpt.deptid
left join leveldefinition lvd
ON wos.levelid=lvd.levelid
left join modedefinition mdd
ON wo.modeid=mdd.modeid
left join urgencydefinition urgdef
ON wos.urgencyid=urgdef.urgencyid
left join impactdefinition impdef
ON wos.impactid=impdef.impactid
left join requesttypedefinition rtdef
ON wos.requesttypeid=rtdef.requesttypeid
left join subcategorydefinition scd
ON wos.subcategoryid=scd.subcategoryid
left join itemdefinition icd
ON wos.itemid=icd.itemid
left join servicedefinition serdef
ON wo.serviceid=serdef.serviceid
left join aaauser cbau
ON wo.createdbyid=cbau.user_id
left join aaauser oboaau
ON wo.oboid=oboaau.user_id
left join sduser
ON wo.requesterid=sduser.userid
left join sduser sduser_onbehalfof
ON wo.oboid=sduser_onbehalfof.userid
left join workorder_fields
ON wo.workorderid=workorder_fields.workorderid
WHERE ((
wos.statusid = 1)
AND (
wo.isparent = TRUE))
ORDER BY 7 DESC nulls last limit 25
On removing 'nulls last' from the order by clause the query becomes very fast. I have attached the query plan for both the queries.
From the plan it looks like the second query is able to efficiently use the workorder_pk index ( The node 'Index Scan Backward using workorder_pk on workorder' returns 25 rows) whereas the first query is not able to use the index efficiently (more than 300k rows are returned from the same node).
The column workorderid is a PK column. The query optimizer should ideally know that there is no nulls in this column and in effect there is no difference between the two queries.
I tried the same in Postgres 10 and the slower query performs much better due to parallel sequential scans but still it is less efficient than the query without 'nulls last'.
I thought it would be best to raise this with the Postgres team.
Regards,
From the plan it looks like the second query is able to efficiently use the workorder_pk index ( The node 'Index Scan Backward using workorder_pk on workorder' returns 25 rows) whereas the first query is not able to use the index efficiently (more than 300k rows are returned from the same node).
The column workorderid is a PK column. The query optimizer should ideally know that there is no nulls in this column and in effect there is no difference between the two queries.
I tried the same in Postgres 10 and the slower query performs much better due to parallel sequential scans but still it is less efficient than the query without 'nulls last'.
I thought it would be best to raise this with the Postgres team.
Regards,
Nanda
Вложения
On Thu, 2018-02-01 at 20:00 +0530, Nandakumar M wrote: > Hi, > > I am using Postgres version 9.4.4 on a Mac machine. > I have 2 queries that differ only in the order by clause. > One of it has 'nulls last' and the other one does not have it. > The performance difference between the two is considerable. > > The slower of the two queries is > > SELECT [...] > FROM workorder wo > left join workorder_fields wof > ON wo.workorderid=wof.workorderid > left join servicecatalog_fields scf > ON wo.workorderid=scf.workorderid [...] > ORDER BY 7 DESC nulls last limit 25 > > > > On removing 'nulls last' from the order by clause the query becomes very fast. > I have attached the query plan for both the queries. In the above case, the optimizer does not know that it will get the rows in the correct order: indexes are sorted ASC NULLS LAST by default, so a backwards index scan will produce the results NULLS FIRST, which is the default for ORDER BY ... DESC. If you want the nulls last, PostgreSQL has to retrieve *all* the rows and sort them rather than using the first 25 results it gets by scanning then indexes. To have the above query perform fast, add additional indexes with either ASC NULLS FIRST or DESC NULLS LAST for all used keys. Yours, Laurenz Albe
Hi,
On 2 Feb 2018 15:06, "Laurenz Albe" <laurenz.albe@cybertec.at> wrote:
>In the above case, the optimizer does >not know that it will get the rows
>in the correct order: indexes are >sorted ASC NULLS LAST by default,
>so a backwards index scan will >produce the results NULLS FIRST,
>which is the default for ORDER BY ... >DESC.
The order by column has a not null constraint on it and so nulls last or first shouldn't make any difference.
>If you want the nulls last, PostgreSQL >has to retrieve *all* the rows and sort
>them rather than using the first 25 >results it gets by scanning then >indexes.
>To have the above query perform >fast, add additional indexes with either
>ASC NULLS FIRST or DESC NULLS >LAST for all used keys.
For now this is exactly what I have done. But it is in effect a duplicate index on a PK column and I would be happy not to create it in the first place.
Regards
Nanda
Nandakumar M <m.nanda92@gmail.com> writes: > The order by column has a not null constraint on it and so nulls last or > first shouldn't make any difference. The planner does not consider this and it doesn't really seem like something worth expending cycles on. If you know that there won't be nulls in the column, why are you insisting on specifying a nondefault value of NULLS FIRST/LAST in the query? regards, tom lane
Hi, On Fri, Feb 2, 2018 at 8:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > The planner does not consider this and it doesn't really seem like > something worth expending cycles on. If you know that there won't be > nulls in the column, why are you insisting on specifying a nondefault > value of NULLS FIRST/LAST in the query? The query is generated by a framework that adds 'nulls last' to all order by clause. This is done apparently to provide common behaviour in our application irrespective of the database that is used. SQL server treats nulls as lesser than non null values which is opposite to what Postgres does. For any indexes that we create manually, we can do a --> create index on table_name(column_name nulls first); But, for the PK column we are not in control of the index that is created. Regards, Nanda
Re: Query optimiser is not using 'not null' constraint when 'order bynulls last' clause is used
От
"David G. Johnston"
Дата:
But, for the PK column we are not in control of the index that is created.
You probably can (I assume the nulls aspect of the index doesn't prevent PK usage), but you must add the PK to the table after creating the index and not let the system auto-generate the index for you.
ALTER TABLE name ADD PRIMARY KEY USING INDEX index_name;
David J.
Hi, On Fri, Feb 2, 2018 at 9:28 PM, David G. Johnston <david.g.johnston@gmail.com> wrote: > You probably can (I assume the nulls aspect of the index doesn't prevent PK > usage), but you must add the PK to the table after creating the index and > not let the system auto-generate the index for you. > > https://www.postgresql.org/docs/10/static/sql-altertable.html > > ALTER TABLE name ADD PRIMARY KEY USING INDEX index_name; > I missed to notice this in the docs. Thank you David for pointing it out. Regards, Nanda