Обсуждение: 8.4b1: Query returning results in different order to 8.3
Hi
The following query is distilled down from a real production query for
ease of reproduction:
SELECT 1 AS id , 2 AS tmpl_id WHERE FALSE -- (in production, only rarely will this clause return a row)
UNIONSELECT* FROM (SELECT 2 AS id, 96 AS tmpl_id UNION SELECT 3 AS id, 101 AS tmpl_id ORDER BY 1 ) tmpl
WHERE tmpl_id IS NOT NULL
In 8.3 and earlier it consistently produces this result:
id | tmpl_id
----+--------- 2 | 96 3 | 101
This is the result I am expecting this query to produce ("expecting"
as in this is what I intend it to do, and it seems to work as
intended).
In 8.4beta1 the result is unpredictable; sometimes the row with id 3
is returned first, e.g.:
id | tmpl_id
----+--------- 3 | 101 2 | 96
Changing the values selected for tmpl_id produces different ordering;
on both systems tested [*], changing 101 to 102 on the third select
produces the expected ordering consistently.
SELECT 1 AS id , 2 AS tmpl_id WHERE FALSE UNIONSELECT * FROM (SELECT 2 AS id, 96 AS tmpl_id UNION
SELECT 3 AS id, 102 AS tmpl_id ORDER BY 1 ) tmpl WHERE tmpl_id IS NOT NULL
id | tmpl_id
----+--------- 2 | 96 3 | 102
Using other values produces varying results, I can't see a pattern.
Note that removing the first SELECT completely produces the expected
ordering consistently:
SELECT * FROM (SELECT 2 AS id, 96 AS tmpl_id UNION SELECT 3 AS id, 101 AS tmpl_id ORDER BY 1 ) tmpl
WHEREtmpl_id IS NOT NULL
id | tmpl_id
----+--------- 2 | 96 3 | 101
Workaround / solution to produce consistent results is to move the
"ORDER BY 1" to the main SELECT clause:
SELECT 1 AS id , 2 AS tmpl_id WHERE FALSE UNIONSELECT * FROM (SELECT 2 AS id, 96 AS tmpl_id UNION
SELECT 3 AS id, 101 AS tmpl_id ) tmpl WHERE tmpl_id IS NOT NULL ORDER BY 1
(The full version of this query in its original form is in production
on 8.2 and 8.3 versions and I am confident it has always produced
consistent results. It is used to select the appropriate template for
pages on a website and someone would have noticed long before now if
it was serving up the wrong template).
Note I'm not sure whether this is a bug, or whether the assumption
made for the original query (that the row order returned by the
subquery would be carried over to the main part of the query) is
incorrect but just happened to work as expected pre-8.4.
[*] tested on: - Ubuntu 8.10 running on VIA C7-M - OS X 10.5 running on Intel Core Duo
Regards
Ian Barwick
Ian Barwick wrote: > Note I'm not sure whether this is a bug, or whether the assumption > made for the original query (that the row order returned by the > subquery would be carried over to the main part of the query) is > incorrect but just happened to work as expected pre-8.4. The latter. Without an ORDER BY (at the outermost level), the order of the result is not well defined. Before 8.4, UNION was always performed by a Sort + Unique, which explains why the output is always sorted in previous releases. 8.4 knows how to perform it with a Hash Aggregate, which doesn't yield sorted output. You can look at the EXPLAIN output to see the difference. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Ian Barwick wrote: > > > Workaround / solution to produce consistent results is to move the > "ORDER BY 1" to the main SELECT clause: > > SELECT 1 AS id , 2 AS tmpl_id > WHERE FALSE > UNION > SELECT * FROM > (SELECT 2 AS id, 96 AS tmpl_id > UNION > SELECT 3 AS id, 101 AS tmpl_id > ) tmpl > WHERE tmpl_id IS NOT NULL > ORDER BY 1 > > (The full version of this query in its original form is in production > on 8.2 and 8.3 versions and I am confident it has always produced > consistent results. It is used to select the appropriate template for > pages on a website and someone would have noticed long before now if > it was serving up the wrong template). > > Note I'm not sure whether this is a bug, or whether the assumption > made for the original query (that the row order returned by the > subquery would be carried over to the main part of the query) is > incorrect but just happened to work as expected pre-8.4. > > > I don't believe it's a bug - the assumption is one you're not entitled to make. Your "workaround" is the correct solution, ISTM. cheers andrew
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Ian Barwick wrote:
>> Note I'm not sure whether this is a bug, or whether the assumption
>> made for the original query (that the row order returned by the
>> subquery would be carried over to the main part of the query) is
>> incorrect but just happened to work as expected pre-8.4.
> The latter. Without an ORDER BY (at the outermost level), the order of
> the result is not well defined. Before 8.4, UNION was always performed
> by a Sort + Unique, which explains why the output is always sorted in
> previous releases. 8.4 knows how to perform it with a Hash Aggregate,
> which doesn't yield sorted output.
This is mentioned in the release notes, but I suppose we'd better
promote it to the "observe the following incompatibilities" list...
regards, tom lane
On 18 Apr 2009, at 22:22, Tom Lane wrote: > > This is mentioned in the release notes, but I suppose we'd better > promote it to the "observe the following incompatibilities" list... > This is a really funny one, because people naturally expect UNION [ALL] to stay in the same order. Unlike the table, order here cannot change by inserts/updates, etc. I am sure many, even well experienced will stumble upon that one. Me is guessing, that UNION [ALL] performance just had to be improved for CTEs ? Or was it something completely separate.
On Sat, Apr 18, 2009 at 11:19 PM, Grzegorz Jaskiewicz <gj@pointblue.com.pl> wrote: > This is a really funny one, because people naturally expect UNION [ALL] to > stay in the same order. Unlike the table, order here cannot change by > inserts/updates, etc. > I am sure many, even well experienced will stumble upon that one. There is a misunderstanding here. UNION has *never* preserved the order of the subqueries before. In the OP's query it was *not* preserving the order. It was a coincidence that the order the subquery was in was sorted on the first field and since UNION resorted the whole result set by all the fields in order that meant it was in order by the first field. > Me is guessing, that UNION [ALL] performance just had to be improved for > CTEs ? Or was it something completely separate. Hash aggregates were new relative to set operations which have been around a very long time. They didn't take advantage of the new code but it was always fairly obvious that they should eventually have been changed to. I think it came along with fixing DISTINCT to use hash aggregates which was a similar situation. UNION ALL should still preserve the order of the subqueries. It just returns all the rows of each subquery one after the other with no other work. -- greg
2009/4/19 Tom Lane <tgl@sss.pgh.pa.us>
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:This is mentioned in the release notes, but I suppose we'd better
> Ian Barwick wrote:
>> Note I'm not sure whether this is a bug, or whether the assumption
>> made for the original query (that the row order returned by the
>> subquery would be carried over to the main part of the query) is
>> incorrect but just happened to work as expected pre-8.4.
> The latter. Without an ORDER BY (at the outermost level), the order of
> the result is not well defined. Before 8.4, UNION was always performed
> by a Sort + Unique, which explains why the output is always sorted in
> previous releases. 8.4 knows how to perform it with a Hash Aggregate,
> which doesn't yield sorted output.
promote it to the "observe the following incompatibilities" list...
Thanks for clarifying that. The relevant section in the release notes (which I managed to miss) is this:
It would certainly be worth an explicit mention as I imagine the previous behaviour has been consistent enough for queries to have come to rely on it.
Regards
Ian Barwick
Greg Stark wrote: > UNION ALL should still preserve the order of the subqueries. It just > returns all the rows of each subquery one after the other with no > other work. Although without an ORDER BY that's not well-defined and thus not guaranteed to work in future versions either. I doubt we're going to change that anytime soon, but I wouldn't rely on it in an application. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com