Re: Ordering of data on calls to user defined aggregate.

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Ordering of data on calls to user defined aggregate.
Дата
Msg-id 24339.1021758973@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Ordering of data on calls to user defined aggregate.  (Tim Hart <timjhart@shaw.ca>)
Ответы Re: Ordering of data on calls to user defined aggregate.
Список pgsql-general
Tim Hart <timjhart@shaw.ca> writes:
> Short version ( for the busy folk).
> Is there any guarantee of the ordering of data on calls to a user
> defined aggregate. (postgresql 7.2.1)

None whatever.

> So I tried a query like this:

> select fk, concat_with_and(name) from ( select fk, name from foo order
> by fk, name) sub_select group by fk;

The reason this isn't very reliable can be seen by looking at what it
does.  In existing releases you have to grovel through EXPLAIN VERBOSE
output to see what the sort keys are, but CVS tip is friendlier:

srf=# explain
srf-# select fk, my_agg(name) from ( select fk, name from foo order
srf(# by fk, name) sub_select group by fk;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Aggregate  (cost=122.16..129.66 rows=100 width=40)
   ->  Group  (cost=122.16..127.16 rows=1000 width=40)
         ->  Sort  (cost=122.16..124.66 rows=1000 width=40)
               Sort Key: fk
               ->  Subquery Scan sub_select  (cost=69.83..72.33 rows=1000 width=40)
                     ->  Sort  (cost=69.83..72.33 rows=1000 width=40)
                           Sort Key: fk, name
                           ->  Seq Scan on foo  (cost=0.00..20.00 rows=1000 width=40)
(8 rows)

The outer query has no idea that the inner query's output is already
sorted, so it re-sorts ... using only the specified GROUP BY key.
Unless your system's qsort is stable (which most are not), this will
not preserve the by-name ordering within groups of the same fk value.

If you weren't grouping at the outer level then the inner query's sort
order would get the job done for you.  In the presence of outer
grouping I'm not sure there's a clean way to do it.  I can think of
a hack for the case where fk/name pairs are unique:

select fk, my_agg(DISTINCT name) from foo group by fk;

This relies on the assumption that the aggregate code will implement
DISTINCT by means of sort/unique processing, which seems unlikely to
break anytime soon.  But it won't help if you want the aggregate to see
multiple values of the same name for the same fk.

There is some talk of reimplementing grouped aggregation using hash
tables, which'd eliminate the upper SORT step and thereby give the
behavior you want.  I dunno how soon anyone will get around to it
though.

            regards, tom lane

В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Force a merge join?
Следующее
От: Tom Lane
Дата:
Сообщение: Re: another psql question